- It's a GC'd language, which means objects don't necessarily get freed immediately
- It uses pool allocators (pymalloc), so freed memory isn't always returned to the OS right away
- the re module has a cache of compiled expressions
- tracemalloc may not give you good call stacks: https://bugs.python.org/issue33565
- the ThreadPoolExecutor creates a new thread per submit() until you hit max_workers; the default max_workers is os.cpu_count() * 5 (see the sketch after this list)
- tracemalloc itself consumes memory to store the traces it collects
- modules like requests/aiohttp/aiobotocore/etc. which use sockets typically keep a pool of connections whose size may fluctuate over time
- memory fragmentation
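To make the ThreadPoolExecutor point concrete, here is a minimal sketch (not from the original post) that just watches the thread count grow as jobs are submitted. The exact numbers depend on your Python version: 3.5-3.7 spin up a new thread per submit() until max_workers, while newer versions reuse idle workers, so the count may grow more slowly.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Default max_workers was os.cpu_count() * 5 on Python 3.5-3.7;
# later versions changed the default and reuse idle worker threads.
pool = ThreadPoolExecutor()

print("threads before any submit:", threading.active_count())
for i in range(5):
    pool.submit(lambda: None)
    # Each submit() may have created another worker thread, which
    # shows up as apparent memory growth early in the process' life.
    print("threads after submit %d: %d" % (i + 1, threading.active_count()))
pool.shutdown(wait=True)
```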
Here is a set of work-arounds for these issues:
- call gc.collect() from a place that isn't holding onto object references when you want a stable point (see the combined sketch after this list)
- from Python 3.6 onwards, use PYTHONMALLOC=malloc to bypass the pool allocator
- call re._cache.clear() from a similar place to #1
- no known work-around (I'm trying to help ensure it does something better in the future)
- when you start tracemalloc, ensure you start it after all the threads have been created; this means you've submitted at least max_workers jobs to the pools. Another hack is to temporarily change the ThreadPoolExecutor to create all of its threads on the first submit
- Don't rely on RSS when using tracemalloc
- Try to set the pool sizes to 1
- Run your leak tests for longer periods, or, if you're using large chunks of memory, try to reduce the chunk sizes
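As a rough illustration of work-arounds #1, #3, and #5 combined, here is a sketch of setting up a stable measurement point before starting tracemalloc. The pool size, warm-up job, and frame depth are arbitrary choices for illustration, and re._cache is a CPython implementation detail rather than a public API.

```python
import gc
import re
import tracemalloc
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 4
pool = ThreadPoolExecutor(max_workers=MAX_WORKERS)

def stable_point():
    """Call from code that isn't holding extra object references."""
    gc.collect()       # force a collection so pending garbage isn't counted as a leak
    re._cache.clear()  # drop the re module's compiled-pattern cache (CPython detail)

# Warm up the pool so thread creation doesn't show up as growth later.
# (On newer Pythons idle workers are reused, so fewer threads may be created.)
for _ in range(MAX_WORKERS):
    pool.submit(lambda: None)

stable_point()
tracemalloc.start(25)  # start tracing only after the threads exist

# ... run the workload under test here ...

stable_point()
snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)
```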
The way I approach it is two-fold:
- Try to use tracemalloc to figure out specifically where leaks are coming from; I use a helper like: https://gist.github.com/thehesiod/2f56f98370bea45f021d3704b21707a9
- Use the memory_profiler module to binary-search through the codebase and figure out what is causing a leak from a high level. This basically means disabling parts of your application until you find the trigger (see the sketch below).
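For that binary-search step, here is a small sketch using memory_profiler's memory_usage helper. suspected_feature() is a hypothetical stand-in for whichever part of the application you toggle on and off while comparing peak memory.

```python
from memory_profiler import memory_usage

def suspected_feature():
    # Hypothetical placeholder for the code path you're currently suspecting.
    data = [object() for _ in range(100000)]
    return len(data)

def run_iteration(enabled):
    if enabled:
        suspected_feature()

# memory_usage samples the process RSS (in MiB) while the callable runs.
for enabled in (False, True):
    samples = memory_usage((run_iteration, (enabled,)), interval=0.1)
    print("feature enabled=%s peak MiB=%.1f" % (enabled, max(samples)))
```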