
Wednesday, May 23, 2018

Tracking memory leaks in Python

Tracking leaks in Python is difficult for a multitude of reasons:

  1. It's a GC'd language, which means objects don't get freed immediately
  2. It uses pool allocators
  3. The re module keeps a cache of compiled expressions
  4. tracemalloc may not give you good call stacks: https://bugs.python.org/issue33565
  5. ThreadPoolExecutor creates a new thread per submit() until it reaches max_workers; the default max_workers is os.cpu_count() * 5
  6. tracemalloc itself consumes memory to store the traces it collects
  7. Modules like requests/aiohttp/aiobotocore/etc. that use sockets typically keep a pool of connections whose size may fluctuate over time
  8. Memory fragmentation

Here is a set of workarounds for these issues:
  1. Call gc.collect() from a place that isn't holding onto object references when you want a stable measuring point (see the sketch after this list)
  2. From Python 3.6 onwards, run with PYTHONMALLOC=malloc
  3. Call re._cache.clear() from a similar place to #1
  4. No known workaround (I'm trying to help ensure it does something better in the future)
  5. When you start tracemalloc, ensure you start it after all the threads have been created; this means you've submitted at least max_workers jobs to the pools.  Another hack is temporarily changing ThreadPoolExecutor to create all of its threads on the first submit
  6. Don't rely on RSS when using tracemalloc
  7. Try to reduce the pool sizes to 1
  8. Run your leak tests for longer periods, or if you're allocating large chunks of memory, try to reduce the chunk sizes
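
Combining several of these, a stable-point setup might look something like the following. This is just a sketch: the max_workers value and the no-op job used to warm the pool are placeholders, and re._cache is a CPython implementation detail.

import concurrent.futures
import gc
import re
import tracemalloc

# Workaround 2: run the interpreter with PYTHONMALLOC=malloc (Python 3.6+)
# so the pool allocator doesn't hide individual allocations.


def warm_up_pool(executor, max_workers):
    # Workaround 5: force the ThreadPoolExecutor to spawn all of its
    # threads up front so thread creation doesn't look like a leak later.
    futures = [executor.submit(lambda: None) for _ in range(max_workers)]
    concurrent.futures.wait(futures)


def stable_point():
    # Workarounds 1 and 3: collect garbage and clear the compiled-regex
    # cache from a frame holding no references to the objects under test.
    gc.collect()
    re._cache.clear()


max_workers = 4
executor = concurrent.futures.ThreadPoolExecutor(max_workers=max_workers)
warm_up_pool(executor, max_workers)
stable_point()
tracemalloc.start(25)  # 25 frames per traceback; deeper stacks cost more memory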

The way I approach it is two-fold:
  1. Use tracemalloc to figure out specifically where leaks are coming from; I use a helper like: https://gist.github.com/thehesiod/2f56f98370bea45f021d3704b21707a9
  2. Use the memory_profiler module to binary-search through the codebase to figure out what is causing the leak at a high level.  This basically means disabling parts of your application until you find the trigger.
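
The gist linked above is more elaborate, but the core idea of such a helper is just snapshot diffing (a minimal sketch, not the gist itself):

import tracemalloc

tracemalloc.start(25)
before = tracemalloc.take_snapshot()

# ... exercise the code path you suspect of leaking ...

after = tracemalloc.take_snapshot()

# show the ten call sites whose allocations grew the most
for stat in after.compare_to(before, 'traceback')[:10]:
    print(stat)
    for line in stat.traceback.format():
        print('    ' + line)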


Wednesday, January 11, 2017

Python not as slow as you may think

I've read a lot of articles about how Python is slow and how you need modules like numpy to speed things up, e.g. http://www.labri.fr/perso/nrougier/from-python-to-numpy/ where it was claimed that numpy was an order of magnitude faster than plain Python.

I decided to try one of the first examples on my own, which resulted in the following script:

import random
import numpy as np
from timeit import timeit

# two lists of 100 unique random ints drawn from range(1000)
Z1 = random.sample(range(1000), 100)
Z2 = random.sample(range(1000), 100)


def add_python():
    return [z1 + z2 for (z1, z2) in zip(Z1, Z2)]


def add_numpy():
    # note: np.add must convert the Python lists to arrays on every call
    return np.add(Z1, Z2)

# sanity check: both implementations agree
assert add_python() == add_numpy().tolist()

setup_str = "from __main__ import Z1, Z2, add_python, add_numpy"

# timeit runs each statement 1,000,000 times by default
print("add_python:", timeit("add_python()", setup_str))
print("add_numpy:", timeit("add_numpy()", setup_str))


The result is that on both Python 3.5.2 and 2.7.13, Python is in fact faster than numpy for this example :)  Presumably much of numpy's time goes to converting the two Python lists to arrays on every call, which dominates at only 100 elements.
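
To test that hunch, here is a variation I sketched (not part of the original comparison) that pre-converts the inputs so only the addition itself is timed:

import numpy as np
from timeit import timeit

# pre-converted arrays: the list-to-array conversion cost is paid once, up front
A1 = np.arange(100)
A2 = np.arange(100)

setup_str = "from __main__ import np, A1, A2"
print("add_numpy_preconverted:", timeit("np.add(A1, A2)", setup_str))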

Saturday, October 26, 2013

API Designers: Please avoid magic

Just ran into this issue when using the tornado Python module as a transparent proxy.  It seems that when you call set_status on the RequestHandler object, it tries to map that response code to a default "reason."  If there isn't a default value, it throws an exception.

This is NOT the right thing to do because it creates a ticking time bomb.  I found this out after my server had been running for 12 hours and the remote server responded with a 599 status code, sending my server into a weird state.

My preference would have been either to make this explicit, e.g. a setDefaultResponse call, or to fall back to some pre-canned reason when a default isn't found in the dictionary.
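
For what it's worth, Tornado's set_status does accept an explicit reason argument (in the 3.x releases), which side-steps the lookup entirely.  A minimal sketch of a proxy handler using it; the handler name and the upstream response attribute are illustrative:

import tornado.web


class ProxyHandler(tornado.web.RequestHandler):
    def _copy_upstream_status(self, upstream_response):
        # passing reason explicitly skips the built-in code -> reason
        # lookup, so an unknown code like 599 no longer raises
        reason = getattr(upstream_response, 'reason', None) or 'Unknown'
        self.set_status(upstream_response.code, reason=reason)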