Friday, January 04, 2019

Removing old Injector Seats from Audi A4

I have a 2001 Audi A4 1.8T.  I'm in the process of overhauling the injector and PCV systems: the injectors were leaking, and the PCV system is leaking all over the place after more than 220,000 miles.

There are several articles about how to remove these, with strategies like removing them while the engine is hot, using ethanol or acetone, and using the right hex tool (20mm for the 2001). However, these strategies fail spectacularly at this age because the plastic has gone brittle and the cups just disintegrate. So don't waste your time trying to find a 20mm tool at that point. No matter what tool you use, they're nearly impossible to remove cleanly at that age.

The strategy I've come up with over the last few hours that seems to work is chiseling out the cap at the top and the tail at the bottom, then grinding out the bulk in the middle with a round wood sandpaper moto tool attachment, and then using a round steel brush moto tool attachment to grind the plastic bits out of the threads. There's a technique to the steel brush: use speed 4 or so, push straight down into the thread until you can see the metal, and then move to a new spot. Once you can see the threads, you can press down a little more to grind a bit into them; the threads should be OK with this. Then you start picking away with a precision flathead screwdriver. If you're lucky, the heat from the metal brush detaches the plastic from the epoxy so you can carefully peel a few turns of plastic away from the threads. After all the plastic comes out, the really time-intensive part starts: picking out the epoxy that's glued to the threads. Here you have to lever the precision flathead screwdriver against the edge of the cup and scrape along the threads. Eventually you'll figure out how much pressure you need to remove the bulk.

After it looks good, you'll put the new cups in and find that they stick in certain areas, and now comes round 2 of scraping with the screwdrivers.

HOURS of fun. After one night I'm halfway done <sigh>. After all this work I'm chucking my new plastic cups and have ordered a set of billet cups instead. NEVER AGAIN.

Tuesday, October 02, 2018

Review of Kirkland Signature Rotisserie Chicken Noodle Soup

In two words, very good. Another two, highly recommended :)  A perfect pairing is with the Kirkland Potato Chive Focaccia, yum.

Wednesday, May 23, 2018

tracking memory leaks in python

Tracking leaks in python is difficult for a multitude of reasons:

  1. It's a GC'd language, which means things don't get freed immediately
  2. It uses pool allocators
  3. The re module has a cache of compiled expressions
  4. tracemalloc may not give you good call stacks: https://bugs.python.org/issue33565
  5. The ThreadPoolExecutor creates a thread per submit until you hit max_workers; the default max_workers is os.cpu_count() * 5 (in Python 3.5 to 3.7)
  6. tracemalloc itself consumes memory to store the traces
  7. Modules like requests/aiohttp/aiobotocore/etc. which use sockets typically have a pool of connections whose size may fluctuate over time
  8. Memory fragmentation

Here is a set of work-arounds for these issues:
  1. Call gc.collect() from a place that isn't holding onto object references when you want a stable point
  2. From 3.6 onwards use PYTHONMALLOC=malloc
  3. Call re._cache.clear() from a similar place to #1
  4. No known work-around (I'm trying to help ensure it does something better in the future)
  5. Start tracemalloc only after all the threads have been created, which means you've submitted at least max_workers jobs to the pools.  Another hack is temporarily changing the ThreadPoolExecutor to create all threads on first submit
  6. Don't rely on RSS when using tracemalloc
  7. Try to make the pool sizes 1
  8. Run your leak tests for longer periods, or if using large chunks of memory try to reduce the chunk sizes
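
As a rough illustration of work-arounds 1, 3, and 5, here is a minimal sketch. The helper names warm_up_pool and stable_point are made up for this example. It uses the public re.purge() to clear the regex cache, and a barrier so that every submitted job blocks until all max_workers threads exist; that is one way to force thread creation, not necessarily the only or best one.

```python
import gc
import re
import threading
from concurrent.futures import ThreadPoolExecutor


def warm_up_pool(executor: ThreadPoolExecutor, max_workers: int) -> None:
    """Force the executor to create all of its worker threads up front."""
    # Every job waits on the barrier, so no worker can finish and be reused
    # before max_workers threads have been started.
    barrier = threading.Barrier(max_workers)
    futures = [executor.submit(barrier.wait, 30) for _ in range(max_workers)]
    for future in futures:
        future.result()


def stable_point() -> None:
    """Drop cached regexes and collectable garbage before measuring."""
    re.purge()    # public API that clears the re module's pattern cache
    gc.collect()  # call from a frame that holds no references to test objects
```

The idea would be to call warm_up_pool once per executor before tracemalloc.start(), and stable_point() right before each snapshot.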

The way I approach it is two-fold:
  1. Use tracemalloc to figure out specifically where leaks are coming from. I use a helper like: https://gist.github.com/thehesiod/2f56f98370bea45f021d3704b21707a9
  2. Use the memory_profiler module to binary search through the codebase to figure out what is causing a leak at a high level.  This basically means disabling parts of your application until you find the trigger.
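
For the tracemalloc half of that approach, a minimal sketch of comparing two snapshots is below. This is not the helper from the gist above; top_growth and leaky_step are hypothetical names, and leaky_step is just a stand-in for whatever operation you suspect of leaking.

```python
import gc
import tracemalloc

_leak = []  # stand-in for a leak: a module-level list that only grows


def leaky_step() -> None:
    _leak.append(bytearray(100_000))


def top_growth(step, iterations: int = 20, limit: int = 3):
    """Return the source lines whose allocations grew the most across `step` calls."""
    tracemalloc.start(25)  # keep up to 25 frames per allocation
    try:
        step()  # warm-up so one-time allocations are not counted as growth
        gc.collect()
        before = tracemalloc.take_snapshot()
        for _ in range(iterations):
            step()
        gc.collect()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # StatisticDiff entries, largest absolute size change first
    return after.compare_to(before, "lineno")[:limit]


if __name__ == "__main__":
    for stat in top_growth(leaky_step):
        print(stat)
```

Grouping by "traceback" instead of "lineno" gives full call stacks, subject to the call-stack caveat in issue #4 above.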


Saturday, February 03, 2018

Epson Document Capture Pro Crash Recovery

Today I was scanning about a thousand pages of documents with Document Capture Pro and my Epson DS-510. I had already scanned several hundred pages without issue and was on the last batch of 200, so as I hit save I decided to let my wife start shredding what I had scanned...and sure enough, about halfway through saving, it crashed....nooooooo!

I looked in the target folder and there was no trace of the file...I was hoping that it would have written out half the file but no luck. My last hope was that there was a temporary file with some of the document somewhere.  So I fired up Process Monitor, tried saving a sample scan, and saw it was writing to AppData\Local\Temp.  So I started looking around and, lo and behold, there's an EpsonScanIO folder with all the lost images!

phew!

Hope this tip saves people some time!
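
If you want to grab a copy of that folder before restarting the scanning software, something like the sketch below should do it. The function name rescue_scans is made up, the default location assumes the standard LOCALAPPDATA environment variable points at AppData\Local, and I'm not assuming anything about the file names inside EpsonScanIO: it just copies everything it finds.

```python
import os
import shutil
from pathlib import Path


def rescue_scans(dest: Path, temp_root: Path = None) -> list:
    """Copy everything under <temp_root>/EpsonScanIO into dest."""
    if temp_root is None:
        # Assumption: LOCALAPPDATA is set, as it normally is on Windows
        temp_root = Path(os.environ.get("LOCALAPPDATA", "")) / "Temp"
    src = temp_root / "EpsonScanIO"
    if not src.is_dir():
        return []
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.rglob("*")):
        if path.is_file():
            target = dest / path.relative_to(src)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 keeps the file timestamps
            copied.append(target)
    return copied
```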

Tuesday, October 03, 2017

mocking AWS Lambdas

When dealing with AWS services it can be tedious mocking the endpoints.  Luckily there's a wonderful module called moto which takes care of this for you.  moto supports a majority of the AWS backends to varying degrees of completeness. Recently I overhauled the lambda backend and added support for running lambdas in the environment they're specified to run in (JS, python, etc) as well as linking to SNS events and cloudwatch logging.

In my use-case I had:
  1. My test-cases running on bare-metal, including:
    1. Mocking sns, lambda, s3, kms, logs, and cloudwatch endpoints via moto.
    2. Registering an AWS lambda that connects to a mocked SNS endpoint
    3. Mocked google endpoints via custom aiohttp server
  2. A docker container which forwarded messages from a Google PubSub endpoint (mocked via PubSub Emulator) to the SNS moto mocked endpoint which triggered the mocked lambda.
  3. Another container that registered subscriptions from mocked google services to PubSub endpoint, along with occasionally triggering lambda via SNS endpoint

Each moto mock endpoint was created via my helper class:

class MotoService:
    """ Will Create MotoService.
    
    Service is ref-counted so there will only be one per process. Real Service will
    be returned by `__aenter__`."""

    _services: Dict[str, Any] = dict()  # {name: instance}

    def __init__(self, service_name: str, port: int=None):
        self._service_name = service_name

        if port:
            self._socket = None
            self._port = port
        else:
            self._socket, self._port = get_free_tcp_port()

        self._thread = None
        self._logger = logging.getLogger('MotoService')
        self._refcount = None
        self._ip_address = get_ip_address()

    @property
    def endpoint_url(self):
        return 'http://{}:{}'.format(self._ip_address, self._port)

    def __call__(self, func):
        async def wrapper(*args, **kwargs):
            await self._start()
            try:
                result = await func(*args, **kwargs)
            finally:
                await self._stop()
            return result

        functools.update_wrapper(wrapper, func)
        wrapper.__wrapped__ = func
        return wrapper

    async def __aenter__(self):
        svc = self._services.get(self._service_name)
        if svc is None:
            self._services[self._service_name] = self
            self._refcount = 1
            await self._start()
            return self
        else:
            svc._refcount += 1
            return svc

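    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # __aenter__ may have handed out the shared instance, so update its
        # ref-count rather than ours (ours is None if we never started)
        svc = self._services[self._service_name]
        svc._refcount -= 1

        if self._socket:
            self._socket.close()
            self._socket = None

        if svc._refcount == 0:
            del self._services[self._service_name]
            await svc._stop()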

    @staticmethod
    def _shutdown():
        req = flask.request
        shutdown = req.environ['werkzeug.server.shutdown']
        shutdown()
        return flask.make_response('done', 200)

    def _create_backend_app(self, *args, **kwargs):
        backend_app = moto.server.create_backend_app(*args, **kwargs)
        backend_app.add_url_rule('/shutdown', 'shutdown', self._shutdown)
        return backend_app

    def _server_entry(self):
        self._main_app = moto.server.DomainDispatcherApplication(self._create_backend_app, service=self._service_name)
        self._main_app.debug = True

        if self._socket:
            self._socket.close()  # release right before we use it
            self._socket = None

        moto.server.run_simple(self._ip_address, self._port, self._main_app, threaded=True)

    async def _start(self):
        self._thread = threading.Thread(target=self._server_entry, daemon=True)
        self._thread.start()

        async with aiohttp.ClientSession() as session:
            for i in range(0, 10):
                if not self._thread.is_alive():
                    break

                try:
                    # we need to bypass the proxies due to monkeypatches
                    async with session.get(self.endpoint_url + '/static/', timeout=0.5):
                        pass
                    break
                except (asyncio.TimeoutError, aiohttp.ClientConnectionError):
                    await asyncio.sleep(0.5)
            else:
                await self._stop()  # pytest.fail doesn't call stop_process
                raise Exception("Can not start service: {}".format(self._service_name))

    async def _stop(self):
        try:
            async with aiohttp.ClientSession() as session:
                async with session.get(self.endpoint_url + '/shutdown', timeout=5):
                    pass
        except:
            self._logger.exception("Error stopping moto service")
            raise
        finally:
            self._thread.join()
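
The class relies on a couple of helpers that aren't shown here. For illustration, a get_free_tcp_port that returns a (socket, port) pair, the way __init__ uses it, could look roughly like this. The host parameter and its default are assumptions of this sketch, and my real helper takes different arguments.

```python
import socket
from typing import Tuple


def get_free_tcp_port(host: str = "127.0.0.1") -> Tuple[socket.socket, int]:
    """Reserve a free TCP port and return (bound socket, port number)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS to pick any free port
    # The caller keeps the socket open to hold the port, then closes it
    # right before the real server binds to the same port.
    return sock, sock.getsockname()[1]
```

Holding the socket open until just before the server starts narrows, but doesn't eliminate, the window in which another process could take the port.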

My setUpClass looked something like the following:


    @classmethod
    def setUpClass(cls):
        cls._pubsub_port = get_free_tcp_port(True)
        cls._gcloud_enumlator = subprocess.Popen(["gcloud", "beta", "emulators", "pubsub", "start", "--host-port={}:{}".format(IP_ADDRESS, cls._pubsub_port)], preexec_fn=os.setsid)

        boto_service_names = {'sns', 'lambda', 's3', 'kms', 'logs', 'cloudwatch'}
        cls._boto_svcs = {}

        async def start_svc(svc_name):
            cls._boto_svcs[svc_name] = await MotoService(svc_name).__aenter__()

        try:
            loop = asyncio.get_event_loop()
            loop.run_until_complete(asyncio.gather(*[start_svc(svc_name) for svc_name in boto_service_names]))

            cls._mock_env_vars = {'{}_mock_endpoint_url'.format(name): svc.endpoint_url + '/' for name, svc in cls._boto_svcs.items()}
            cls._mock_env_vars['PUBSUB_EMULATOR_HOST'] = '{}:{}'.format(IP_ADDRESS, cls._pubsub_port)
            cls._mock_env_vars['AWS_DEFAULT_REGION'] = AWS_DEFAULT_REGION

            for name, value in cls._mock_env_vars.items():
                os.environ[name] = value

            session = botocore.session.get_session()
            cls._boto_clients = {svc_name: session.create_client(svc_name) for svc_name in boto_service_names}
        except:
            cls.tearDownClass()
            raise

After which things like S3/KMS were set up.  One of the more interesting ones was the lambda function which connected to SNS which looked like this:


        with open(os.path.join(CURRENT_DIR, '..', 'lambda_image', 'lambda_labeler_image.zip'), 'rb') as zip_file:
            lambda_response = self._boto_clients['lambda'].create_function(
                FunctionName=LAMBDA_FUNCTION_NAME, Runtime='python3.6',
                Role='test-iam-role', Handler='lambda_function.lambda_handler',
                Timeout=15, MemorySize=128, Publish=True,
                Code={'ZipFile': zip_file.read()},
                Environment={
                    'Variables': {**mock_env_vars, 'LOG_LEVEL': str(logging.DEBUG), 'UNITTEST': 'true'}
                })

        # now subscribe lambda function to SNS topic
        self._boto_clients['sns'].subscribe(TopicArn=self._sns_topic_arn, Protocol='lambda', Endpoint=lambda_response['FunctionArn'])

Note that I forward the *_mock_endpoint_url values via environment variables.

I linked all the containers via a docker compose file that had something like the following:


    environment:
      - AWS_ACCESS_KEY_ID=dummy
      - AWS_SECRET_ACCESS_KEY=dummy
      - sns_mock_endpoint_url=${sns_mock_endpoint_url}
      - lambda_mock_endpoint_url=${lambda_mock_endpoint_url}
      - s3_mock_endpoint_url=${s3_mock_endpoint_url}
      - kms_mock_endpoint_url=${kms_mock_endpoint_url}
      - logs_mock_endpoint_url=${logs_mock_endpoint_url}
      - google_mock_endpoint_url=${google_mock_endpoint_url}
      - PUBSUB_EMULATOR_HOST=${PUBSUB_EMULATOR_HOST}
      - AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION}


Now each container (including the lambda) ran the following to enable API call forwarding to the mocked endpoint:


def _wrapt_boto_create_client(wrapped, instance, args, kwargs):
    def unwrap_args(service_name, region_name=None, api_version=None,
                    use_ssl=True, verify=None, endpoint_url=None,
                    aws_access_key_id=None, aws_secret_access_key=None,
                    aws_session_token=None, config=None):

        if endpoint_url is None:
            endpoint_url = os.environ.get('{}_mock_endpoint_url'.format(service_name))

        return wrapped(service_name, region_name, api_version, use_ssl, verify,
                       endpoint_url, aws_access_key_id, aws_secret_access_key,
                       aws_session_token, config)

    return unwrap_args(*args, **kwargs)


def patch_boto():
    """
    Will patch botocore to set endpoint_url to the value of the
    {SERVICE_NAME}_mock_endpoint_url environment variable, if available
    """
    wrapt.wrap_function_wrapper(
        'botocore.session',
        'Session.create_client',
        _wrapt_boto_create_client
    )


_redir_prefix = {
    'https://www.googleapis.com/',
    'https://accounts.google.com/',
    'https://people.googleapis.com/'
}


def _replace_url_prefix(url: str, redir_endpoint: str):
    if not redir_endpoint:
        return url

    if url.startswith(redir_endpoint):
        return url

    for prefix in _redir_prefix:
        if url.startswith(prefix):
            url = url.replace(prefix, redir_endpoint)
            return url

    assert False


def _wrapped_discovery_resource_init(wrapped, instance, args, kwargs):
    redir_endpoint = os.environ.get('google_mock_endpoint_url')

    def unwrap_args(http, baseUrl, model, requestBuilder, developerKey,
               resourceDesc, rootDesc, schema):

        baseUrl = _replace_url_prefix(baseUrl, redir_endpoint)

        return wrapped(http, baseUrl, model, requestBuilder, developerKey,
               resourceDesc, rootDesc, schema)

    return unwrap_args(*args, **kwargs)


def _wrapped_oath2_credentials_init(wrapped, instance, args, kwargs):
    redir_endpoint = os.environ.get('google_mock_endpoint_url')

    def unwrap_args(access_token, client_id, client_secret, refresh_token,
                 token_expiry, token_uri, user_agent, revoke_uri=None,
                 id_token=None, token_response=None, scopes=None,
                 token_info_uri=None):

        revoke_uri = _replace_url_prefix(revoke_uri, redir_endpoint)
        token_uri = _replace_url_prefix(token_uri, redir_endpoint)

        return wrapped(access_token, client_id, client_secret, refresh_token, token_expiry,
                       token_uri, user_agent, revoke_uri, id_token, token_response, scopes, token_info_uri)

    return unwrap_args(*args, **kwargs)


def patch_google_client():
    wrapt.wrap_function_wrapper(
        'googleapiclient.discovery',
        'Resource.__init__',
        _wrapped_discovery_resource_init
    )

    wrapt.wrap_function_wrapper(
        'oauth2client.client',
        'OAuth2Credentials.__init__',
        _wrapped_oath2_credentials_init
    )


# Run in start-up code
test_mock_endpoints = {name: value for name, value in os.environ.items() if name.endswith("_mock_endpoint_url")}
if test_mock_endpoints:
    patch_boto()

test_google_endpoint_url = os.environ.get('google_mock_endpoint_url')
if test_google_endpoint_url:
    patch_google_client()
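
To show the idea behind patch_boto without needing botocore or wrapt installed, here is a stripped-down sketch of the same trick using a plain decorator. The names with_mock_endpoint and fake_create_client are made up for this example, and the fake client is only a stand-in for Session.create_client.

```python
import functools
import os


def with_mock_endpoint(create_client):
    """Fill in endpoint_url from <service>_mock_endpoint_url when not given."""
    @functools.wraps(create_client)
    def wrapper(service_name, endpoint_url=None, **kwargs):
        if endpoint_url is None:
            # Unset variable -> None, so the real endpoint is used as usual
            endpoint_url = os.environ.get(
                '{}_mock_endpoint_url'.format(service_name))
        return create_client(service_name, endpoint_url=endpoint_url, **kwargs)
    return wrapper


@with_mock_endpoint
def fake_create_client(service_name, endpoint_url=None, **kwargs):
    # Stand-in for the real client factory: just report what it was given
    return {'service': service_name, 'endpoint_url': endpoint_url}


if __name__ == '__main__':
    os.environ['sns_mock_endpoint_url'] = 'http://127.0.0.1:5000/'
    print(fake_create_client('sns'))  # picks up the mocked endpoint
    print(fake_create_client('s3'))   # no variable set, endpoint_url stays None
```

An explicitly passed endpoint_url always wins, which mirrors the `if endpoint_url is None` check in the wrapt version.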

Along with several helpers, my test-case would look something like this:


        user_svc = self._google_svcs.dir_svc.users[ADMIN_EMAIL]

        # personal address
        headers = {
            'Date': email.utils.format_datetime(datetime.utcnow().replace(tzinfo=timezone.utc), True),
            'From': "{}".format(personal_email_addr),
            'To': '{} <{}>'.format(user_svc.user_obj['name']['fullName'], user_svc.user_obj['primaryEmail']),
            'Subject': "Dummy",
        }

        archive_msg_obj = user_svc.add_message([ARCHIVE_LABEL_NAME, 'FOLDER'], headers, 'dummy message')

        await self._wait_for_log_entry("Finished processing messages for user: {}".format(ADMIN_EMAIL))
        for msg in user_svc.messages.values():
            lbl_names = user_svc.get_lbl_names(msg['labelIds'])
            self.assertEqual(set(lbl_names), {EXPECTED_LABEL_NAME, 'FOLDER'})



This way each container and lambda invocation will forward AWS + Google client API calls to the mocked endpoints.  The end result is that by hitting a mocked google endpoint, a message would get published to the mocked google PubSub endpoint, triggering the message to get forwarded to a moto mocked SNS endpoint, triggering a lambda, which would log to cloudwatch, and be picked up by my test-case.  If others are interested in the google endpoint mock I can amend this post with that information as well.  All in all I'm really happy with the workflow as it duplicates running in production almost exactly with minimal changes to the production code, helping ensure no issues pop up when pushed to production.

Wednesday, January 11, 2017

Python not as slow as you may think

I've read a lot of articles about how python is slow and how you need to use modules like numpy to speed things up. One example is http://www.labri.fr/perso/nrougier/from-python-to-numpy/ where it was claimed that pure Python was an order of magnitude slower than numpy.

I decided to try one of the first examples out on my own, which resulted in the following script:

import random
import numpy as np
from timeit import timeit

Z1 = random.sample(range(1000), 100)
Z2 = random.sample(range(1000), 100)


def add_python():
    return [z1+z2 for (z1, z2) in zip(Z1, Z2)]


def add_numpy():
    return np.add(Z1, Z2)

assert add_python() == add_numpy().tolist()

setup_str = "from __main__ import Z1, Z2, add_python, add_numpy"

print("add_python:", timeit("add_python()", setup_str))
print("add_numpy:", timeit("add_numpy()", setup_str))


The result is that in both Python 3.5.2 and 2.7.13, plain Python is in fact faster than numpy for this example :)

Saturday, September 10, 2016

How to Make a Great Homemade Pulled Pork Sandwich without the Fuss

We just put together a really tasty pulled pork sandwich so I want to both share it with everyone and record it so we can make it again! :)

We started by buying a big package of pork from Costco since it's cheap, freezes well, and is easy to make. We bought a 4-pack of Pork Sirloin Tip Roast, which ended up being a little over 8 pounds, and immediately put it in the freezer.

A couple of days later, when we were ready to have some pulled pork, we put the contents of one frozen package of the pork in a slow cooker for ~20hrs following this recipe: http://allrecipes.com/recipe/24035/kalua-pig-in-a-slow-cooker

After it was done we put it in the fridge.

When we were ready for the sandwich we went to Whole Foods and bought one of their pre-cut slaw packages from the refrigerated pre-cut section, and a few items we were missing from this slaw recipe: http://www.foodnetwork.com/recipes/robert-irvine/cole-slaw-recipe0.html, except we replaced the white wine vinegar with apple cider vinegar. We bought what turned out to be an awesome BBQ sauce: Mild Everett & Jones Super-Q sauce, which happens to be made here locally in the Bay Area. We also bought a wonderful sandwich bread: Beckmann's Francese Deli Roll.

We then mixed the ingredients from the recipe for the slaw sauce and mixed it with the cut slaw (5 minutes), and lightly toasted the bread.

All that was left was assembling the sandwich and enjoying a wonderful meal! Personally we finished this with some carrot juice, but if you have some sweet tea I think it would have made it perfect.
d-lish ;)