In our app, we make use of pull task queues, and we pull a large number of tasks with large bodies (e.g. ~5 MB payloads). I noticed what appeared to be a memory leak when running this in the python-compat runtime, with many megabytes of strings being retained. After a lot of hacking, I tracked it down to the following:
- Multiprocessing pool workers keep a reference to the previous result object in a local variable until they receive their next work item (see the sketch below).
- The pool has 100 threads in the python-compat environment, so it can retain up to ~100 result objects at once.
- The cached result type is a requests.Response, which stores the full body of the response.
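A minimal sketch of the retention pattern, assuming a simplified version of the worker loop in multiprocessing/pool.py (illustrative only, not the actual CPython source):

```python
# Hypothetical, simplified version of the multiprocessing pool worker loop,
# shown only to illustrate why the previous result stays alive.
def worker(inqueue, outqueue):
    while True:
        task = inqueue.get()   # blocks here; `result` from the previous
                               # iteration is still referenced by this frame
        if task is None:
            break
        job, func, args, kwds = task
        result = func(*args, **kwds)   # e.g. a requests.Response with a ~5 MB body
        outqueue.put((job, result))
        # `result` is never cleared, so the large response stays reachable
        # until the worker picks up its next task and reassigns the variable.
```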
My hack to fix it: in google/appengine/ext/vmruntime/vmstub.py, I set response._content = None right after the response protocol buffer message is parsed. With this change our process uses ~100 MB of memory instead of ~400 MB.
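Roughly, the workaround looks like this (a sketch under my assumptions about the call path in vmstub.py; the function and parameter names are illustrative, and only the `response._content = None` line is the actual change):

```python
import requests

def make_api_call(api_url, request_pb, response_pb, headers):
    """Hypothetical sketch of the RPC-over-HTTP path in vmstub.py."""
    response = requests.post(api_url, data=request_pb.SerializeToString(),
                             headers=headers)
    response_pb.ParseFromString(response.content)  # parse protobuf from the body

    # The workaround: drop the cached body so the Response object that the
    # pool worker keeps referencing no longer pins the multi-megabyte string.
    response._content = None
    return response_pb
```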
Output from my (admittedly hacky) tool that found the large retained strings, showing what is holding on to them:
```
found new string! len=4653338
referred by 139691536297608 <type 'dict'>
-referred by 139691536340816 <class 'requests_nologs.models.Response'>
--referred by 139691536378120 <type 'tuple'>
---referred by frame /usr/lib/python2.7/multiprocessing/pool.py 102 worker
---referred by frame /usr/lib/python2.7/multiprocessing/pool.py 380 _handle_results
```