TITLE OF PAPER: Improving Python's Memory Allocator URL OF PRESENTATION: _URL_of_powerpoint_presentation_ PRESENTED BY: Evan Jones (ejones@uwaterloo.ca) REPRESENTING: University of Waterloo CONFERENCE: PyCon 2005 DATE: March 25, 2005 LOCATION: Marvin Center -------------------------------------------------------------------------- REAL-TIME NOTES / ANNOTATIONS OF THE PAPER: {If you've contributed, add your name, e-mail & URL at the bottom} Outline The problem Inner workings of the memory allocator Solutions Future Finding the problem For an unrelated school project, he had a Python program which generated the simulation scenario. He would pass that to the simulator (not written in Python). Then the Python, which had been blocking all this time, resumed and analyzed the data. The trouble is that Python would never let go of any RAM it used for small objects, so it would starve the simulator. Workarounds If you have to do something memory intensive, fork and kill the computing process when you're done. Store temp results in the filesystem. But these are kludgy. Current allocator Pycalloc: same since 2.3 It mallocs memory in 256kB chunks ("arenas") which it carves up later as needed; everything goes on the heap. It's used only for objects <= 256b in size. Pool (4kB) | Header | Pool (4kB) | Arena (256kB) | Padding | | Block | | Block | | Block | | [something] | Python does reuse memory, but it never returns it to the OS. This happens because we don't keep track of what arenas the blocks are part of, so we can't detect when all the blocks of an arena are free and therefore can't free the arena. Evaluation This solution will fail if your arenas get fragmented, of course. To solve this, you'd have to move to a compacting GC like the JVM or .NET CLR. This would break all the current C extensions (and be really slow). (Bad allocation/deallocation patterns will kill you in this system...however, it won't be any worse than the current system, it just won't be better.) Q&A Q: Can this new allocator be optional or a command-line option? A: Maybe, but there's no need to do so. Q: Can we group little allocations into arenas by what objects the little allocations are a part of, since they'll probably be disposed together? A: Remember that the current bad allocator is used only for small objects. Large ones do get freed eventually, and user-defined objects are large objects. (Is this true?) Q: Can the allocator be tuned to match the paging size that the system is using? A: If your OS uses 4K pages, it will do the right thing. Q: Will this be in 2.5? A: Uncle Tim says "almost certainly." -------------------------------------------------------------------------- REFERENCES: {as documents / sites are referenced add them below} http://evanjones.ca/ -------------------------------------------------------------------------- QUOTES: -------------------------------------------------------------------------- CONTRIBUTORS: {add your name, e-mail address and URL below} Erik Rose -------------------------------------------------------------------------- E-MAIL BOUNCEBACK: {add your e-mail address separated by commas } -------------------------------------------------------------------------- NOTES ON / KEY TO THIS TEMPLATE: A headline (like a field in a database) will be CAPITALISED This differentiates from the text that follows A variable that you can change will be surrounded by _underscores_ Spaces in variables are also replaced with under_scores This allows people to select the whole variable with a simple double-click A tool-tip is lower case and surrounded by {curly brackets / parentheses} These supply helpful contextual information. -------------------------------------------------------------------------- Copyright shared between all the participants unless otherwise stated...