Improving Python's Memory Allocator.notes

Saturday, March 26, 2005

TITLE OF PAPER: Improving Python's Memory Allocator
URL OF PRESENTATION: _URL_of_powerpoint_presentation_
PRESENTED BY: Evan Jones (
REPRESENTING: University of Waterloo

DATE: March 25, 2005
LOCATION: Marvin Center

{If you've contributed, add your name, e-mail & URL at the bottom}

    The problem
    Inner workings of the memory allocator

Finding the problem
    For an unrelated school project, he had a Python program which generated the
        simulation scenario. He would pass that to the simulator (not written in
        Python). Then the Python, which had been blocking all this time, resumed and
        analyzed the data.
    The trouble is that Python would never let go of any RAM it used for small
        objects, so it would starve the simulator.
    If you have to do something memory intensive, fork and kill the computing process
        when you're done.
    Store temp results in the filesystem.
    But these are kludgy.

Current allocator
    Pycalloc: same since 2.3
        It mallocs memory in 256kB chunks ("arenas") which it carves up later as
            needed; everything goes on the heap.
        It's used only for objects <= 256b in size.
        Pool (4kB)  |  Header         |  Pool (4kB)  |  Arena (256kB)
                    |   Padding       |
                    |   Block         |
                    |   Block         |
                    |   Block         |
                    |   [something]   |
        Python does reuse memory, but it never returns it to the OS.
        This happens because we don't keep track of what
            arenas the blocks are part of, so we can't detect when all the blocks of
            an arena are free and therefore can't free the arena.
        <read Modules/obmalloc.c for more pretty graphics from tim, and more
            discussion of how the allocator works>

    This solution will fail if your arenas get fragmented, of course. To solve this,
        you'd have to move to a compacting GC like the JVM or .NET CLR. This would
        break all the current C extensions (and be really slow).
    (Bad allocation/deallocation patterns will kill you in this system...however, it
        won't be any worse than the current system, it just won't be better.)

    Q: Can this new allocator be optional or a command-line option?
    A: Maybe, but there's no need to do so.

    Q: Can we group little allocations into arenas by what objects the little
        allocations are a part of, since they'll probably be disposed together?
    A: Remember that the current bad allocator is used only for small objects. Large
        ones do get freed eventually, and user-defined objects are large objects. (Is
        this true?)

    Q: Can the allocator be tuned to match the paging size that the system is using?
    A: If your OS uses 4K pages, it will do the right thing.

    Q: Will this be in 2.5?
    A: Uncle Tim says "almost certainly."

REFERENCES: {as documents / sites are referenced add them below}


CONTRIBUTORS: {add your name, e-mail address and URL below}
Erik Rose <>

E-MAIL BOUNCEBACK: {add your e-mail address separated by commas }

A headline (like a field in a database) will be CAPITALISED
    This differentiates from the text that follows
A variable that you can change will be surrounded by _underscores_
    Spaces in variables are also replaced with under_scores
    This allows people to select the whole variable with a simple double-click
A tool-tip is lower case and surrounded by {curly brackets / parentheses}
    These supply helpful contextual information.

Copyright shared between all the participants unless otherwise stated...