gh-90815: Add mimalloc memory allocator #109914
8b3ec52 to 06e86d7 (September 27, 2023 00:25)
e8c4f01 to 2dd8675 (September 27, 2023 15:59)
2dd8675 to 672b6f4 (September 27, 2023 17:35)
672b6f4 to 41b62cb (September 27, 2023 17:52)
@ericsnowcurrently Yep, and much of the autoconf stuff is taken from there... I just finally filled in the description with a little more background on where things are coming from :)
41b62cb to 0dbe76b (September 27, 2023 18:19)
I expect that most of this PR is a relatively vanilla use of mimalloc in CPython, like @tiran's PR is. Would it be much trouble to keep the whole PR vanilla and have a follow-up PR that applies the customizations to mimalloc (e.g. per-thread allocator state)? That distinction would help, possibly a lot, when reviewing.
7bda1e4 to 35d1ebd (September 28, 2023 19:52)
Okay, the default is pymalloc for now.
Any thoughts? I think this got missed earlier in the review flurry. I've not seen any substantive discussion about why bundling is a good idea, which feels like it's worth proper consideration, given it's short-term easier but long-term expensive. I understand upstreaming it initially is going to take some time, but it is going to be worthwhile for cpython upstream maintenance long-term too, as you can avoid:
This also, of course, has the advantage of forcing one to document what the changes are, which I think should be done either way.
@DinoV: I pushed a few cleanup changes. I fixed the outdated doc, which still said that mimalloc is the default.
It's worth further discussion. CC @daanx
Congrats @DinoV, I merged your PR :-) Do you want to propose a follow-up PR to enable it by default when
Looks like this broke the WASM buildbot: https://buildbot.python.org/all/#/builders/1046/builds/3371
Yep, looks like there's an implicit function definition:
I wrote PR #111524 to fix the WASI build.
I wrote 4 follow-up PRs to fix different issues:
While the mimalloc C code is built on Windows, Python cannot currently use mimalloc on Windows: see PR #111528.
* Add mimalloc v2.12

  Modified src/alloc.c to remove include of alloc-override.c and not compile new handler.

  Did not include the following files:

  - include/mimalloc-new-delete.h
  - include/mimalloc-override.h
  - src/alloc-override-osx.c
  - src/alloc-override.c
  - src/static.c
  - src/region.c

  mimalloc is thread safe and shares a single heap across all runtimes, therefore finalization and getting global allocated blocks across all runtimes is different.

* mimalloc: minimal changes for use in Python:

  - remove debug spam for freeing large allocations
  - use same bytes (0xDD) for freed allocations in CPython and mimalloc

  This is important for the test_capi debug memory tests

* Don't export mimalloc symbol in libpython.
* Enable mimalloc as Python allocator option.
* Add mimalloc MIT license.
* Log mimalloc in Lib/test/pythoninfo.py.
* Document new mimalloc support.
* Use macro defs for exports as done in: python#31164

Co-authored-by: Sam Gross <colesbury@gmail.com>
Co-authored-by: Christian Heimes <christian@python.org>
Co-authored-by: Victor Stinner <vstinner@python.org>
This adds mimalloc as an optional (but preferred when available) allocator to CPython. This is a bit of a mashup of the work from #109914 and the work of @colesbury to use mimalloc for no-gil and various updates to bring it up to current CPython.
The configuration logic added by @tiran is re-used and we keep pymalloc support, unlike in the version from @colesbury. mimalloc is updated to 2.12, along with a few changes @colesbury made to it.
This has run into some issues with subinterpreter support in that the allocators are now stored in thread state and are per-thread. Subinterpreters in some scenarios will create a thread state on one thread and run it on another thread. Most of these are documented in the code base as being known issues. I've modified these so that we will find the right thread state based upon the current thread ID and switch to it rather than getting the head thread. This seems pretty reasonable but looks pretty weird when we need to do it at interpreter shutdown.