gh-120754: Fix memory leak in FileIO.__init__() by vstinner · Pull Request #124225 · python/cpython
Member
Free 'self->stat_atopen' before assigning it, since io.FileIO.init() can be called multiple times manually (especially by test_io).
Free 'self->stat_atopen' before assigning it, since io.FileIO.__init__() can be called multiple times manually (especially by test_io).
vstinner
added
the
skip news
label
bedevere-app
Bot
added
the
awaiting core review
label
bedevere-app
Bot
mentioned this pull request
vstinner
mentioned this pull request
vstinner
commented
Sep 18, 2024
vstinner commented
Member Author
cc @cmaloney
cmaloney
commented
Sep 18, 2024
cmaloney commented
Contributor
👍 Definitely where the leak is happening / my own tracking finds same location. https://vstinner.github.io/debug-python-refleak.html really helpful in tracking down, thanks for writing up.
There's a number of cases where self->fd is changed but the cached information about the file (now stat_atopen, previously estimated_size and _blksize) aren't changed. I started tracking those down (https://github.com/python/cpython/compare/main...cmaloney:cpython:cmaloney/fix_closefd_norealloc?expand=1). Happy with a much more minimal fix to get bots green. Trying to find a refactor so all the closefd logic which feels fairly replicated across the code at the moment can hopefully get deduped a lot while fixing potential corner case issues.
cmaloney
approved these changes
vstinner
merged commit
43cd7aa
into
python:main
vstinner
deleted the
fileio_init
branch
bedevere-app
Bot
removed
the
awaiting core review
label
vstinner
commented
Sep 18, 2024
vstinner commented
Member Author
Only allocating (PyMem_New) stat_atopen if it's NULL sounds like a better fix than my fix. I merged my fix anyway since you approved it and I would like to repair buildbot as soon as possible.
You might include your better fix in a following PR if you want, since you planned more changes for io if I understood correctly.
Thanks for reviewing my fix!
Contributor
Yep, will make more PRs stat_atopen got several patches moving again.
isattyat open to replace GH-120754: Remove isatty call during regular open #121593 and fix Avoid calling isatty() for most open() calls #90102- "clear out cached stat result anytime self->fd is changed" (followup from this PR)
- Potentially clear out stat_atopen on seek Unbounded reads by
zipfilemay cause aMemoryError. #113977. That is a potentially more comprehensive solution to code that doesfile.seek()thenfile.read(), but also in theory the "check position in file if file is large" code should make unnecessary, https://github.com/python/cpython/blob/main/Modules/_io/fileio.c#L769-L783. Still trying to wrap my head around why the "get position" didn't seem to work for that issue)
savannahostrowski pushed a commit to savannahostrowski/cpython that referenced this pull request
Free 'self->stat_atopen' before assigning it, since io.FileIO.__init__() can be called multiple times manually (especially by test_io).
cmaloney
mentioned this pull request