◐ Shell
clean mode source ↗

gh-120754: Fix memory leak in FileIO.__init__() by vstinner · Pull Request #124225 · python/cpython

@vstinner

@vstinner vstinner commented

Sep 18, 2024

edited by bedevere-app Bot

Loading

Copy link Copy Markdown

Member

Free 'self->stat_atopen' before assigning it, since io.FileIO.init() can be called multiple times manually (especially by test_io).

Free 'self->stat_atopen' before assigning it, since
io.FileIO.__init__() can be called multiple times manually
(especially by test_io).

@vstinner

Copy link Copy Markdown

Member Author

@cmaloney

Copy link Copy Markdown

Contributor

👍 Definitely where the leak is happening / my own tracking finds same location. https://vstinner.github.io/debug-python-refleak.html really helpful in tracking down, thanks for writing up.

There's a number of cases where self->fd is changed but the cached information about the file (now stat_atopen, previously estimated_size and _blksize) aren't changed. I started tracking those down (https://github.com/python/cpython/compare/main...cmaloney:cpython:cmaloney/fix_closefd_norealloc?expand=1). Happy with a much more minimal fix to get bots green. Trying to find a refactor so all the closefd logic which feels fairly replicated across the code at the moment can hopefully get deduped a lot while fixing potential corner case issues.

vstinner reacted with thumbs up emoji

@vstinner vstinner merged commit 43cd7aa into python:main

Sep 18, 2024

@vstinner vstinner deleted the fileio_init branch

September 18, 2024 22:11

@vstinner

Copy link Copy Markdown

Member Author

(https://github.com/python/cpython/compare/main...cmaloney:cpython:cmaloney/fix_closefd_norealloc?expand=1)

Only allocating (PyMem_New) stat_atopen if it's NULL sounds like a better fix than my fix. I merged my fix anyway since you approved it and I would like to repair buildbot as soon as possible.

You might include your better fix in a following PR if you want, since you planned more changes for io if I understood correctly.

Thanks for reviewing my fix!

cmaloney reacted with thumbs up emoji

@cmaloney

cmaloney commented

Sep 18, 2024

edited

Loading

Copy link Copy Markdown

Contributor

Yep, will make more PRs stat_atopen got several patches moving again.

  1. isatty at open to replace GH-120754: Remove isatty call during regular open #121593 and fix Avoid calling isatty() for most open() calls #90102
  2. "clear out cached stat result anytime self->fd is changed" (followup from this PR)
  3. Potentially clear out stat_atopen on seek Unbounded reads by zipfile may cause a MemoryError. #113977. That is a potentially more comprehensive solution to code that does file.seek() then file.read(), but also in theory the "check position in file if file is large" code should make unnecessary, https://github.com/python/cpython/blob/main/Modules/_io/fileio.c#L769-L783. Still trying to wrap my head around why the "get position" didn't seem to work for that issue)
vstinner reacted with thumbs up emoji

savannahostrowski pushed a commit to savannahostrowski/cpython that referenced this pull request

Sep 22, 2024
Free 'self->stat_atopen' before assigning it, since
io.FileIO.__init__() can be called multiple times manually
(especially by test_io).