◐ Shell
reader mode source ↗
Skip to content

bpo-46070: _PyGC_Fini() untracks objects#30577

Merged
vstinner merged 1 commit into
python:mainfrom
vstinner:gc_fini_untrack
Jan 13, 2022
Merged

bpo-46070: _PyGC_Fini() untracks objects#30577
vstinner merged 1 commit into
python:mainfrom
vstinner:gc_fini_untrack

Conversation

@vstinner

@vstinner vstinner commented Jan 13, 2022

Copy link
Copy Markdown
Member

Py_EndInterpreter() now explicitly untracks all objects currently
tracked by the GC. Previously, if an object was used later by another
interpreter, calling PyObject_GC_UnTrack() on the object crashed if
the previous or the next object of the PyGC_Head structure became a
dangling pointer.

https://bugs.python.org/issue46070

Py_EndInterpreter() now explicitly untracks all objects currently
tracked by the GC. Previously, if an object was used later by another
interpreter, calling PyObject_GC_UnTrack() on the object crashed if
the previous or the next object of the PyGC_Head structure became a
dangling pointer.
@vstinner

vstinner commented Jan 13, 2022

Copy link
Copy Markdown
Member Author

@erlend-aasland

Copy link
Copy Markdown
Contributor

I do not know the GC, so I'm not the right person to review this change. It sounds good to me, though, FWIW 😄

@vstinner vstinner merged commit 1a4d1c1 into python:main Jan 13, 2022
@miss-islington

Copy link
Copy Markdown
Contributor

Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.9, 3.10.
🐍🍒⛏🤖

@vstinner vstinner deleted the gc_fini_untrack branch January 13, 2022 18:28
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jan 13, 2022
Py_EndInterpreter() now explicitly untracks all objects currently
tracked by the GC. Previously, if an object was used later by another
interpreter, calling PyObject_GC_UnTrack() on the object crashed if
the previous or the next object of the PyGC_Head structure became a
dangling pointer.
(cherry picked from commit 1a4d1c1)

Co-authored-by: Victor Stinner <vstinner@python.org>
@bedevere-bot bedevere-bot removed the needs backport to 3.10 only security fixes label Jan 13, 2022
@bedevere-bot

Copy link
Copy Markdown

GH-30578 is a backport of this pull request to the 3.10 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jan 13, 2022
Py_EndInterpreter() now explicitly untracks all objects currently
tracked by the GC. Previously, if an object was used later by another
interpreter, calling PyObject_GC_UnTrack() on the object crashed if
the previous or the next object of the PyGC_Head structure became a
dangling pointer.
(cherry picked from commit 1a4d1c1)

Co-authored-by: Victor Stinner <vstinner@python.org>
@bedevere-bot

Copy link
Copy Markdown

GH-30579 is a backport of this pull request to the 3.9 branch.

@ericsnowcurrently ericsnowcurrently left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

The approach seems good to me, but I'd want to hear what @pablogsal, @pitrou, @iritkatriel, etc. have to say.

miss-islington added a commit that referenced this pull request Jan 13, 2022
Py_EndInterpreter() now explicitly untracks all objects currently
tracked by the GC. Previously, if an object was used later by another
interpreter, calling PyObject_GC_UnTrack() on the object crashed if
the previous or the next object of the PyGC_Head structure became a
dangling pointer.
(cherry picked from commit 1a4d1c1)

Co-authored-by: Victor Stinner <vstinner@python.org>
vstinner added a commit that referenced this pull request Jan 13, 2022
Py_EndInterpreter() now explicitly untracks all objects currently
tracked by the GC. Previously, if an object was used later by another
interpreter, calling PyObject_GC_UnTrack() on the object crashed if
the previous or the next object of the PyGC_Head structure became a
dangling pointer.

(cherry picked from commit 1a4d1c1)
@pablogsal

Copy link
Copy Markdown
Member

This approach looks sensible but another solution to consider (it may have problems) is to move all these objects into the main interpreter GC state. This way, more reference cycles can be broken there and the memory can be reclaimed when the main interpreter does a GC pass.

Thoughts on this approach?

@vstinner

Copy link
Copy Markdown
Member Author

This approach looks sensible but another solution to consider (it may have problems) is to move all these objects into the main interpreter GC state. This way, more reference cycles can be broken there and the memory can be reclaimed when the main interpreter does a GC pass. Thoughts on this approach?

For a simple object, it should be fine. But I don't understand how it would work for more complex structures with finalizers. Which interpreter is supposed to call it? Which interpreter "owns" the object? There is the "strong reference" ownership but the GC list ownership.

IMO we would avoid many not-fun-at-all problems by preventing objects to travel between interpreters ;-)

@pitrou

pitrou commented Jan 15, 2022

Copy link
Copy Markdown
Member

I agree with @vstinner. Another related issue is: if objects can travel between interpreters, how would you make the object allocator per-interpreter (if you want to have a per-interpreter GIL)?

@pablogsal

Copy link
Copy Markdown
Member

I am convinced by these arguments 👍

@vstinner

vstinner commented Jan 16, 2022

Copy link
Copy Markdown
Member Author

I understand the motivation to "clean up everything" by breaking the reference cycles. But for me, the problem is way bigger than that. If you have complex objects with finalizers running complex code, the interpreter in which they are executed matters. For example, if it's a database connection. You may care in which interpreter it's run.

It reminds me a fork() problem when an application creates many objects before calling fork(). The consistency of a database connection, of a socket or a file becomes way harder to control if two processes can access it with "memory copy" of these objects. For example, a common problem is when a process opens a file and then forks: the file is not fully closed before all processes close the "shared" file descriptor (technically, there are more file descriptor copies all pointing to the same file in the kernel).

@vstinner

Copy link
Copy Markdown
Member Author

I am convinced by these arguments +1

I'm not 100% convinced, so it's good to have such discussion :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants