bpo-44422: Fix threading.enumerate() reentrant call#26727
Conversation
|
@pablogsal @pitrou @serhiy-storchaka @methane: I'm not sure about this change, would you mind to have a look? |
Sorry, something went wrong.
|
If the CI tests pass, I will try the buildbot label to check on more platforms. |
Sorry, something went wrong.
|
🤖 New build scheduled with the buildbot fleet by @vstinner for commit 2907ca88719503493acd930790ffdb978e73dbbb 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
Sorry, something went wrong.
|
(@iritkatriel : a concrete example of GC-induced reentrancy issue) |
Sorry, something went wrong.
|
I ran manual tests on this PR (my laptop has 8 logical CPUs, 4 physical Intel CPUs):
|
Sorry, something went wrong.
|
Oops, there is a typo in a comment :-( "Use a reentrant lock to allocate reentrant calls": you should read "to allow". I will update my PR later, but I prefer to wait until the buildbot jobs complete. |
Sorry, something went wrong.
pitrou
left a comment
There was a problem hiding this comment.
Indeed, a RLock seems like the best effort solution here. Presumably, only the non-mutating code protected by _active_limbo_lock can be called reentrantly (why would a reentrate call mutate the locks table?).
Still, a reminder that a lot of things can unfortunately happen in destructors and trigger reentrancy into unsuspecting code.
Sorry, something went wrong.
These bugs are surprising. Nobody expects the GC reentrancy! In OpenStack, it's even more surprising since threading.current_thread() is monkey-patched for even more fun! |
Sorry, something went wrong.
Really? For what kind of fun? :-o |
Sorry, something went wrong.
Python became too deterministic and boring. It's time to make it non deterministic again! |
Sorry, something went wrong.
|
9 Refleak builds failed:
test__xxsubinterpreters and/or test_threading failed: |
Sorry, something went wrong.
|
The same lock is used in a few other places. Maybe in one of them it's not right to make it re-entrant? |
Sorry, something went wrong.
|
There is also another |
Sorry, something went wrong.
Ah, the https://bugs.python.org/issue42972#msg385297 bug strikes back. I wrote #26727 to fix the leak. |
Sorry, something went wrong.
Oh, nicely stopped @iritkatriel! I will update the PR, once #26727 is merged. |
Sorry, something went wrong.
The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call.
Issue spotted by Irit.
|
I fixed _after_fork() and the typo (allocate => allow), and I rebased it on top of the merged #26734 fix. @iritkatriel: Would you mind to review the updated PR? |
Sorry, something went wrong.
|
This comment is now obsolete: |
Sorry, something went wrong.
Right @iritkatriel, I plan to write a second change only for that, but only change it in the main branch to avoid any risk of regression in stable branches. Oh I forgot to mention it here, I have a local patch for it ;-) So @iritkatriel, does it look good to you? |
Sorry, something went wrong.
iritkatriel
left a comment
There was a problem hiding this comment.
LGTM
Sorry, something went wrong.
|
Thanks @vstinner for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.9. |
Sorry, something went wrong.
The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call. (cherry picked from commit 243fd01) Co-authored-by: Victor Stinner <vstinner@python.org>
The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call. (cherry picked from commit 243fd01) Co-authored-by: Victor Stinner <vstinner@python.org>
|
Thanks everyone for your useful reviews ;-) I wasn't confident at all that this change was safe. At least, the responsibility of any possible regression is now distributed :-D |
Sorry, something went wrong.
@iritkatriel: I created #26741 to use again the |
Sorry, something went wrong.
The threading.enumerate() function now uses a reentrant lock to prevent a hang on reentrant call.
The threading.enumerate() function now uses a reentrant lock to
prevent a hang on reentrant call.
https://bugs.python.org/issue44422