bpo-46210: Fix deadlock in print.#30310
Conversation
|
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA). CLA MissingOur records indicate the following people have not signed the CLA: For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day You can check yourself to see if the CLA has been received. Thanks again for the contribution, we look forward to reviewing it! |
Sorry, something went wrong.
sweeneyde
left a comment
There was a problem hiding this comment.
Just some superficial things, I'm not that knowledgable about the whole threading/forking situation.
Sorry, something went wrong.
|
It would be nice to see a test case added for these methods. If a threading/forking case can be made to pass consistently, that would be nice, but even if not, just a simple exercise of the |
Sorry, something went wrong.
|
Thanks for the tips! I know exactly what to learn about next. Will push some updates after I learn a bit more. I think the general "unlock the thing" idea probably makes sense, but I'm sure I tripped over myself in 17 different ways in the initial patch. Curious to hear from Victor about whether the overall goal is a sane one. Either way, more coming soon. Happy New Year. :) |
Sorry, something went wrong.
|
Ok, re: Just updated the pull request with what (I think?) is correct reference counting. All the tests pass, but I wanted to check for reference leaks like you suggested. So I ran several of the relevant tests with Supposing there's a reference leak, would it be noticed by Is there a standard way to check for leaks in situations like that, or is it better to just use a brute force strategy like I did in the comment above (i.e., calling every function twice and checking |
Sorry, something went wrong.
|
Hi there. :) After spending some time thinking about how to exercise the lock re-initialization function in a test, I realized that the best way to overcome my noobish-ness here is to follow the example of The "meat" of the implementation we've already discussed doesn't change (thanks again for the tips @sweeneyde). I've just moved things to different files, in order to maximize the similarity with what Here's what a grep of the codebase for the two functions looks like: And here's the The goal was to get line-by-line correspondence with what @vstinner does with the Now, the only difference arises inside the implementation, because the import lock is a global variable inside As for everything else: (1) the PEP 7 formatting should be good now, (2) @the-knights-who-say-ni seem to have given me the "CLA signed" badge just now, (3) in playing with all this, I stumbled on another small unrelated bug in the test suite and fixed it, so I'll submit that as a separate bpo & pull request. Let me know if there's anything else you'd like me to do. Thanks again for all the help. :) |
Sorry, something went wrong.
JelleZijlstra
left a comment
There was a problem hiding this comment.
Agree that this needs tests. It also adds a new Python-accessible method. What happens if Python code calls it?
Sorry, something went wrong.
|
Re: Helpful comments from @JelleZijlstra
100% agreed, the Python-accessible method wasn't necessary. I needed to learn a bit more before I could do this correctly. Just pushed some changes to address this. The updated approach:
The test's behavior is to cause the deadlock described in bpo-46210 if and only if the lock is not being reinitialized correctly. Just like the other C-level locks in the codebase, this is something that should be reinitialized on every fork, so you (@JelleZijlstra) were absolutely right to point out that a Python-accessible interface was an awkward fit. Further, I want to make it clear to other devs precisely what situation the bugfix is targeting, so the test contains a description of what the bug looks like without the fix, and which specific C function was added to fix it. Summary: the test can be run with |
Sorry, something went wrong.
|
In addition to the above: In the course of adding the test, I found a small bug in the That won't be relevant to anyone who runs the test as is, since the PR fixes the deadlock. However, if anyone feels like running the above test after removing the bugfix to verify that the test fails, as it should (if we remove That's due to the above mentioned bug in Should I submit another bpo for this small bug, or just send a PR directly? Should be a one-line fix. Just need to change |
Sorry, something went wrong.
|
Thanks, can you open another issue for the test.support fix? (I don't have time right now to look at this PR in detail, but I'll review it again.) |
Sorry, something went wrong.
Will do! :) Thanks for the help. |
Sorry, something went wrong.
This commit: * Properly identifies whether we need to reinit a stream's lock. * Changes 'stdio' to 'stream' in several functions for clarity. * Also reinitialize stdin's buffer->lock at fork.
|
Mind if I rebase this on main and force push a small update? I fixed the PR to be consistent with #30928 (bpo-46541). i.e., (P.S. Am I imagining things, or do recent changes in main feel very Sam Gross-y? Trying not to get too excited, but I hope that's what's going on. 🎉) |
Sorry, something went wrong.
|
Generally better to merge main, so that the review history gets preserved. |
Sorry, something went wrong.
|
Cool, so this pattern?
Apologies for the dumb question. I've used git socially and github personally, but I'm still fairly new to using github socially. :) |
Sorry, something went wrong.
|
Something like this:
GitHub also lets you do this in the web interface if there's a conflict (the "Resolve conflicts" button at the bottom of the thread). |
Sorry, something went wrong.
|
Thanks! |
Sorry, something went wrong.
Add
_at_fork_reinithandler tobuffer->lockofstdoutandstderr, to fix a deadlock in print.https://bugs.python.org/issue46210