bpo-22393: Fix multiprocessing.Pool hangs if a worker process dies unexpectedly #10441
oesteban wants to merge 16 commits into
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Our records indicate we have not received your CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day. You can check yourself to see if the CLA has been received. Thanks again for your contribution, we look forward to reviewing it!
This PR relates to nipy#2700, and should fix the problem underlying nipy#2548. I first considered adding a control thread that monitors the `Pool` of workers, but that would add considerable overhead from keeping track of PIDs and polling very often. Just adding the core file of [bpo-22393](python/cpython#10441) should fix nipy#2548.
effigies
left a comment
Just a couple comments, pending review from the cpython devs.
Hi @pitrou (or anyone with a say), can you give us a hint about the fate of this PR (even if you honestly think it does not have a very promising future)? Thanks.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase `I have made the requested changes; please review again`. And if you don't make the requested changes, you will be poked with soft cushions!
I have made the requested changes; please review again
Thanks for making the requested changes! @pitrou: please review the changes made to this pull request.
Pinging @pitrou, at least to find out whether the changes point in the right direction.
Sorry, will take a look again. Also @pablogsal you may be interested in this.
Bumping up!
Are there any plans for deprecating multiprocessing? Otherwise, I think this bug should be addressed. If the proposed fix is not the right way of fixing it, please let me know. I'll resolve the conflicts only once I know there is interest in doing so. Thanks very much.
Yes, I can have a look.
I'll have a look too.
pierreglaser
left a comment
Here is a first review. @tomMoral's one should land sometime next week :)
Several of the added tests use sleep to synchronize processes (in particular, they assume that the processes will have been entered by the time sleep finishes). This is very unreliable and will most certainly fail on the slowest buildbots. Please try to add some synchronization to the tests to make them more deterministic.
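One way to replace a fixed sleep with explicit synchronization is sketched below. It is only an illustration, not code from this PR: the names `worker` and `kill_worker_once_started` are hypothetical, and it assumes a POSIX platform where the "fork" start method is available. The child sets an `Event` once it has really entered its task, and the parent waits on that event before simulating the unexpected death.

```python
import multiprocessing as mp
import signal


def worker(started, release):
    # Tell the parent that the task body has really been entered.
    started.set()
    # Block until released (or killed), so the parent controls the timing.
    release.wait()


def kill_worker_once_started(timeout=30.0):
    # Assumption: the "fork" start method is available (POSIX only).
    ctx = mp.get_context("fork")
    started, release = ctx.Event(), ctx.Event()
    proc = ctx.Process(target=worker, args=(started, release))
    proc.start()
    try:
        # Wait for the explicit signal instead of sleeping a fixed time.
        entered = started.wait(timeout)
        # Simulate an unexpected worker death (SIGTERM on POSIX).
        proc.terminate()
        proc.join(timeout)
        return entered, proc.exitcode
    finally:
        # Safety net so the sketch never leaves a stray child behind.
        if proc.is_alive():
            proc.kill()
            proc.join()


if __name__ == "__main__":
    entered, exitcode = kill_worker_once_started()
    print("entered:", entered)
    print("killed by SIGTERM:", exitcode == -signal.SIGTERM)
```

The timeout only bounds how long a failing run can block; a passing run proceeds as soon as the child signals, so slow buildbots do not change the outcome.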
Note that this PR, while improving the current state of the [...]

```python
import sys
from multiprocessing import Pool

pool = Pool(2)
# The worker calls sys.exit(0) while running the task.
pool.apply(sys.exit, (0,))
```

or at unpickling time

```python
import sys
from multiprocessing import Pool


class Failure:
    def __reduce__(self):
        # Unpickling a Failure instance calls sys.exit(0) in the worker.
        return sys.exit, (0,)


pool = Pool(2)
pool.apply(id, (Failure(),))
```

Also, many other problems exist with [...] Maybe a more stable solution would be to actually change the [...]
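Both snippets above make a pool worker exit abruptly. For comparison only, and as an assumption of this note rather than something stated in the thread, the sketch below shows how the standard library's `concurrent.futures.ProcessPoolExecutor` reports a worker that dies mid-task: the pending future raises `BrokenProcessPool` instead of blocking. The function name `submit_dying_task` is hypothetical, `os._exit` stands in for an unexpected death, and the "fork" start method (POSIX only) is assumed.

```python
import multiprocessing as mp
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def submit_dying_task():
    # Assumption: the "fork" start method is available (POSIX only).
    ctx = mp.get_context("fork")
    with ProcessPoolExecutor(max_workers=2, mp_context=ctx) as executor:
        # os._exit() ends the worker immediately, without cleanup,
        # which stands in for an unexpected worker death.
        future = executor.submit(os._exit, 1)
        try:
            future.result(timeout=60)
        except BrokenProcessPool:
            # The executor noticed the dead worker and failed the future.
            return "broken"
    return "completed"


if __name__ == "__main__":
    print(submit_dying_task())
```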
This PR is stale because it has been open for 30 days with no activity.
This PR fixes issue22393.
Three new unit tests have been added.
https://bugs.python.org/issue22393