gh-99113: Make Sure the GIL is Acquired at the Right Places#104208
gh-99113: Make Sure the GIL is Acquired at the Right Places#104208ericsnowcurrently merged 7 commits into
Conversation
|
There's a failure, I can reproduce it on Windows. (originally taken from import threading
import _xxsubinterpreters as interpreters
def foo():
t = threading.Thread(target=interpreters.create)
t.start()
t.join()
foo()
foo()Traceback: C:\Users\KIRILL-1\CLionProjects\cpython> ./python example.py
Running Debug|x64 interpreter...
Fatal Python error: drop_gil: drop_gil: GIL is not locked
Python runtime state: initialized
Current thread 0x00001900 (most recent call first):
<no Python frame> |
Sorry, something went wrong.
Sorry, something went wrong.
ad1e40b to
d6c15f9
Compare
May 5, 2023 21:46
|
@Eclips4, I've rebased this branch. Do you still get the crash? |
Sorry, something went wrong.
d6c15f9 to
2404310
Compare
May 5, 2023 22:03
|
Hello Eric, sorry for the wait PS C:\Users\KIRILL-1\CLionProjects\cpython> ./python example.py
Running Debug|x64 interpreter...
Fatal Python error: drop_gil: drop_gil: GIL is not locked
Python runtime state: initialized
Current thread 0x00003a74 (most recent call first):
<no Python frame> |
Sorry, something went wrong.
|
It's kinda interesting why it fails only on Windows. Maybe try build with buildbots? |
Sorry, something went wrong.
|
Ah, looks like I was releasing the GIL unnecessarily in |
Sorry, something went wrong.
|
FTR, there's the stack trace before my fix: (expand) |
Sorry, something went wrong.
|
Yeah, after last commit error will go away. |
Sorry, something went wrong.
That is interesting. I figured it would be worth exploring why that unexpected situation happened. Below is my analysis. tl;dr in addition to dropping that (So thanks for pointing this out.) Analysis(expand)The immediate problem was that we were trying to release the GIL when it wasn't held (but did already exist). This was happening, specifically, in This may mean I broke that expectation somewhere with one of my relatively recent commits (or in this PR). Or it might be a long-standing bug that I exposed with some new branch in the code. Regardless, the failure implies broken expectations somewhere. Here are the relevant expectations I thought of, particularly relative to the failure and leading up to the call to
I checked each of these in turn, comparing on linux (where I develop) and Windows (where the failure happened), and the expectations were valid. However, in two places the GIL changed state unexpectedly (noted above without a check mark). This only happened on Windows. I'm pretty sure that the following happened:
Why did this only happen on Windows. I expect (but have not verified) that the semantics of releasing and re-acquiring the GIL in the eval loop are different on Windows and linux. ConclusionAll that demonstrates two issues (one of which I've already addressed):
The solution for |
Sorry, something went wrong.
…thongh-104208) This is a pre-requisite for a per-interpreter GIL. Without it this change isn't strictly necessary. However, there is no real downside otherwise.
|
@ericsnowcurrently a bit of a long shot and way late to the party -- but I've recently bisected uwsgi (which deadlocks on python 3.12 only after a reload) and the bisection claims this commit as the culprit -- you can find more analysis in unbit/uwsgi#2659 my guess without debugging much yet is |
Sorry, something went wrong.
|
I believe there's an oversight in this patch which changes the behaviour of this patch "fixes" my problem: diff --git a/Python/pystate.c b/Python/pystate.c
index f14934361da..218c08a8528 100644
--- a/Python/pystate.c
+++ b/Python/pystate.c
@@ -1919,7 +1919,7 @@ _PyThreadState_Swap(_PyRuntimeState *runtime, PyThreadState *newts)
PyThreadState *
PyThreadState_Swap(PyThreadState *newts)
{
- return _PyThreadState_Swap(&_PyRuntime, newts);
+ return _PyThreadState_SwapNoGIL(newts);
}
|
Sorry, something went wrong.
in python#104208 it was inadvertently (?) changed to acquire the GIL
This is a pre-requisite for a per-interpreter GIL. Without it this change isn't strictly necessary. However, there is no downside otherwise.
(This change is broken out from gh-99114.)