Fix deadlock after fork by youknowone · Pull Request #7205 · RustPython/RustPython

youknowone

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -nP 'fn force_unlock_read|fn force_unlock_write' --type=rust -C3

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

rg -nP 'type PyRwLock|struct PyRwLock|impl.*PyRwLock' --type=rust -C5

Repository: RustPython/RustPython

Length of output: 2107

🏁 Script executed:

rg -nP 'type RawRwLock|struct RawRwLock|impl.*RawRwLock|fn force_unlock' --type=rust -C3

Repository: RustPython/RustPython

Length of output: 5373

🏁 Script executed:

rg -nP 'force_unlock_read|force_unlock_write' --type=rust -B2 -A2

Repository: RustPython/RustPython

Length of output: 2026

🏁 Script executed:

rg -nP 'fn force_unlock_read|fn force_unlock_write' --type=rust -B5 -A5

Repository: RustPython/RustPython

Length of output: 47

🏁 Script executed:

rg -nP 'parking_lot|force_unlock' crates/common/src/lock.rs -B3 -A3

Repository: RustPython/RustPython

Length of output: 306

🏁 Script executed:

rg -nP 'trait.*force_unlock|extension|impl.*RwLock' --type=rust -C3

Repository: RustPython/RustPython

Length of output: 50377

🏁 Script executed:

cat crates/common/src/lock/cell_lock.rs | grep -A10 "force_unlock"

Repository: RustPython/RustPython

Length of output: 47

🌐 Web query:

parking_lot RwLock force_unlock_read force_unlock_write fork safety

💡 Result:

parking_lot’s RwLock::{force_unlock_read, force_unlock_write} (actually provided by lock_api::RwLock) are not “recovery” tools for fork().

They are unsafe escape hatches meant only for the case where the current thread still logically owns the lock but you intentionally leaked the guard (e.g., mem::forget(guard) for FFI), and you need to unlock without having the guard object anymore. Calling them when the lock isn’t held in that mode is undefined behavior. [1]
After fork() in a multi-threaded process, only the forking thread exists in the child, but the child inherits the memory state of all mutex/rwlock objects. If another (now-nonexistent) thread held a lock at fork time, that lock can remain locked forever in the child, and attempting to use such locks can deadlock or worse. [2]
POSIX’s intended mitigation is “fork then immediately exec” (avoid touching library code/allocators/locks in between), or to use pthread_atfork() handlers to put your process into a consistent lock state across fork—but that only works if you can reliably coordinate all relevant locks. [2][3]

So: do not use force_unlock_read/write to “make the child safe after fork”. It violates the ownership assumptions those methods require [1], and it cannot make the protected data consistent if a different thread was in the middle of updating it at fork time. [2]

Sources: [1] [2] [3]

Critical unsafe code misuse: force_unlock_read and force_unlock_write are not safe to use for fork() recovery.

The force_unlock_read() and force_unlock_write() methods from parking_lot are designed only for cases where the current thread still logically owns the lock but intentionally leaked the guard (e.g., via mem::forget). They are not meant to recover from locks held by dead threads after fork(), as they violate the ownership assumptions these methods require. Calling them when a different (now-nonexistent) thread held the lock is undefined behavior.

After fork(), using pthread_atfork() handlers or relying on the POSIX principle of "fork then immediately exec" are the intended mitigations. The current approach of force-unlocking locks held by other threads is unsafe and should not be used.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@crates/vm/src/gc_state.rs` around lines 793 - 856, The code incorrectly uses
parking_lot's force_unlock_read/force_unlock_write (and force_unlock) in
force_unlock_after_fork and force_unlock_rwlock_after_fork to recover locks held
by other threads after fork, which is undefined behavior; remove any calls to
force_unlock_read/force_unlock_write/force_unlock and stop trying to forcibly
recover locks from dead threads. Instead, change force_unlock_after_fork and
force_unlock_rwlock_after_fork to not perform unsafe forcible unlocks: either
(A) make them a no-op that logs/returns an Err indicating fork-without-exec is
unsupported, or (B) abort/panic with a clear message instructing callers to use
pthread_atfork handlers or ensure fork is followed by exec; also update any
callers to handle the error or avoid calling these functions after fork. Ensure
references: update function force_unlock_after_fork, helper
force_unlock_rwlock_after_fork, and remove usages of
PyRwLock::force_unlock_read, PyRwLock::force_unlock_write, and
Mutex::force_unlock in this module.