Fix thread-safety in GC, type cache, and instruction cache#7355
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a CAS-based ownership upgrade ( Changes
Sequence DiagramsequenceDiagram
participant GC as GC Collector
participant PyObj as PyObject
participant RefCnt as RefCount
participant Callback as CallbackHandler
Note over GC,Callback: Guarded ownership attempt before using object
GC->>PyObj: try_to_owned()
activate PyObj
PyObj->>RefCnt: safe_inc() (CAS)
activate RefCnt
RefCnt->>RefCnt: check destructed()
RefCnt->>RefCnt: check strong() == 0
alt guard fails
RefCnt-->>PyObj: return None
else guard passes
RefCnt-->>PyObj: increment strong -> return Some(PyObjectRef)
end
deactivate RefCnt
deactivate PyObj
alt Ownership acquired
GC->>Callback: invoke callback with owned PyObjectRef
else Ownership failed
GC->>GC: skip / deopt to slow path
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Sorry, something went wrong.
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Sorry, something went wrong.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/vm/src/vm/mod.rs (1)
1322-1329: Consolidate duplicated dict-key snapshot branches.Both branches do the same map/collect after selecting key source; extracting only the differing source keeps this tighter and easier to maintain.
Refactor sketch
- } else if cls.is(self.ctx.types.dict_type) { - // Atomic snapshot of dict keys for thread-safe iteration. - let keys = value.downcast_ref::<PyDict>().unwrap().keys_vec(); - return keys.into_iter().map(func).collect(); - } else if cls.is(self.ctx.types.dict_keys_type) { - // Atomic snapshot of dict keys - prevents race condition during iteration - let keys = value.downcast_ref::<PyDictKeys>().unwrap().dict.keys_vec(); + } else if cls.is(self.ctx.types.dict_type) || cls.is(self.ctx.types.dict_keys_type) { + let keys = if cls.is(self.ctx.types.dict_type) { + value.downcast_ref::<PyDict>().unwrap().keys_vec() + } else { + value.downcast_ref::<PyDictKeys>().unwrap().dict.keys_vec() + }; return keys.into_iter().map(func).collect();As per coding guidelines, "When branches differ only in a value but share common logic, extract the differing value first, then call the common logic once to avoid duplicate code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/vm/mod.rs` around lines 1322 - 1329, The two branches that handle dict and dict_keys are duplicated: both compute an atomic snapshot via keys_vec() and then map/collect with func; refactor by first selecting the key source into a single variable (e.g., if cls.is(self.ctx.types.dict_type) use value.downcast_ref::<PyDict>().unwrap().keys_vec(), else if cls.is(self.ctx.types.dict_keys_type) use value.downcast_ref::<PyDictKeys>().unwrap().dict.keys_vec()), assign that Vec to a local variable (e.g., keys), and then perform keys.into_iter().map(func).collect() once; update the code around the cls checks and preserve the unwrap() calls and func usage so behavior is unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/vm/src/vm/mod.rs`:
- Around line 1322-1329: The two branches that handle dict and dict_keys are
duplicated: both compute an atomic snapshot via keys_vec() and then map/collect
with func; refactor by first selecting the key source into a single variable
(e.g., if cls.is(self.ctx.types.dict_type) use
value.downcast_ref::<PyDict>().unwrap().keys_vec(), else if
cls.is(self.ctx.types.dict_keys_type) use
value.downcast_ref::<PyDictKeys>().unwrap().dict.keys_vec()), assign that Vec to
a local variable (e.g., keys), and then perform
keys.into_iter().map(func).collect() once; update the code around the cls checks
and preserve the unwrap() calls and func usage so behavior is unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 5893b646-9765-4d5f-9945-0b0e63fdcb36
⛔ Files ignored due to path filters (1)
Lib/test/_test_multiprocessing.pyis excluded by!Lib/**
📒 Files selected for processing (4)
crates/common/src/refcount.rscrates/vm/src/gc_state.rscrates/vm/src/object/core.rscrates/vm/src/vm/mod.rs
Sorry, something went wrong.
f56a0b3 to
95ba6ee
Compare
March 5, 2026 05:21
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode.rs (1)
630-635: Documentread_cache_ptrpanic behavior for API parity.This method indexes directly and can panic on out-of-bounds; adding
# Panicswould align it withread_cache_u16.Suggested doc update
/// Load a pointer-sized value atomically from the pointer cache at `index`. /// /// Uses a single `AtomicUsize` load to prevent torn reads. + /// + /// # Panics + /// Panics if `index` is out of bounds. pub fn read_cache_ptr(&self, index: usize) -> usize { self.pointer_cache[index].load(Ordering::Relaxed) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/compiler-core/src/bytecode.rs` around lines 630 - 635, The doc comment for read_cache_ptr is missing a Panics section describing that direct indexing into pointer_cache can panic on out-of-bounds; update the documentation for pub fn read_cache_ptr(&self, index: usize) -> usize to include a "# Panics" paragraph mirroring read_cache_u16, stating that it will panic if index is out of bounds of self.pointer_cache (because it uses direct indexing), so callers know the behavior and API parity is maintained.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/frame.rs`:
- Around line 3696-3706: The code pops the attribute owner via self.pop_value()
before confirming the property getter exists, which mutates the stack and causes
load_attr_slow() to double-pop on the slow path; change the sequence in the
read_cache_ptr handling (around read_cache_ptr, PyProperty, get_fget, pop_value,
push_value) to query prop.get_fget() first and only call self.pop_value() after
you have Some(getter), then call getter.call((owner,), vm) and
push_value(result); this ensures the stack is not modified when fget is absent
and preserves correct behavior when falling back to load_attr_slow().
---
Nitpick comments:
In `@crates/compiler-core/src/bytecode.rs`:
- Around line 630-635: The doc comment for read_cache_ptr is missing a Panics
section describing that direct indexing into pointer_cache can panic on
out-of-bounds; update the documentation for pub fn read_cache_ptr(&self, index:
usize) -> usize to include a "# Panics" paragraph mirroring read_cache_u16,
stating that it will panic if index is out of bounds of self.pointer_cache
(because it uses direct indexing), so callers know the behavior and API parity
is maintained.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 831c40c5-46e1-44fe-9ddc-925a16fc0e57
⛔ Files ignored due to path filters (1)
Lib/test/_test_multiprocessing.pyis excluded by!Lib/**
📒 Files selected for processing (7)
crates/common/src/refcount.rscrates/compiler-core/src/bytecode.rscrates/vm/src/builtins/type.rscrates/vm/src/frame.rscrates/vm/src/gc_state.rscrates/vm/src/object/core.rscrates/vm/src/vm/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/vm/src/vm/mod.rs
- crates/vm/src/gc_state.rs
Sorry, something went wrong.
95ba6ee to
2adc355
Compare
March 5, 2026 05:31
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
crates/vm/src/frame.rs (1)
3702-3704: ⚠️ Potential issue | 🔴 CriticalAvoid popping
ownerbefore confirming property getter exists.Line 3702 pops from the stack before
prop.get_fget()is validated at Line 3703. Iffgetis missing, fallback toload_attr_slow()runs with a mutated stack and can double-pop.Proposed fix
- let owner = self.pop_value(); - if let Some(getter) = prop.get_fget() { + if let Some(getter) = prop.get_fget() { + let owner = self.pop_value(); let result = getter.call((owner,), vm)?; self.push_value(result); return Ok(None); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/frame.rs` around lines 3702 - 3704, The code pops the owner unconditionally before checking prop.get_fget(), which corrupts the stack if fget is None and load_attr_slow() runs; change the flow so you first check prop.get_fget() and only call self.pop_value() when the getter exists (i.e., move the let owner = self.pop_value() inside the if let Some(getter) branch), and ensure the else path calls load_attr_slow() without having mutated the stack (use existing stack state/peek rather than pop). Update uses of getter.call((owner,), vm) accordingly so owner is only created when needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/vm/src/builtins/type.rs`:
- Around line 741-754: The code dereferences the raw pointer ptr into &*ptr
(creating a &PyObject) before calling obj.try_to_owned(), causing a potential
use-after-free; change the flow so you do not form a reference before the
ownership upgrade: call try_to_owned (or a new unsafe wrapper that accepts *mut
PyObject) using the raw pointer directly and only create the &PyObject / &*ptr
after try_to_owned succeeds (or return the owned clone without ever forming a
temporary reference), then perform the seq-lock version recheck (entry.version
v2 vs v1) and return cloned; update usages of try_to_owned/safe_inc to accept a
raw pointer if needed and remove the early &*ptr dereference to avoid undefined
behavior.
---
Duplicate comments:
In `@crates/vm/src/frame.rs`:
- Around line 3702-3704: The code pops the owner unconditionally before checking
prop.get_fget(), which corrupts the stack if fget is None and load_attr_slow()
runs; change the flow so you first check prop.get_fget() and only call
self.pop_value() when the getter exists (i.e., move the let owner =
self.pop_value() inside the if let Some(getter) branch), and ensure the else
path calls load_attr_slow() without having mutated the stack (use existing stack
state/peek rather than pop). Update uses of getter.call((owner,), vm)
accordingly so owner is only created when needed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 6116694d-725f-4541-abfd-4e002eba8b2c
📒 Files selected for processing (3)
crates/compiler-core/src/bytecode.rscrates/vm/src/builtins/type.rscrates/vm/src/frame.rs
Sorry, something went wrong.
GC / refcount: - Add safe_inc() check for strong()==0 in RefCount - Add try_to_owned() to PyObject for atomic refcount acquire - Replace strong_count()+to_owned() with try_to_owned() in GC collection and weakref callback paths to prevent TOCTOU races Type cache: - Add proper SeqLock (sequence counter) to TypeCacheEntry - Readers spin-wait on odd sequence, validate after read - Writers bracket updates with begin_write/end_write - Use try_to_owned + pointer revalidation on read path - Call modified() BEFORE attribute modification in set_attr Instruction cache: - Add pointer_cache (AtomicUsize array) to CodeUnits for single atomic pointer load/store (prevents torn reads) - Add try_read_cached_descriptor with try_to_owned + pointer and version revalidation after increment - Add write_cached_descriptor with version-bracketed writes RLock: - Fix release() to check is_owned_by_current_thread - Add _release_save/_acquire_restore methods
2adc355 to
70169e8
Compare
March 5, 2026 08:01
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
crates/vm/src/builtins/type.rs (1)
786-788: ⚠️ Potential issue | 🔴 CriticalDo not form
&*ptrbefore ownership upgrade; this is still a UAF/UB window.At Line 786, creating
&PyObjectfrom rawptrhappens before successfultry_to_owned(). If another thread drops the cached strong ref first, that dereference is invalid.Suggested direction
- let obj: &PyObject = unsafe { &*ptr }; - if let Some(cloned) = obj.try_to_owned() { + // Upgrade from raw pointer atomically first; only then materialize references. + if let Some(cloned) = unsafe { PyObject::try_to_owned_raw(ptr) } { let same_ptr = core::ptr::eq(entry.value.load(Ordering::Relaxed), ptr); if same_ptr && entry.end_read(seq1) { return Some(cloned); } drop(cloned); continue; }(If
try_to_owned_rawdoes not exist yet, add it incrates/vm/src/object/core.rsand keep all pre-upgrade handling pointer-only.)#!/bin/bash set -euo pipefail echo "== cache read path around raw deref ==" sed -n '770,800p' crates/vm/src/builtins/type.rs echo echo "== all raw-pointer -> reference conversions in type cache path ==" rg -n -C3 'unsafe \{ &\*ptr \}|try_to_owned\(' crates/vm/src/builtins/type.rs🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/type.rs` around lines 786 - 788, The code currently forms a reference with `let obj: &PyObject = unsafe { &*ptr };` before calling `obj.try_to_owned()`, which can cause a UAF/UB if another thread drops the cached strong ref; change the upgrade path to operate on the raw pointer only (e.g., add and use a pointer-based upgrade like `try_to_owned_raw` in crates/vm/src/object/core.rs) so you never create `&PyObject` before the ownership is confirmed; update the call site in the type cache (the block using `ptr`, `entry.value.load(Ordering::Relaxed)`, and `try_to_owned`) to call the new raw-pointer upgrade routine and only create a safe reference after the ownership upgrade succeeds and the pointer is still the same.
🧹 Nitpick comments (1)
crates/compiler-core/src/bytecode.rs (1)
619-638: Add focused tests for the new pointer-cache API.Please add direct coverage for round-trip read/write and out-of-bounds panic behavior to prevent regressions in this concurrency-critical path.
✅ Suggested tests
#[cfg(test)] mod tests { use super::*; use alloc::{vec, vec::Vec}; + #[test] + fn test_pointer_cache_roundtrip_and_clone() { + let units = CodeUnits::from(vec![CodeUnit::new(Instruction::Cache, 0u8.into())]); + unsafe { + units.write_cache_ptr(0, 0x1234usize); + } + assert_eq!(units.read_cache_ptr(0), 0x1234usize); + + let cloned = units.clone(); + assert_eq!(cloned.read_cache_ptr(0), 0x1234usize); + } + + #[test] + #[should_panic] + fn test_pointer_cache_oob_panics() { + let units = CodeUnits::from(vec![CodeUnit::new(Instruction::Cache, 0u8.into())]); + let _ = units.read_cache_ptr(1); + } + #[test] fn test_exception_table_encode_decode() {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/compiler-core/src/bytecode.rs` around lines 619 - 638, Add focused unit tests exercising the pointer-cache API: create tests that (1) perform a round-trip using write_cache_ptr and read_cache_ptr to store and retrieve a usize value (covering typical and edge values) and assert equality, and (2) assert that read_cache_ptr and unsafe write_cache_ptr panic or trigger UB when given an out-of-bounds index (use std::panic::catch_unwind or #[should_panic] as appropriate). Locate usages of pointer_cache, read_cache_ptr, and write_cache_ptr to implement these tests, and include concurrent access variants if possible to exercise atomic behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/compiler-core/src/bytecode.rs`:
- Around line 624-627: The Safety docs for unsafe fn write_cache_ptr currently
only require an in-bounds index; extend the contract to state that `value` must
be either 0 (sentinel) or a valid `*const PyObject` encoded as a usize, and that
callers must follow the cache invalidation/upgrade protocol used by
write_cached_descriptor (i.e., ensure version guards are published/checked in
the proper order before any reader dereferences the stored pointer such as the
dereference at descr_ptr in frame.rs); update the doc comment on write_cache_ptr
to enumerate these constraints so callers know they must write only valid
pointer values and respect the version-guard semantics.
In `@crates/vm/src/builtins/type.rs`:
- Around line 106-113: begin_write currently uses sequence.fetch_add(1) which
allows concurrent writers and breaks the SeqLock invariant; change it to acquire
an exclusive writer lock (e.g., a dedicated atomic flag or Mutex) before making
the sequence odd, so only one writer can enter, then increment sequence to mark
the start of write; in end_write release the lock after incrementing sequence to
mark write completion. Update the implementations of begin_write and end_write
and add/use the new writer lock field (referencing sequence, begin_write,
end_write) to ensure writers are serialized.
In `@crates/vm/src/stdlib/thread.rs`:
- Around line 295-305: The _acquire_restore implementation currently ignores the
saved owner; update fn _acquire_restore(&self, state: (usize, u64), _vm:
&VirtualMachine) to unpack and validate the saved owner against the current
thread id (use current_thread_id() or the crate's equivalent) and return an
error if they differ instead of silently accepting it; only when the owner
matches proceed to re-lock by calling self.mu.lock() count times and restore
self.count.store(count, Ordering::Relaxed). Ensure the error return follows the
surrounding error type/contract used by other methods (e.g., match the error
style used by _release_save/_acquire) so callers receive a proper
protocol-mismatch failure.
---
Duplicate comments:
In `@crates/vm/src/builtins/type.rs`:
- Around line 786-788: The code currently forms a reference with `let obj:
&PyObject = unsafe { &*ptr };` before calling `obj.try_to_owned()`, which can
cause a UAF/UB if another thread drops the cached strong ref; change the upgrade
path to operate on the raw pointer only (e.g., add and use a pointer-based
upgrade like `try_to_owned_raw` in crates/vm/src/object/core.rs) so you never
create `&PyObject` before the ownership is confirmed; update the call site in
the type cache (the block using `ptr`, `entry.value.load(Ordering::Relaxed)`,
and `try_to_owned`) to call the new raw-pointer upgrade routine and only create
a safe reference after the ownership upgrade succeeds and the pointer is still
the same.
---
Nitpick comments:
In `@crates/compiler-core/src/bytecode.rs`:
- Around line 619-638: Add focused unit tests exercising the pointer-cache API:
create tests that (1) perform a round-trip using write_cache_ptr and
read_cache_ptr to store and retrieve a usize value (covering typical and edge
values) and assert equality, and (2) assert that read_cache_ptr and unsafe
write_cache_ptr panic or trigger UB when given an out-of-bounds index (use
std::panic::catch_unwind or #[should_panic] as appropriate). Locate usages of
pointer_cache, read_cache_ptr, and write_cache_ptr to implement these tests, and
include concurrent access variants if possible to exercise atomic behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 8a57c331-cc95-44dc-8045-0b04cccb47ed
⛔ Files ignored due to path filters (1)
Lib/test/_test_multiprocessing.pyis excluded by!Lib/**
📒 Files selected for processing (8)
crates/common/src/refcount.rscrates/compiler-core/src/bytecode.rscrates/vm/src/builtins/type.rscrates/vm/src/frame.rscrates/vm/src/gc_state.rscrates/vm/src/object/core.rscrates/vm/src/stdlib/thread.rscrates/vm/src/vm/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/common/src/refcount.rs
- crates/vm/src/vm/mod.rs
Sorry, something went wrong.
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] lib: cpython/Lib/threading.py dependencies:
dependent tests: (148 tests)
[ ] lib: cpython/Lib/zipapp.py dependencies:
dependent tests: (1 tests)
Legend:
|
Sorry, something went wrong.
There was a problem hiding this comment.
♻️ Duplicate comments (2)
crates/vm/src/builtins/type.rs (1)
807-809: ⚠️ Potential issue | 🔴 CriticalUnsafe
&*ptris created before ownership upgrade.At Line 807, the code builds
&PyObjectfrom a raw cache pointer beforetry_to_owned(). Under concurrent writer replacement/drop, this can create a reference to freed memory.Suggested direction
- let obj: &PyObject = unsafe { &*ptr }; - if let Some(cloned) = obj.try_to_owned() { + // Avoid creating `&PyObject` before ownership is safely upgraded. + // Use a raw-pointer upgrade helper (e.g. `PyObject::try_to_owned_from_ptr(ptr)`) + // that performs safe refcount CAS first, then materializes owned ref. + if let Some(cloned) = unsafe { PyObject::try_to_owned_from_ptr(ptr) } {#!/bin/bash set -euo pipefail echo "=== Cache-hit dereference site ===" rg -n -C4 'let obj: &PyObject = unsafe \{ &\*ptr \};|try_to_owned\(' crates/vm/src/builtins/type.rs echo echo "=== Ownership upgrade APIs currently available ===" rg -n -C4 'fn try_to_owned|safe_inc' crates/vm/src/object/core.rs crates/common/src/refcount.rs🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/builtins/type.rs` around lines 807 - 809, The code currently forms a borrowed reference let obj: &PyObject = unsafe { &*ptr } before attempting an ownership upgrade, which can produce a dangling reference under concurrent replacement; instead, perform the ownership upgrade first from the raw pointer (use the available try_to_owned API on the raw pointer/handle returned by entry.value.load(Ordering::Relaxed)) and only if try_to_owned succeeds obtain a safe reference from the owned handle (e.g., owned.as_ref()/deref) to use for validation; also compute same_ptr against the original raw pointer value from entry.value.load so comparisons remain correct (avoid creating any &PyObject from ptr until after successful try_to_owned).crates/vm/src/stdlib/thread.rs (1)
295-303: ⚠️ Potential issue | 🟠 Major
_acquire_restorecurrently ignores saved owner and can accept cross-thread state.At Line 302,
owneris parsed but discarded. That lets a different thread call_acquire_restorewith foreign saved state and proceed, which diverges from the expected restore contract.Suggested fix
fn _acquire_restore(&self, state: PyTupleRef, vm: &VirtualMachine) -> PyResult<()> { let [count_obj, owner_obj] = state.as_slice() else { return Err( vm.new_type_error("_acquire_restore() argument 1 must be a 2-item tuple") ); }; let count: usize = count_obj.clone().try_into_value(vm)?; - let _owner: u64 = owner_obj.clone().try_into_value(vm)?; + let owner: u64 = owner_obj.clone().try_into_value(vm)?; + if owner != current_thread_id() { + return Err(vm.new_runtime_error( + "cannot restore lock acquired by another thread".to_owned(), + )); + } if count == 0 { return Ok(()); }#!/bin/bash set -euo pipefail echo "=== Rust _thread RLock restore implementation ===" sed -n '281,312p' crates/vm/src/stdlib/thread.rs echo echo "=== Python threading protocol reference ===" sed -n '251,263p' Lib/threading.py🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/vm/src/stdlib/thread.rs` around lines 295 - 303, The _acquire_restore implementation parses but ignores the saved owner; change it to validate that the restored owner matches the current thread before accepting the state. In function _acquire_restore, keep the parsed owner (currently assigned to _owner), fetch the current thread's native id (or equivalent runtime thread id used elsewhere in this module), and if owner != current_thread_id return an Err(vm.new_runtime_error(...) or appropriate PyError) so cross-thread restore is rejected; only when the owner matches and count is valid should you proceed with the acquire restore logic. Ensure you reference and use the symbols _acquire_restore, owner (parsed from state), and whatever helper/field provides the current thread id in this module.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/vm/src/builtins/type.rs`:
- Around line 807-809: The code currently forms a borrowed reference let obj:
&PyObject = unsafe { &*ptr } before attempting an ownership upgrade, which can
produce a dangling reference under concurrent replacement; instead, perform the
ownership upgrade first from the raw pointer (use the available try_to_owned API
on the raw pointer/handle returned by entry.value.load(Ordering::Relaxed)) and
only if try_to_owned succeeds obtain a safe reference from the owned handle
(e.g., owned.as_ref()/deref) to use for validation; also compute same_ptr
against the original raw pointer value from entry.value.load so comparisons
remain correct (avoid creating any &PyObject from ptr until after successful
try_to_owned).
In `@crates/vm/src/stdlib/thread.rs`:
- Around line 295-303: The _acquire_restore implementation parses but ignores
the saved owner; change it to validate that the restored owner matches the
current thread before accepting the state. In function _acquire_restore, keep
the parsed owner (currently assigned to _owner), fetch the current thread's
native id (or equivalent runtime thread id used elsewhere in this module), and
if owner != current_thread_id return an Err(vm.new_runtime_error(...) or
appropriate PyError) so cross-thread restore is rejected; only when the owner
matches and count is valid should you proceed with the acquire restore logic.
Ensure you reference and use the symbols _acquire_restore, owner (parsed from
state), and whatever helper/field provides the current thread id in this module.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: 76a8b21b-1f36-480c-9237-ad3e45b51fbb
⛔ Files ignored due to path filters (1)
Lib/test/test_threading.pyis excluded by!Lib/**
📒 Files selected for processing (3)
crates/compiler-core/src/bytecode.rscrates/vm/src/builtins/type.rscrates/vm/src/stdlib/thread.rs
Sorry, something went wrong.
Instead of calling lock()/unlock() N times for recursion depth N, keep parking_lot at 1 level and manage the count ourselves. This makes acquire/release O(1) and matches CPython's _PyRecursiveMutex approach (lock once + set level directly).
Use addr_of! to access ref_count directly from a raw pointer without forming &PyObject first. Applied in type cache and instruction cache hit paths where the pointer may be stale.
411c3a1 to
3e31d22
Compare
March 5, 2026 09:05
- Fix "minimising" -> "minimizing" for cspell - xfail test_thread_safety: dict iteration races with concurrent GC mutations in _finalizer_registry
375b547
into
RustPython:main
Mar 5, 2026
…n#7355) * Fix thread-safety in GC, type cache, and instruction cache GC / refcount: - Add safe_inc() check for strong()==0 in RefCount - Add try_to_owned() to PyObject for atomic refcount acquire - Replace strong_count()+to_owned() with try_to_owned() in GC collection and weakref callback paths to prevent TOCTOU races Type cache: - Add proper SeqLock (sequence counter) to TypeCacheEntry - Readers spin-wait on odd sequence, validate after read - Writers bracket updates with begin_write/end_write - Use try_to_owned + pointer revalidation on read path - Call modified() BEFORE attribute modification in set_attr Instruction cache: - Add pointer_cache (AtomicUsize array) to CodeUnits for single atomic pointer load/store (prevents torn reads) - Add try_read_cached_descriptor with try_to_owned + pointer and version revalidation after increment - Add write_cached_descriptor with version-bracketed writes RLock: - Fix release() to check is_owned_by_current_thread - Add _release_save/_acquire_restore methods * Fix RLock _acquire_restore tuple handling and unxfail threading test * Align type cache seqlock writer protocol with CPython * RLock: use single parking_lot level, track recursion manually Instead of calling lock()/unlock() N times for recursion depth N, keep parking_lot at 1 level and manage the count ourselves. This makes acquire/release O(1) and matches CPython's _PyRecursiveMutex approach (lock once + set level directly). * Add try_to_owned_from_ptr to avoid &PyObject on stale ptrs Use addr_of! to access ref_count directly from a raw pointer without forming &PyObject first. Applied in type cache and instruction cache hit paths where the pointer may be stale. * Fix CI: spelling typo and xfail flaky test_thread_safety - Fix "minimising" -> "minimizing" for cspell - xfail test_thread_safety: dict iteration races with concurrent GC mutations in _finalizer_registry
Summary by CodeRabbit
Bug Fixes
New Features
Other