◐ Shell
reader mode source ↗
Skip to content

Add GC infrastructure: object tracking, tp_clear, and helper methods#6994

Merged
youknowone merged 1 commit into
RustPython:mainfrom
youknowone:gc-infra
Feb 4, 2026
Merged

Add GC infrastructure: object tracking, tp_clear, and helper methods#6994
youknowone merged 1 commit into
RustPython:mainfrom
youknowone:gc-infra

Conversation

@youknowone

@youknowone youknowone commented Feb 4, 2026

Copy link
Copy Markdown
Member

Summary by CodeRabbit

  • Bug Fixes

    • Corrected garbage collection enablement and threshold handling so collections trigger reliably when limits are reached.
  • Refactor

    • Reworked finalization and weak-reference handling to improve memory-safety during object cleanup and reduce risks of use-after-free or missed callbacks.
  • Chores

    • Added a new entry to the spell-check dictionary.

@coderabbitai

coderabbitai Bot commented Feb 4, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Adds GC-aware finalization and weakref-clearing APIs on PyObject, changes weakref clearing to collect callbacks for deferred invocation, and updates GC state to enforce enablement and reset gen0 counters; also adds a spell-check dictionary token.

Changes

Cohort / File(s) Summary
Spell-check Dictionary
\.cspell.dict/cpython.txt
Added token finalizers.
GC state
crates/vm/src/gc_state.rs
maybe_collect now honors GC enablement and gen0 threshold; collect and collect_force reset gen0 count (placeholders for main collection path).
Object finalization & weakrefs
crates/vm/src/object/core.rs
Reworked weakref clearing to gather callbacks during traversal and invoke them outside locks. Added GC-oriented PyObject APIs: try_call_finalizer, gc_clear_weakrefs_collect_callbacks, gc_get_referent_ptrs, gc_clear_raw, gc_clear, gc_has_clear; added clear_for_gc_collect_callbacks on WeakRefList.

Sequence Diagram(s)

sequenceDiagram
    participant GC as Garbage Collector
    participant Obj as PyObject
    participant WRL as WeakRefList
    participant CB as Callback

    GC->>Obj: gc_clear_weakrefs_collect_callbacks()
    Obj->>WRL: clear_for_gc_collect_callbacks()
    WRL->>WRL: Unlink weakrefs, collect (weakref, callback) pairs
    WRL-->>Obj: return pairs
    Obj-->>GC: Vec<(PyWeak, PyObjectRef)>

    GC->>Obj: try_call_finalizer()
    Obj->>CB: invoke __del__
    CB-->>Obj: returns (object may be resurrected)
    Obj-->>GC: finalizer result

    GC->>Obj: gc_clear()
    Obj->>Obj: tp_clear / clear member slots (extract refs)
    Obj-->>GC: extracted references

    GC->>CB: invoke deferred callbacks (outside locks)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

"I nudge the heap with whiskers bright,
I gather weakrefs in the night,
Finalizers hum, callbacks queue,
The collector hops — and so do you! 🐇"

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: GC infrastructure additions including object tracking, tp_clear implementation, and new helper methods across multiple files.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the gc-infra branch 2 times, most recently from a1a8488 to 2766770 Compare February 4, 2026 06:53
@youknowone youknowone marked this pull request as ready for review February 4, 2026 07:51

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@crates/vm/src/object/core.rs`:
- Around line 1151-1170: In try_call_finalizer, guard against
double-finalization by checking the FINALIZED flag before marking and invoking
__del__: if self.is_gc_finalized() return early; otherwise call
self.set_gc_finalized() and proceed to call the del slot (class().slots.del) and
handle errors with vm.run_unraisable as before; this ensures repeated calls to
try_call_finalizer won't invoke __del__ twice.
- Around line 1217-1224: In gc_clear_raw, the slot-clearing loop uses
non-blocking slot.try_write() which may skip clearing slots; replace try_write()
with the blocking write() (or otherwise unwrap/panic on failure) when iterating
obj.0.slots so each slot is deterministically cleared (matching the set_slot
pattern); ensure you still take() the guard and push the value into result, and
keep the exclusive-access assumption from gc_clear_raw intact.

Object tracking (CPython _PyObject_GC_TRACK/UNTRACK equivalent):
- PyRef::new_ref() now calls gc_state().track_object() for traceable objects
- track_object() sets GcBits::TRACKED flag via set_gc_tracked()
- Static types call clear_gc_tracked() since they are immortal
- is_gc_tracked() now checks the TRACKED bit flag

tp_clear infrastructure for breaking reference cycles:
- Add clear_for_gc_collect_callbacks() to WeakRefList (CPython: handle_weakrefs)
- Add try_call_finalizer() for GC to call __del__ (CPython: PyObject_CallFinalizerFromDealloc)
- Add gc_clear_raw()/gc_clear() for tp_clear (CPython: subtype_clear)
- Add gc_get_referent_ptrs() for raw pointer traversal
- Add gc_has_clear() to check clear capability

GC state improvements:
- maybe_collect() now checks gen0 threshold (CPython: _PyObject_GC_Alloc)
- collect()/collect_force() reset gen0 count

WeakRefList::clear() cleanup:
- Drop lock before invoking callbacks to avoid deadlock

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/vm/src/object/core.rs`:
- Around line 1210-1247: gc_clear_raw currently clears vtable-held refs and
slots but never clears the instance dict even though gc_has_clear() returns true
when self.0.dict.is_some(); add a dict-clearing step in gc_clear_raw that takes
the dict out of obj.0.dict (e.g. use take() or replace(None) on obj.0.dict),
push the removed dict into the result Vec as a PyObjectRef, and thereby break
cycles involving instance dict entries; alternatively, if dicts should not be
cleared here, remove the dict.is_some() check from gc_has_clear() instead.
🧹 Nitpick comments (1)
crates/vm/src/object/core.rs (1)

437-479: Consider extracting shared logic with clear() to reduce duplication.

The body of clear_for_gc_collect_callbacks is nearly identical to clear() (lines 389-425). Both iterate the list, mark weakrefs as dead, unlink nodes, and collect callbacks. The only difference is whether callbacks are invoked immediately or returned.

A helper method could consolidate the common traversal logic:

fn clear_and_collect_callbacks(&self, obj: &PyObject) -> Vec<(PyRef<PyWeak>, PyObjectRef)> {
    // ... shared traversal logic ...
}

Then clear() would call this helper and invoke callbacks, while clear_for_gc_collect_callbacks() would just return them.

This is optional since the current implementation is correct and maintainable.

Hide details View details @youknowone youknowone merged commit 0dcc975 into RustPython:main Feb 4, 2026
13 checks passed
@youknowone youknowone deleted the gc-infra branch February 4, 2026 09:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant