Align specialization guards and caching with CPython by youknowone · Pull Request #7341 · RustPython/RustPython

youknowone

youknowone changed the title ~~Specialization~~ Align specialization guards and caching with CPython

Mar 4, 2026

- Disable ForIterGen specialization (falls through to generic
  path) because inline generator frame resumption is needed
  for correct debugger StopIteration visibility (test_bdb)
- Use downcast_ref_if_exact for PyNativeFunction in CALL
  specialization guards
- Add can_specialize_call guard for class __init__ specialization
- Remove expectedFailure for test_bad_newobj_args (now passing)

Introduce FastLocalsData enum with Heap and DataStack variants
so non-generator/coroutine frames allocate localsplus on the VM
datastack instead of the heap. Includes materialize_to_heap for
migration when needed (e.g. generator suspension).

Move key_eq call outside the read lock guard to avoid
potential deadlock when Python __eq__ re-enters dict
mutation paths. Matches the existing pattern in lookup().

This was referenced

Mar 5, 2026

youknowone added a commit to youknowone/RustPython that referenced this pull request

Mar 22, 2026

* vm: complete specialized opcode dispatch paths

* vm: cache LOAD_GLOBAL with dict entry hints

* vm: align adaptive specialization counters with CPython backoff

* vm: apply cooldown counter on specialization success paths

* vm: retain LOAD_GLOBAL specializations on misses

* vm: keep attr and call specializations on guard misses

* vm: retain store-attr and store-subscr specializations on misses

* vm: retain specialization opcodes on generic fallback paths

* vm: align jump-backward specialization defaults with CPython

* vm: retain exact-args call specializations on misses

* vm: retain SEND_GEN specialization on non-coroutine sends

* vm: specialize list.append calls like CPython CALL_LIST_APPEND

* vm: set cooldown on LOAD_ATTR_CLASS specialization

* vm: specialize bound method object CALL paths

* vm: specialize CALL_KW for bound method objects

* vm: use current-state function version for CALL_KW specialization

* vm: align CALL/CALL_KW pyfunction specialization with CPython

* vm: drop call-site identity caches in generic CALL specializations

* vm: align builtin type call specializations with CPython guards

* vm: align builtin CALL guards with CPython self_or_null semantics

* vm: require exact list in CALL_LIST_APPEND fast path

* vm: align CALL builtin/class specialization flow with CPython

* vm: tighten len/isinstance CALL specializations to builtin guards

* vm: gate CALL_BUILTIN_CLASS on type vectorcall like CPython

* vm: run non-py CALL specializations via direct vectorcall

* vm: align class-call specialization branching with CPython

* Fix CI: disable ForIterGen, tighten CALL guards

- Disable ForIterGen specialization (falls through to generic
  path) because inline generator frame resumption is needed
  for correct debugger StopIteration visibility (test_bdb)
- Use downcast_ref_if_exact for PyNativeFunction in CALL
  specialization guards
- Add can_specialize_call guard for class __init__ specialization
- Remove expectedFailure for test_bad_newobj_args (now passing)