Bytecode parity#7481
Conversation
📝 WalkthroughWalkthroughMultiple compiler and runtime components were enhanced: async/generator resume preamble handling, decorator application, class static attribute collection, dict literal optimization, constant-folding passes (unary negative, lists, sets), generator detection in symbol tables, JIT instruction support for StoreFastStoreFast, and generator frame initialization adjustments to prevent spurious trace events. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Sorry, something went wrong.
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: (module 'compile test_peepholer test_sys_settrace' not found) Legend:
|
Sorry, something went wrong.
Compiler changes:
- Remove PUSH_NULL from decorator cal
ls, use CALL 0
- Collect __static_attributes__ from self.xxx = patterns
- Sort __static_attributes__ alphabetically
- Move __classdict__ init before __doc__ in class prologue
- Fold unary negative constants
- Fold constant list/set literals (3+ elements)
- Use BUILD_MAP 0 + MAP_ADD for 16+ dict pairs
- Always run peephole optimizer for s
uperinstructions
- Emit RETURN_GENERATOR for generator
functions
- Add is_generator flag to SymbolTabl
e
- Replace irrefutable if-let with let for ExceptHandler - Remove folded UNARY_NEGATIVE instead of replacing with NOP, enabling chained negation folding - Initialize prev_line to def line for generators/coroutines to suppress spurious LINE events from preamble instructions - Remove expectedFailure markers for now-passing tests
- Add StoreFastStoreFast handling in JIT instructions - Fix cargo fmt in frame.rs - Remove 11 expectedFailure markers for async jump tests in test_sys_settrace that now pass
Using remove() shifts instruction indices and corrupts subsequent references, causing "pop stackref but null found" panics at runtime. Replace folded/combined instructions with NOP instead, which are cleaned up by the existing remove_nops pass.
6adb4fb to
9f952c7
Compare
March 25, 2026 02:58
NOP replacement broke chaining of peephole optimizations (e.g. LOAD_CONST+TO_BOOL then LOAD_CONST+UNARY_NOT for 'not True'). The remove() approach is used by upstream and works correctly here; fold_unary_negative keeps NOP replacement since it doesn't need chaining.
StoreFast uses pop_value_opt() to allow NULL values from LoadFastAndClear in inlined comprehension cleanup paths. StoreFastStoreFast must do the same, otherwise the peephole optimizer's fusion of two StoreFast instructions panics when restoring unbound locals after an inlined comprehension.
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/codegen/src/symboltable.rs (1)
1830-1840: ⚠️ Potential issue | 🟠 Major
<genexpr>never sets the new generator flag.The new flag is only written in the
Yield/YieldFromarms. Generator expressions take thescan_comprehension(..., true)path instead, and thatis_generatorparameter is never copied onto the comprehension scope, so any downstream consumer ofSymbolTable::is_generatorwill still see<genexpr>as a plain comprehension.Possible fix
fn scan_comprehension( &mut self, scope_name: &str, elt1: &ast::Expr, elt2: Option<&ast::Expr>, generators: &[ast::Comprehension], range: TextRange, is_generator: bool, ) -> SymbolTableResult { // Comprehensions are compiled as functions, so create a scope for them: self.enter_scope( scope_name, CompilerScope::Comprehension, self.line_index_start(range), ); + self.tables.last_mut().unwrap().is_generator = is_generator;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/symboltable.rs` around lines 1830 - 1840, Generator expressions created via scan_comprehension(..., true) never set the SymbolTable::is_generator flag; update the code that creates/completes a comprehension (the scan_comprehension path for genexpr) so that when its "is_generator" parameter is true it writes to the current scope's flag (e.g., call tables.last_mut().unwrap().is_generator = true inside scan_comprehension or copy the boolean into the new comprehension scope), mirroring how the Expr::Yield/Expr::YieldFrom arms set is_generator in scan_expression.
🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)
2758-2769: Update the stale decorator stack comment.Line 2759 still documents
prepare_decoratorsas leaving NULL sentinels on the stack, but this path now relies on a plain[dec1, dec2, func]layout plusCALL 0. The code looks fine; the comment is now the misleading part.As per coding guidelines, "Do not delete or rewrite existing comments unless they are factually wrong or directly contradict the new code".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/codegen/src/compile.rs` around lines 2758 - 2769, The comment for prepare_decorators is stale: it mentions leaving NULL sentinels on the stack (e.g. [dec1, NULL, dec2, NULL]) but the current implementation of prepare_decorators and apply_decorators uses a plain [dec1, dec2, func] layout with CALL 0; update the doc comment above prepare_decorators (and the related example in apply_decorators) to describe the actual stack shape and transformation (e.g. push decorators in source order to produce [dec1, dec2, func] and then CALL 0 repeatedly to produce dec1(dec2(func))). Ensure you reference the functions prepare_decorators and apply_decorators so the updated comment is next to the correct code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 1203-1206: The async-generator check is using CodeFlags::GENERATOR
in compile_statement which runs before mark_generator(), so async defs that are
generators slip through; when you compute is_gen (using scope_type ==
CompilerScope::AsyncFunction || self.current_symbol_table().is_generator)
propagate that generator classification into the code flags immediately (or
alternatively change the later check in compile_statement to consult
self.current_symbol_table().is_generator instead of CodeFlags::GENERATOR).
Concretely, set the generator bit on the current CodeFlags/state when is_gen is
true (or update the Line 2498 check to use current_symbol_table().is_generator)
so the async return-value guard sees the generator status before
mark_generator() is called.
- Around line 4512-4597: The collector misses methods with a pos-only first
parameter and doesn't descend into match statements; update
collect_static_attributes to pick the first param by checking
params.posonlyargs.first() before params.args.first() (or otherwise consider
both lists) when computing self_name in collect_static_attributes, and extend
scan_store_attrs to handle the match statement AST node (iterate each match
case/body and scan them) so assignments inside match arms are discovered; refer
to the functions collect_static_attributes and scan_store_attrs to locate where
to change the parameter selection and add the new match-arm recursion.
In `@crates/codegen/src/ir.rs`:
- Around line 192-196: The unary-negative folding pass (fold_unary_negative)
currently leaves a NOP after consuming UNARY_NEGATIVE which prevents
fold_tuple_constants/fold_list_constants/fold_set_constants from recognizing
adjacent LOAD_CONSTs; change fold_unary_negative to emit a real LOAD_CONST with
the negated value (i.e., replace the original LOAD_CONST + UNARY_NEGATIVE
sequence with a single LOAD_CONST of the negative literal) instead of leaving a
NOP so later BUILD_* collection folds will see contiguous LOAD_CONSTs; apply the
same fix where the same pattern appears around the other occurrences noted
(lines ~671-677) so all collection-folding passes work correctly.
In `@crates/vm/src/frame.rs`:
- Around line 708-718: The code incorrectly seeds prev_line with
code.first_line_number when code.flags contains CodeFlags::GENERATOR or
CodeFlags::COROUTINE, which suppresses the first real LINE event for single-line
generator/coroutine bodies; instead, remove the priming of prev_line and
explicitly suppress the preamble opcodes (e.g., RETURN_GENERATOR and POP_TOP)
when emitting LINE events by checking the opcode during event emission rather
than setting prev_line based on code.first_line_number; update the
initialization around prev_line and the LINE-event emission logic to skip those
specific preamble opcodes while leaving prev_line at 0 so real first-line events
are not lost.
---
Outside diff comments:
In `@crates/codegen/src/symboltable.rs`:
- Around line 1830-1840: Generator expressions created via
scan_comprehension(..., true) never set the SymbolTable::is_generator flag;
update the code that creates/completes a comprehension (the scan_comprehension
path for genexpr) so that when its "is_generator" parameter is true it writes to
the current scope's flag (e.g., call tables.last_mut().unwrap().is_generator =
true inside scan_comprehension or copy the boolean into the new comprehension
scope), mirroring how the Expr::Yield/Expr::YieldFrom arms set is_generator in
scan_expression.
---
Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 2758-2769: The comment for prepare_decorators is stale: it
mentions leaving NULL sentinels on the stack (e.g. [dec1, NULL, dec2, NULL]) but
the current implementation of prepare_decorators and apply_decorators uses a
plain [dec1, dec2, func] layout with CALL 0; update the doc comment above
prepare_decorators (and the related example in apply_decorators) to describe the
actual stack shape and transformation (e.g. push decorators in source order to
produce [dec1, dec2, func] and then CALL 0 repeatedly to produce
dec1(dec2(func))). Ensure you reference the functions prepare_decorators and
apply_decorators so the updated comment is next to the correct code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
Run ID: ea6cba1a-c905-4bf2-831a-5edbac896928
⛔ Files ignored due to path filters (7)
Lib/test/test_compile.pyis excluded by!Lib/**Lib/test/test_peepholer.pyis excluded by!Lib/**Lib/test/test_sys_settrace.pyis excluded by!Lib/**crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ands.snapis excluded by!**/*.snapcrates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_mixed.snapis excluded by!**/*.snapcrates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ors.snapis excluded by!**/*.snapcrates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snapis excluded by!**/*.snap
📒 Files selected for processing (5)
crates/codegen/src/compile.rscrates/codegen/src/ir.rscrates/codegen/src/symboltable.rscrates/jit/src/instructions.rscrates/vm/src/frame.rs
Sorry, something went wrong.
ea5a6cd
into
RustPython:main
Mar 25, 2026
- Set is_generator flag for generator expressions in scan_comprehension - Fix posonlyargs priority in collect_static_attributes first param - Add match statement support to scan_store_attrs - Fix stale decorator stack comment - Reorder NOP removal after fold_unary_negative for better collection folding
- Set is_generator flag for generator expressions in scan_comprehension - Fix posonlyargs priority in collect_static_attributes first param - Add match statement support to scan_store_attrs - Fix stale decorator stack comment - Reorder NOP removal after fold_unary_negative for better collection folding
- Set is_generator flag for generator expressions in scan_comprehension - Fix posonlyargs priority in collect_static_attributes first param - Add match statement support to scan_store_attrs - Fix stale decorator stack comment - Reorder NOP removal after fold_unary_negative for better collection folding
* Match CPython LOAD_SPECIAL stack semantics for with/async-with LOAD_SPECIAL now pushes (callable, self_or_null) matching CPython's CALL convention, instead of a single bound method: - Function descriptors: push (func, self) - Plain attributes: push (bound, NULL) Updated all with-statement paths: - Entry: add SWAP 3 after SWAP 2, remove PUSH_NULL before CALL 0 - Normal exit: remove PUSH_NULL before CALL 3 - Exception handler (WITH_EXCEPT_START): read exit_func at TOS-4 and self_or_null at TOS-3 - Suppress block: 3 POP_TOPs after POP_EXCEPT (was 2) - FBlock exit (preserve_tos): SWAP 3 + SWAP 2 rotation - UnwindAction::With: remove PUSH_NULL Stack effects updated: LoadSpecial (2,1), WithExceptStart (7,6) * Normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT Add LOAD_FAST_CHECK → LOAD_FAST and JUMP_BACKWARD_NO_INTERRUPT → JUMP_BACKWARD to opname normalization in dis_dump.py. These are optimization variants with identical semantics. * Add EXTENDED_ARG to SKIP_OPS, normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT * Remove duplicate return-None when block already has return Skip duplicate_end_returns for blocks that already end with LOAD_CONST + RETURN_VALUE. Run DCE + unreachable elimination after duplication to remove the now-unreachable original return block. * Improve __static_attributes__ collection accuracy - Support tuple/list unpacking targets: (self.x, self.y) = val - Skip @staticmethod and @classmethod decorated methods - Use scan_target_for_attrs helper for recursive target scanning * Use method mode for function-local import attribute calls Function-local imports (scope is Local+IMPORTED) should use method mode LOAD_ATTR like regular names, not plain mode. Only module/class scope imports use plain LOAD_ATTR + PUSH_NULL. * Optimize constant iterable before GET_ITER to LOAD_CONST tuple Convert BUILD_LIST/SET 0 + LOAD_CONST + LIST_EXTEND/SET_UPDATE + GET_ITER to just LOAD_CONST (tuple) + GET_ITER, matching CPython's optimization for constant list/set literals in for-loop iterables. Also fix is_name_imported to use method mode for function-local imports, and improve __static_attributes__ accuracy (skip @classmethod/@staticmethod, handle tuple/list unpacking targets). * Fix cell variable ordering: parameters first, then alphabetical CPython orders cell variables with parameter cells first (in parameter definition order), then non-parameter cells sorted alphabetically. Previously all cells were sorted alphabetically. Also add for-loop iterable optimization: constant BUILD_LIST/SET before GET_ITER is folded to just LOAD_CONST tuple. * Emit COPY_FREE_VARS before MAKE_CELL matching CPython order CPython emits COPY_FREE_VARS first, then MAKE_CELL instructions. Previously RustPython emitted them in reverse order. * Fix RESUME AfterYield encoding to match CPython 3.14 (value 5) CPython 3.14 uses RESUME arg=5 for after-yield, not 1. Also reorder COPY_FREE_VARS before MAKE_CELL and fix cell variable ordering (parameters first, then alphabetical). * Address code review feedback from #7481 - Set is_generator flag for generator expressions in scan_comprehension - Fix posonlyargs priority in collect_static_attributes first param - Add match statement support to scan_store_attrs - Fix stale decorator stack comment - Reorder NOP removal after fold_unary_negative for better collection folding * Fold constant list/set/tuple literals in compiler When all elements of a list/set/tuple literal are constants and there are 3+ elements, fold them into a single constant: - list: BUILD_LIST 0 + LOAD_CONST (tuple) + LIST_EXTEND 1 - set: BUILD_SET 0 + LOAD_CONST (tuple) + SET_UPDATE 1 - tuple: LOAD_CONST (tuple) This matches CPython's compiler optimization and fixes the most common bytecode difference (92/200 sampled files). Also add bytecode comparison scripts (dis_dump.py, compare_bytecode.py) for systematic parity tracking. * Use BUILD_MAP 0 + MAP_ADD for large dicts (>= 16 pairs) Match CPython's compiler behavior: dicts with 16+ key-value pairs use BUILD_MAP 0 followed by MAP_ADD for each pair, instead of pushing all keys/values on the stack and calling BUILD_MAP N. * Fix clippy warnings and cargo fmt * fix surrogate
Compiler changes:
- Remove PUSH_NULL from decorator cal ls, use CALL 0
- Collect static_attributes from self.xxx = patterns
- Sort static_attributes alphabetically
- Move classdict init before doc in class prologue
- Fold unary negative constants
- Fold constant list/set literals (3+ elements)
- Use BUILD_MAP 0 + MAP_ADD for 16+ dict pairs
- Always run peephole optimizer for s uperinstructions
- Emit RETURN_GENERATOR for generator functions
- Add is_generator flag to SymbolTabl e
Summary by CodeRabbit
Bug Fixes
Performance Improvements