◐ Shell
reader mode source ↗
Skip to content

Bytecode parity#7481

Merged
youknowone merged 7 commits into
RustPython:mainfrom
youknowone:bytecode-parity-phase1
Mar 25, 2026
Merged

Bytecode parity#7481
youknowone merged 7 commits into
RustPython:mainfrom
youknowone:bytecode-parity-phase1

Conversation

@youknowone

@youknowone youknowone commented Mar 23, 2026

Copy link
Copy Markdown
Member

Compiler changes:
- Remove PUSH_NULL from decorator cal ls, use CALL 0
- Collect static_attributes from self.xxx = patterns
- Sort static_attributes alphabetically
- Move classdict init before doc in class prologue
- Fold unary negative constants
- Fold constant list/set literals (3+ elements)
- Use BUILD_MAP 0 + MAP_ADD for 16+ dict pairs
- Always run peephole optimizer for s uperinstructions
- Emit RETURN_GENERATOR for generator functions
- Add is_generator flag to SymbolTabl e

Summary by CodeRabbit

  • Bug Fixes

    • Fixed spurious trace events in generator and coroutine functions.
  • Performance Improvements

    • Optimized compilation of decorators, class attributes, constants, and dictionary literals to reduce memory usage and execution overhead.

@coderabbitai

coderabbitai Bot commented Mar 23, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Multiple compiler and runtime components were enhanced: async/generator resume preamble handling, decorator application, class static attribute collection, dict literal optimization, constant-folding passes (unary negative, lists, sets), generator detection in symbol tables, JIT instruction support for StoreFastStoreFast, and generator frame initialization adjustments to prevent spurious trace events.

Changes

Cohort / File(s) Summary
Compiler codegen – Resume and decorators
crates/codegen/src/compile.rs
Updated ReturnGenerator + PopTop emission to apply to both async and generator scopes (via is_generator check). Reworked decorator compilation: removed PushNull after each decorator, changed apply_decorators to emit Call { argc: 0 } per iteration instead of argc: 1.
Compiler codegen – Class static attributes
crates/codegen/src/compile.rs
Introduced collect_static_attributes and scan_store_attrs helpers to scan class bodies for self-attribute assignments in control-flow blocks. Integrated collection into compile_class_body and emitted deterministic __static_attributes__ tuple. Adjusted __classdict__ initialization ordering above __doc__ storage.
Compiler codegen – Dict optimization
crates/codegen/src/compile.rs
Optimized large dict literal generation: switched from BuildMap { count: len } to incremental BuildMap { count: 0 } + repeated MapAdd to reduce stack usage.
IR constant folding
crates/codegen/src/ir.rs
Added three new constant-folding passes: fold_unary_negative() (negates constants), fold_list_constants() (folds constant lists into tuple + LIST_EXTEND), fold_set_constants() (folds constant sets into tuple + SET_UPDATE). Reordered finalization pipeline and ensured peephole_optimize() always runs.
Symbol table – Generator detection
crates/codegen/src/symboltable.rs
Added pub is_generator: bool field to SymbolTable, initialized to false. Updated SymbolTableBuilder to mark scopes as generators when yield or yield from expressions are encountered.
JIT instruction support
crates/jit/src/instructions.rs
Added handling for StoreFastStoreFast { var_nums } bytecode: decodes two variable indices, pops two stack values, and stores each to corresponding local variables.
VM frame – Generator initialization and StoreFastStoreFast
crates/vm/src/frame.rs
Generator/coroutine frames now initialize prev_line to code object's first_line_number (not 0) to prevent spurious LINE trace events from preamble bytecode. Updated StoreFastStoreFast to use pop_value_opt() for both values, allowing NULL values from LoadFastAndClear to propagate correctly.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • ShaharNaveh

🐰 Generators now know they're generators, decorators dance with cleaner stacks,
Classes collect their attributes with care, constants fold in clever ways,
And dicts grow gracefully—no excess burden, just elegant compilation!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Bytecode parity' is vague and generic, using a non-descriptive term that doesn't clearly convey what specific changes were made, despite the PR implementing multiple significant compiler optimizations and bytecode changes. Consider using a more specific title that highlights the main changes, such as 'Compiler optimizations: decorator calls, static attributes, and constant folding' or 'Improve bytecode generation with decorator call optimization and constant folding'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Mar 23, 2026

Copy link
Copy Markdown
Contributor

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

(module 'compile test_peepholer test_sys_settrace' not found)

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

 Compiler changes:
    - Remove PUSH_NULL from decorator cal
ls, use CALL 0
    - Collect __static_attributes__ from self.xxx = patterns
    - Sort __static_attributes__ alphabetically
    - Move __classdict__ init before __doc__ in class prologue
    - Fold unary negative constants
    - Fold constant list/set literals (3+ elements)
    - Use BUILD_MAP 0 + MAP_ADD for 16+ dict pairs
    - Always run peephole optimizer for s
uperinstructions
    - Emit RETURN_GENERATOR for generator
 functions
    - Add is_generator flag to SymbolTabl
e
- Replace irrefutable if-let with let for ExceptHandler
- Remove folded UNARY_NEGATIVE instead of replacing with NOP,
  enabling chained negation folding
- Initialize prev_line to def line for generators/coroutines
  to suppress spurious LINE events from preamble instructions
- Remove expectedFailure markers for now-passing tests
- Add StoreFastStoreFast handling in JIT instructions
- Fix cargo fmt in frame.rs
- Remove 11 expectedFailure markers for async jump tests in
  test_sys_settrace that now pass
Using remove() shifts instruction indices and corrupts subsequent
references, causing "pop stackref but null found" panics at runtime.
Replace folded/combined instructions with NOP instead, which are
cleaned up by the existing remove_nops pass.
@youknowone youknowone force-pushed the bytecode-parity-phase1 branch from 6adb4fb to 9f952c7 Compare March 25, 2026 02:58
NOP replacement broke chaining of peephole optimizations (e.g.
LOAD_CONST+TO_BOOL then LOAD_CONST+UNARY_NOT for 'not True').
The remove() approach is used by upstream and works correctly here;
fold_unary_negative keeps NOP replacement since it doesn't need chaining.
StoreFast uses pop_value_opt() to allow NULL values from
LoadFastAndClear in inlined comprehension cleanup paths.
StoreFastStoreFast must do the same, otherwise the peephole
optimizer's fusion of two StoreFast instructions panics when
restoring unbound locals after an inlined comprehension.
@youknowone youknowone marked this pull request as ready for review March 25, 2026 06:48

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/codegen/src/symboltable.rs (1)

1830-1840: ⚠️ Potential issue | 🟠 Major

<genexpr> never sets the new generator flag.

The new flag is only written in the Yield/YieldFrom arms. Generator expressions take the scan_comprehension(..., true) path instead, and that is_generator parameter is never copied onto the comprehension scope, so any downstream consumer of SymbolTable::is_generator will still see <genexpr> as a plain comprehension.

Possible fix
 fn scan_comprehension(
     &mut self,
     scope_name: &str,
     elt1: &ast::Expr,
     elt2: Option<&ast::Expr>,
     generators: &[ast::Comprehension],
     range: TextRange,
     is_generator: bool,
 ) -> SymbolTableResult {
     // Comprehensions are compiled as functions, so create a scope for them:
     self.enter_scope(
         scope_name,
         CompilerScope::Comprehension,
         self.line_index_start(range),
     );
+    self.tables.last_mut().unwrap().is_generator = is_generator;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/symboltable.rs` around lines 1830 - 1840, Generator
expressions created via scan_comprehension(..., true) never set the
SymbolTable::is_generator flag; update the code that creates/completes a
comprehension (the scan_comprehension path for genexpr) so that when its
"is_generator" parameter is true it writes to the current scope's flag (e.g.,
call tables.last_mut().unwrap().is_generator = true inside scan_comprehension or
copy the boolean into the new comprehension scope), mirroring how the
Expr::Yield/Expr::YieldFrom arms set is_generator in scan_expression.
🧹 Nitpick comments (1)
crates/codegen/src/compile.rs (1)

2758-2769: Update the stale decorator stack comment.

Line 2759 still documents prepare_decorators as leaving NULL sentinels on the stack, but this path now relies on a plain [dec1, dec2, func] layout plus CALL 0. The code looks fine; the comment is now the misleading part.

As per coding guidelines, "Do not delete or rewrite existing comments unless they are factually wrong or directly contradict the new code".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/codegen/src/compile.rs` around lines 2758 - 2769, The comment for
prepare_decorators is stale: it mentions leaving NULL sentinels on the stack
(e.g. [dec1, NULL, dec2, NULL]) but the current implementation of
prepare_decorators and apply_decorators uses a plain [dec1, dec2, func] layout
with CALL 0; update the doc comment above prepare_decorators (and the related
example in apply_decorators) to describe the actual stack shape and
transformation (e.g. push decorators in source order to produce [dec1, dec2,
func] and then CALL 0 repeatedly to produce dec1(dec2(func))). Ensure you
reference the functions prepare_decorators and apply_decorators so the updated
comment is next to the correct code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 1203-1206: The async-generator check is using CodeFlags::GENERATOR
in compile_statement which runs before mark_generator(), so async defs that are
generators slip through; when you compute is_gen (using scope_type ==
CompilerScope::AsyncFunction || self.current_symbol_table().is_generator)
propagate that generator classification into the code flags immediately (or
alternatively change the later check in compile_statement to consult
self.current_symbol_table().is_generator instead of CodeFlags::GENERATOR).
Concretely, set the generator bit on the current CodeFlags/state when is_gen is
true (or update the Line 2498 check to use current_symbol_table().is_generator)
so the async return-value guard sees the generator status before
mark_generator() is called.
- Around line 4512-4597: The collector misses methods with a pos-only first
parameter and doesn't descend into match statements; update
collect_static_attributes to pick the first param by checking
params.posonlyargs.first() before params.args.first() (or otherwise consider
both lists) when computing self_name in collect_static_attributes, and extend
scan_store_attrs to handle the match statement AST node (iterate each match
case/body and scan them) so assignments inside match arms are discovered; refer
to the functions collect_static_attributes and scan_store_attrs to locate where
to change the parameter selection and add the new match-arm recursion.

In `@crates/codegen/src/ir.rs`:
- Around line 192-196: The unary-negative folding pass (fold_unary_negative)
currently leaves a NOP after consuming UNARY_NEGATIVE which prevents
fold_tuple_constants/fold_list_constants/fold_set_constants from recognizing
adjacent LOAD_CONSTs; change fold_unary_negative to emit a real LOAD_CONST with
the negated value (i.e., replace the original LOAD_CONST + UNARY_NEGATIVE
sequence with a single LOAD_CONST of the negative literal) instead of leaving a
NOP so later BUILD_* collection folds will see contiguous LOAD_CONSTs; apply the
same fix where the same pattern appears around the other occurrences noted
(lines ~671-677) so all collection-folding passes work correctly.

In `@crates/vm/src/frame.rs`:
- Around line 708-718: The code incorrectly seeds prev_line with
code.first_line_number when code.flags contains CodeFlags::GENERATOR or
CodeFlags::COROUTINE, which suppresses the first real LINE event for single-line
generator/coroutine bodies; instead, remove the priming of prev_line and
explicitly suppress the preamble opcodes (e.g., RETURN_GENERATOR and POP_TOP)
when emitting LINE events by checking the opcode during event emission rather
than setting prev_line based on code.first_line_number; update the
initialization around prev_line and the LINE-event emission logic to skip those
specific preamble opcodes while leaving prev_line at 0 so real first-line events
are not lost.

---

Outside diff comments:
In `@crates/codegen/src/symboltable.rs`:
- Around line 1830-1840: Generator expressions created via
scan_comprehension(..., true) never set the SymbolTable::is_generator flag;
update the code that creates/completes a comprehension (the scan_comprehension
path for genexpr) so that when its "is_generator" parameter is true it writes to
the current scope's flag (e.g., call tables.last_mut().unwrap().is_generator =
true inside scan_comprehension or copy the boolean into the new comprehension
scope), mirroring how the Expr::Yield/Expr::YieldFrom arms set is_generator in
scan_expression.

---

Nitpick comments:
In `@crates/codegen/src/compile.rs`:
- Around line 2758-2769: The comment for prepare_decorators is stale: it
mentions leaving NULL sentinels on the stack (e.g. [dec1, NULL, dec2, NULL]) but
the current implementation of prepare_decorators and apply_decorators uses a
plain [dec1, dec2, func] layout with CALL 0; update the doc comment above
prepare_decorators (and the related example in apply_decorators) to describe the
actual stack shape and transformation (e.g. push decorators in source order to
produce [dec1, dec2, func] and then CALL 0 repeatedly to produce
dec1(dec2(func))). Ensure you reference the functions prepare_decorators and
apply_decorators so the updated comment is next to the correct code.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: ea6cba1a-c905-4bf2-831a-5edbac896928

📥 Commits

Reviewing files that changed from the base of the PR and between 82432be and 289172a.

⛔ Files ignored due to path filters (7)
  • Lib/test/test_compile.py is excluded by !Lib/**
  • Lib/test/test_peepholer.py is excluded by !Lib/**
  • Lib/test/test_sys_settrace.py is excluded by !Lib/**
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ands.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_mixed.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__if_ors.snap is excluded by !**/*.snap
  • crates/codegen/src/snapshots/rustpython_codegen__compile__tests__nested_double_async_with.snap is excluded by !**/*.snap
📒 Files selected for processing (5)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • crates/codegen/src/symboltable.rs
  • crates/jit/src/instructions.rs
  • crates/vm/src/frame.rs

Hide details View details @youknowone youknowone merged commit ea5a6cd into RustPython:main Mar 25, 2026
19 checks passed
@youknowone youknowone deleted the bytecode-parity-phase1 branch March 25, 2026 07:10
youknowone added a commit to youknowone/RustPython that referenced this pull request Mar 25, 2026
- Set is_generator flag for generator expressions in scan_comprehension
- Fix posonlyargs priority in collect_static_attributes first param
- Add match statement support to scan_store_attrs
- Fix stale decorator stack comment
- Reorder NOP removal after fold_unary_negative for better collection folding
youknowone added a commit to youknowone/RustPython that referenced this pull request Mar 25, 2026
- Set is_generator flag for generator expressions in scan_comprehension
- Fix posonlyargs priority in collect_static_attributes first param
- Add match statement support to scan_store_attrs
- Fix stale decorator stack comment
- Reorder NOP removal after fold_unary_negative for better collection folding
youknowone added a commit to youknowone/RustPython that referenced this pull request Mar 25, 2026
- Set is_generator flag for generator expressions in scan_comprehension
- Fix posonlyargs priority in collect_static_attributes first param
- Add match statement support to scan_store_attrs
- Fix stale decorator stack comment
- Reorder NOP removal after fold_unary_negative for better collection folding
@coderabbitai coderabbitai Bot mentioned this pull request Mar 25, 2026
youknowone added a commit that referenced this pull request Mar 25, 2026
* Match CPython LOAD_SPECIAL stack semantics for with/async-with

LOAD_SPECIAL now pushes (callable, self_or_null) matching CPython's
CALL convention, instead of a single bound method:
- Function descriptors: push (func, self)
- Plain attributes: push (bound, NULL)

Updated all with-statement paths:
- Entry: add SWAP 3 after SWAP 2, remove PUSH_NULL before CALL 0
- Normal exit: remove PUSH_NULL before CALL 3
- Exception handler (WITH_EXCEPT_START): read exit_func at TOS-4
  and self_or_null at TOS-3
- Suppress block: 3 POP_TOPs after POP_EXCEPT (was 2)
- FBlock exit (preserve_tos): SWAP 3 + SWAP 2 rotation
- UnwindAction::With: remove PUSH_NULL

Stack effects updated: LoadSpecial (2,1), WithExceptStart (7,6)

* Normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT

Add LOAD_FAST_CHECK → LOAD_FAST and JUMP_BACKWARD_NO_INTERRUPT →
JUMP_BACKWARD to opname normalization in dis_dump.py. These are
optimization variants with identical semantics.

* Add EXTENDED_ARG to SKIP_OPS, normalize LOAD_FAST_CHECK and JUMP_BACKWARD_NO_INTERRUPT

* Remove duplicate return-None when block already has return

Skip duplicate_end_returns for blocks that already end with
LOAD_CONST + RETURN_VALUE. Run DCE + unreachable elimination
after duplication to remove the now-unreachable original return
block.

* Improve __static_attributes__ collection accuracy

- Support tuple/list unpacking targets: (self.x, self.y) = val
- Skip @staticmethod and @classmethod decorated methods
- Use scan_target_for_attrs helper for recursive target scanning

* Use method mode for function-local import attribute calls

Function-local imports (scope is Local+IMPORTED) should use method
mode LOAD_ATTR like regular names, not plain mode. Only module/class
scope imports use plain LOAD_ATTR + PUSH_NULL.

* Optimize constant iterable before GET_ITER to LOAD_CONST tuple

Convert BUILD_LIST/SET 0 + LOAD_CONST + LIST_EXTEND/SET_UPDATE + GET_ITER
to just LOAD_CONST (tuple) + GET_ITER, matching CPython's optimization
for constant list/set literals in for-loop iterables.

Also fix is_name_imported to use method mode for function-local imports,
and improve __static_attributes__ accuracy (skip @classmethod/@staticmethod,
handle tuple/list unpacking targets).

* Fix cell variable ordering: parameters first, then alphabetical

CPython orders cell variables with parameter cells first (in
parameter definition order), then non-parameter cells sorted
alphabetically. Previously all cells were sorted alphabetically.

Also add for-loop iterable optimization: constant BUILD_LIST/SET
before GET_ITER is folded to just LOAD_CONST tuple.

* Emit COPY_FREE_VARS before MAKE_CELL matching CPython order

CPython emits COPY_FREE_VARS first, then MAKE_CELL instructions.
Previously RustPython emitted them in reverse order.

* Fix RESUME AfterYield encoding to match CPython 3.14 (value 5)

CPython 3.14 uses RESUME arg=5 for after-yield, not 1.
Also reorder COPY_FREE_VARS before MAKE_CELL and fix cell
variable ordering (parameters first, then alphabetical).

* Address code review feedback from #7481

- Set is_generator flag for generator expressions in scan_comprehension
- Fix posonlyargs priority in collect_static_attributes first param
- Add match statement support to scan_store_attrs
- Fix stale decorator stack comment
- Reorder NOP removal after fold_unary_negative for better collection folding

* Fold constant list/set/tuple literals in compiler

When all elements of a list/set/tuple literal are constants and
there are 3+ elements, fold them into a single constant:
- list: BUILD_LIST 0 + LOAD_CONST (tuple) + LIST_EXTEND 1
- set:  BUILD_SET 0  + LOAD_CONST (tuple) + SET_UPDATE 1
- tuple: LOAD_CONST (tuple)

This matches CPython's compiler optimization and fixes the most
common bytecode difference (92/200 sampled files).

Also add bytecode comparison scripts (dis_dump.py, compare_bytecode.py)
for systematic parity tracking.

* Use BUILD_MAP 0 + MAP_ADD for large dicts (>= 16 pairs)

Match CPython's compiler behavior: dicts with 16+ key-value pairs
use BUILD_MAP 0 followed by MAP_ADD for each pair, instead of
pushing all keys/values on the stack and calling BUILD_MAP N.

* Fix clippy warnings and cargo fmt

* fix surrogate
@coderabbitai coderabbitai Bot mentioned this pull request Mar 27, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant