◐ Shell
reader mode source ↗
Skip to content

Bytecode parity - direct loop backedges#7578

Merged
youknowone merged 1 commit into
RustPython:mainfrom
youknowone:bytecode-parity-jump
Apr 10, 2026
Merged

Bytecode parity - direct loop backedges#7578
youknowone merged 1 commit into
RustPython:mainfrom
youknowone:bytecode-parity-jump

Conversation

@youknowone

@youknowone youknowone commented Apr 10, 2026

Copy link
Copy Markdown
Member

Summary by CodeRabbit

  • Bug Fixes

    • Fixed loop return stack cleanup behavior to correctly align with CPython's iterator-slot management.
  • Chores

    • Enhanced compiler bytecode generation pipeline with improved jump threading and block reordering optimizations.
    • Updated bytecode inspection tools to use raw bytecode representation for accurate analysis.

@coderabbitai

coderabbitai Bot commented Apr 10, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

The pull request modifies the RustPython compiler's intermediate representation and bytecode generation. Changes include: refactoring loop return stack cleanup in compile.rs, extending the CFG transformation pipeline in ir.rs with new jump threading and block reordering utilities, and updating dis_dump.py to disable instruction normalization for raw bytecode parity.

Changes

Cohort / File(s) Summary
Compiler loop and control flow
crates/codegen/src/compile.rs
Updated Compiler::unwind_fblock to emit Swap { i: 2 } followed by PopTop for ForLoop returns when preserve_tos is set, instead of PopIter, to match CPython's iterator-slot cleanup. Added two bytecode-structure tests validating jump threading around STORE_SUBSCR and loop return control-flow ordering.
CFG transformation and jump threading pipeline
crates/codegen/src/ir.rs
Added set_to_nop helper for consistent NOP conversion. Extended jump threading into jump_threading_unconditional variant and refactored original into jump_threading_impl. Introduced new CFG pipeline stage with reorder_conditional_exit_and_jump_blocks and reorder_conditional_jump_and_exit_blocks predicates. Added duplicate_jump_targets_without_lineno to improve line-number resolution around jump-only blocks.
Bytecode instruction normalization
scripts/dis_dump.py
Disabled instruction normalization and superinstruction decomposition by clearing SKIP_OPS and _OPNAME_NORMALIZE, and removing _SUPER_DECOMPOSE logic. Updated jump-target mapping to operate on raw (un-normalized, un-decomposed) bytecode indices.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Bytecode parity - exception #7557: Modifies scripts/dis_dump.py instruction normalization and jump-target index mapping to use raw bytecode stream indices instead of decomposed/normalized indices.

Poem

🐰 The loop returns now reorder clean,
Jump threads race through bytecode stream,
Stack swaps dance where cleanups gleam,
Blocks reorder, CFG supreme! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Bytecode parity - direct loop backedges' directly relates to the main changes in the pull request, which focus on fixing loop backedge bytecode generation and adding jump threading for unconditional jumps to achieve bytecode parity with CPython.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone force-pushed the bytecode-parity-jump branch from 35f5642 to 34a5afb Compare April 10, 2026 08:29
@youknowone youknowone force-pushed the bytecode-parity-jump branch from 34a5afb to e372959 Compare April 10, 2026 09:03
@youknowone youknowone marked this pull request as ready for review April 10, 2026 09:36

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/codegen/src/compile.rs`:
- Around line 11158-11163: Rename the misspelled identifier kwonlydefaults in
the test fixture to a cspell-friendly name (e.g., kw_only_defaults) by updating
the parameter name in the function f signature and every use inside the function
body (references to kwonlydefaults and checks like "if kwonlydefaults and kwarg
in kwonlydefaults" should use the new name); ensure all other test fixtures or
assertions that reference this parameter are updated consistently (function f,
parameters kwonlyargs and arg2value, and any access/assignment to the defaults
mapping).

In `@crates/codegen/src/ir.rs`:
- Around line 2737-2763: The code inspects the target block directly
(blocks[target.idx()]) which can be an empty placeholder and miss trampoline
chains; instead, before checking the target's first non-NOP instruction, resolve
the real target by threading through empty blocks using next_nonempty_block
(i.e. call next_nonempty_block(blocks, target) and update target to that
returned BlockIdx, skipping if it becomes BlockIdx::NULL) so the subsequent
logic that computes target_jump/target_ins operates on the first non-empty
block; retain existing checks for include_conditional, is_conditional_jump, and
other conditions.
- Around line 3551-3562: When cloning the jump-only block you adjust
predecessors for the trampoline itself but forget to increment the predecessor
count for the trampoline's downstream target (old_next), causing stale fan-in;
after you set new_block.next = old_next, push the new predecessor for the cloned
block (predecessors.push(1)), also increment predecessors[old_next.idx()] by one
(with a bounds/existence check if necessary) so the successor's incoming edge
count reflects the new clone — reference symbols: new_block, old_next, new_idx,
predecessors, target, blocks, propagate_locations_in_block.

In `@scripts/dis_dump.py`:
- Around line 182-193: The current loop that builds filtered and offset_to_idx
(iterating over raw and using normalized_idx with _SUPER_DECOMPOSE and
_OPNAME_NORMALIZE) produces a logical-instruction index map, not a
raw-bytecode-slot map; update the loop that builds filtered/offset_to_idx to
account for cache/prefix slots by inspecting each instruction's cache_info (or
otherwise using a low-level bytecode API) so offset_to_idx maps actual raw
bytecode slots used by compare_bytecode.py, e.g., when inst.cache_info indicates
extra cache entries increment normalized_idx and reserve corresponding slot
indices before mapping inst.offset; modify the code near the
filtered/offset_to_idx construction (the loop over raw, SKIP_OPS handling, and
uses of _SUPER_DECOMPOSE/_OPNAME_NORMALIZE) to synthesize those cache/prefix
slots or switch to a bytecode-level API.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 273c6c67-97cb-475e-9240-bd988f06cbe8

📥 Commits

Reviewing files that changed from the base of the PR and between a49ce5b and e372959.

📒 Files selected for processing (3)
  • crates/codegen/src/compile.rs
  • crates/codegen/src/ir.rs
  • scripts/dis_dump.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant