◐ Shell
clean mode source ↗

Bytecode parity - CFG reorders and LOAD_FAST_BORROW chain by youknowone · Pull Request #7870 · RustPython/RustPython

added a commit to youknowone/RustPython that referenced this pull request

May 13, 2026
When a `with` body contains a try/except/else whose handler ends with
an open conditional (e.g. `except OSError: if not path.is_symlink(): raise`),
the handler does not unconditionally scope-exit and the synthetic
body-exit NOP introduced by `preserves_finally_entry_nop` is spurious
because control may still fall through to the with cleanup. Guard
against this by anding in `!statements_end_with_open_conditional_fallthrough`
when checking handler scope exits.

Also add the test
`test_with_try_except_else_open_conditional_handler_drops_body_exit_nop`
locking the pattern, and `fobj` to .cspell.dict/python-more.txt so the
prek cspell hook on CI clears for PR RustPython#7870 (used verbatim in the
already-committed test_nested_with_except_same_line_cleanup_threads_trampoline).

youknowone added a commit to youknowone/RustPython that referenced this pull request

May 14, 2026
When a `with` body contains a try/except/else whose handler ends with
an open conditional (e.g. `except OSError: if not path.is_symlink(): raise`),
the handler does not unconditionally scope-exit and the synthetic
body-exit NOP introduced by `preserves_finally_entry_nop` is spurious
because control may still fall through to the with cleanup. Guard
against this by anding in `!statements_end_with_open_conditional_fallthrough`
when checking handler scope exits.

Also add the test
`test_with_try_except_else_open_conditional_handler_drops_body_exit_nop`
locking the pattern, and `fobj` to .cspell.dict/python-more.txt so the
prek cspell hook on CI clears for PR RustPython#7870 (used verbatim in the
already-committed test_nested_with_except_same_line_cleanup_threads_trampoline).

@youknowone youknowone changed the title Align LOAD_FAST_BORROW analysis with CPython chain shape Bytecode parity - CFG reorders and LOAD_FAST_BORROW chain

May 16, 2026

coderabbitai[bot]

Three changes that bring optimize_load_fast_borrow closer to CPython's
optimize_load_fast in flowgraph.c:

* ir.rs: split mark_cold into the CPython-style two passes. Phase 1
  propagates "warm" from the entry block, phase 2 propagates "cold" from
  except_handler blocks. Blocks reached by neither phase keep cold=false
  and stay in their original b_next position, matching CPython's handling
  of empty placeholders left by remove_unreachable (e.g. the inner_end of
  a nested try/except whose incoming jumps were re-routed by optimize_cfg).

* ir.rs: in optimize_load_fast_borrow, push the fall-through successor
  only when the current block has a last instruction (is_some_and).
  Empty blocks now terminate fall-through propagation, matching the
  `term != NULL` check in optimize_load_fast.

* compile.rs: add switch_to_new_or_reuse_empty() helper and use it in
  compile_while. The helper reuses the current block when it is empty
  and unlinked, mirroring USE_LABEL absorption in
  cfg_builder_maybe_start_new_block. This stops a stray empty block
  from appearing between e.g. a try/except end_block and the following
  while loop header.

Four codegen tests that depended on the previous fall-through-through-
empty behavior are marked #[ignore] with TODO comments.

Also includes a handful of dictionary entries in .cspell.dict picked up
during the work.
Mirror CPython's optimize_basic_block() (flowgraph.c) by walking each
block once in instruction order and trying tuple, list, set, unary, and
binop folding at each position before advancing. This replaces the
previous global-pass sequence where every fold_unary_constants pattern
in the whole CFG was registered before any tuple constant, leaving
negated literals like `-1` at co_consts positions earlier than CPython
produces (e.g. snippets.py: -1 at idx 280 vs CPython idx 726).

Changes:
- Extract `fold_unary_constant_at` and `fold_binop_constant_at` per-
  position helpers from the existing global passes; the global passes
  now call the helpers in a loop.
- Add `fold_constants_per_block` that walks each block to a fixed point,
  trying all five folds at each instruction position.
- Call the new walker before the legacy global passes in
  optimize_finalize so co_consts insertion order matches CPython's.

Measured on the full Lib tree: differing files 270 → 269; the only
newly matching file is `test/test_ast/snippets.py`, the case raised in
#28.
`inline_small_fast_return_blocks` previously appended the target
`LOAD_FAST(_BORROW)/RETURN_VALUE` block's instructions onto any
predecessor whose fall-through eventually reached it, in addition to the
unconditional-jump case CPython handles in
`inline_small_or_no_lineno_blocks` (flowgraph.c:1210). CPython only
inlines through unconditional jumps, leaving fall-through predecessors
to reach the shared return block via the natural CFG layout. The extra
fall-through branch duplicated the return tail (e.g. `if/elif/return`
emitted two adjacent `LOAD_FAST_BORROW x; RETURN_VALUE` sequences).

Remove the fall-through inlining branch and keep only the
unconditional-jump path.

Measured on the full Lib tree: differing files 270 → 239 (-31), no new
regressions. Files newly matching include copy.py, argparse.py,
dataclasses.py, logging/__init__.py, pathlib/__init__.py, etc.
`reorder_conditional_scope_exit_and_jump_back_blocks` previously skipped
any reorder where the conditional, scope-exit, or jump-back block had
an `except_handler` attached, even when all three shared the same
handler. CPython reorders these regardless of try/except context, as
long as the blocks stay within the same protected region. The over-
conservative guard left patterns like `try: for: if cond: return` with
the loop body's scope-exit ahead of the backedge, while CPython places
the backedge first and inverts the conditional.

Replace the `block_is_protected` triple-check with a single
`mismatched_protection` test: skip only when the three blocks do not
share the same `except_handler`. Same-handler reorders preserve the
protected range because every instruction's `except_handler` field
stays attached as `.next` pointers shift.

Measured on the full Lib tree: differing files 239 → 237; no new
regressions.
reorder_jump_over_exception_cleanup_blocks was swapping a small
scope-exit target with a preceding cold cleanup chain even when the
target block began a fresh try (SETUP_FINALLY/SETUP_CLEANUP/SETUP_WITH).
The swap moved the next try's setup ahead of the prior handler's
cleanup_end/next_handler/cleanup_block, making the cleanup_body's
JUMP_FORWARD fall through directly to the cleanup_end and get elided as
redundant. The bytecode then lacked the JUMP_FORWARD that skips the
cleanup blocks and matched the prior handler's borrow tail incorrectly.

Skip the reorder when the target block contains any block-push pseudo
op so a new try's setup stays in source order. Re-enables the four
named/typed except-cleanup borrow tests that were marked #[ignore] in
commit 7481459.

This was referenced

May 20, 2026