Bytecode parity - CFG reorders and LOAD_FAST_BORROW chain by youknowone · Pull Request #7870 · RustPython/RustPython
added a commit to youknowone/RustPython that referenced this pull request
When a `with` body contains a try/except/else whose handler ends with an open conditional (e.g. `except OSError: if not path.is_symlink(): raise`), the handler does not unconditionally exit the scope, so the synthetic body-exit NOP introduced by `preserves_finally_entry_nop` is spurious: control may still fall through to the with cleanup. Guard against this by AND-ing `!statements_end_with_open_conditional_fallthrough` into the handler scope-exit check.

Also add the test `test_with_try_except_else_open_conditional_handler_drops_body_exit_nop` to lock in the pattern, and add `fobj` to .cspell.dict/python-more.txt so the prek cspell hook on CI passes for PR RustPython#7870 (the word is used verbatim in the already-committed test_nested_with_except_same_line_cleanup_threads_trampoline).
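For illustration, the source shape described above looks roughly like the sketch below. Only the `except OSError: if not path.is_symlink(): raise` handler comes from the commit message; the function name `read_unless_symlink`, the `nullcontext()` manager, and the surrounding body are hypothetical.

```python
from contextlib import nullcontext


def read_unless_symlink(path):
    """Hypothetical function with the with/try/except/else shape in question."""
    with nullcontext():  # any context manager would do here
        try:
            data = path.read_bytes()
        except OSError:
            # Open conditional: when the test is false the handler finishes
            # normally, so control falls through to the with cleanup instead
            # of leaving the scope.
            if not path.is_symlink():
                raise
        else:
            return data
    return None
```

The handler leaves the scope only on the `raise` path, which is why it cannot be treated as an unconditional scope exit.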
youknowone changed the title from "Align LOAD_FAST_BORROW analysis with CPython chain shape" to "Bytecode parity - CFG reorders and LOAD_FAST_BORROW chain"
Three changes that bring optimize_load_fast_borrow closer to CPython's optimize_load_fast in flowgraph.c:

* ir.rs: split mark_cold into the CPython-style two passes. Phase 1 propagates "warm" from the entry block, phase 2 propagates "cold" from except_handler blocks. Blocks reached by neither phase keep cold=false and stay in their original b_next position, matching CPython's handling of empty placeholders left by remove_unreachable (e.g. the inner_end of a nested try/except whose incoming jumps were re-routed by optimize_cfg).
* ir.rs: in optimize_load_fast_borrow, push the fall-through successor only when the current block has a last instruction (is_some_and). Empty blocks now terminate fall-through propagation, matching the `term != NULL` check in optimize_load_fast.
* compile.rs: add switch_to_new_or_reuse_empty() helper and use it in compile_while. The helper reuses the current block when it is empty and unlinked, mirroring USE_LABEL absorption in cfg_builder_maybe_start_new_block. This stops a stray empty block from appearing between e.g. a try/except end_block and the following while loop header.

Four codegen tests that depended on the previous fall-through-through-empty behavior are marked #[ignore] with TODO comments.

Also includes a handful of dictionary entries in .cspell.dict picked up during the work.
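The two-phase marking in the first bullet can be sketched on a toy CFG. This is a simplified Python model, not the ir.rs code: the `Block` class, its fields, and the block names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    fallthrough: bool = True   # may control run into the next block in layout?
    jumps: list = field(default_factory=list)  # names of jump targets
    is_handler: bool = False   # stands in for the except_handler flag
    warm: bool = False
    cold: bool = False


def mark_cold(blocks):
    """Toy two-phase marking over blocks given in layout order."""
    index = {b.name: i for i, b in enumerate(blocks)}

    def successors(b):
        i = index[b.name]
        if b.fallthrough and i + 1 < len(blocks):
            yield blocks[i + 1]
        for name in b.jumps:
            yield blocks[index[name]]

    # Phase 1: everything reachable from the entry block is warm.
    seen, stack = {blocks[0].name}, [blocks[0]]
    while stack:
        b = stack.pop()
        b.warm = True
        for s in successors(b):
            if s.name not in seen:
                seen.add(s.name)
                stack.append(s)

    # Phase 2: propagate cold from handler blocks, never into warm blocks.
    stack = [b for b in blocks if b.is_handler and not b.warm]
    seen = {b.name for b in stack}
    while stack:
        b = stack.pop()
        b.cold = True
        for s in successors(b):
            if not s.warm and s.name not in seen:
                seen.add(s.name)
                stack.append(s)
    return blocks


cfg = mark_cold([
    Block("entry"),
    Block("body", fallthrough=False, jumps=["end"]),
    Block("handler", is_handler=True),
    Block("cleanup", fallthrough=False, jumps=["end"]),
    Block("placeholder"),      # empty leftover that nothing reaches
    Block("end", fallthrough=False),
])
state = {b.name: (b.warm, b.cold) for b in cfg}
```

In this model `placeholder` is reached by neither phase, so it ends up neither warm nor cold, which is the case the commit message says must stay in its original position.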
Mirror CPython's optimize_basic_block() (flowgraph.c) by walking each block once in instruction order and trying tuple, list, set, unary, and binop folding at each position before advancing. This replaces the previous global-pass sequence where every fold_unary_constants pattern in the whole CFG was registered before any tuple constant, leaving negated literals like `-1` at co_consts positions earlier than CPython produces (e.g. snippets.py: -1 at idx 280 vs CPython idx 726).

Changes:

- Extract `fold_unary_constant_at` and `fold_binop_constant_at` per-position helpers from the existing global passes; the global passes now call the helpers in a loop.
- Add `fold_constants_per_block` that walks each block to a fixed point, trying all five folds at each instruction position.
- Call the new walker before the legacy global passes in optimize_finalize so co_consts insertion order matches CPython's.

Measured on the full Lib tree: differing files 270 → 269; the only newly matching file is `test/test_ast/snippets.py`, the case raised in #28.
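Why the walk order changes co_consts order can be seen in a small model. The sketch below is an assumption-laden toy, not the RustPython folder: it uses made-up `CONST`, `NEG`, and `TUPLE` instructions, tracks only newly folded constants, and does a single walk rather than iterating to a fixed point.

```python
def fold(instrs, kinds):
    """Fold the enabled kinds in one left-to-right walk.

    Returns the rewritten instructions and the new constants in the order
    they were registered.
    """
    out, new_consts = [], []
    for ins in instrs:
        op = ins[0]
        if (op == "NEG" and "NEG" in kinds and out
                and out[-1][0] == "CONST" and isinstance(out[-1][1], int)):
            value = -out.pop()[1]
        elif (op == "TUPLE" and "TUPLE" in kinds and 0 < ins[1] <= len(out)
                and all(p[0] == "CONST" for p in out[-ins[1]:])):
            n = ins[1]
            value = tuple(p[1] for p in out[-n:])
            del out[-n:]
        else:
            out.append(ins)
            continue
        new_consts.append(value)
        out.append(("CONST", value))
    return out, new_consts


def global_passes(instrs):
    # Old shape: every unary fold is registered before any tuple fold.
    out, unary = fold(instrs, {"NEG"})
    out, tuples = fold(out, {"TUPLE"})
    return out, unary + tuples


def per_position(instrs):
    # New shape: try every fold at each position before advancing.
    return fold(instrs, {"NEG", "TUPLE"})


# Roughly `(1, 2)` followed by `-3` in source order.
program = [("CONST", 1), ("CONST", 2), ("TUPLE", 2), ("CONST", 3), ("NEG",)]
old_code, old_consts = global_passes(program)
new_code, new_consts = per_position(program)
```

Both strategies produce the same folded instructions here; only the registration order of the new constants differs, with the negated literal landing first under the global passes.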
`inline_small_fast_return_blocks` previously appended the target `LOAD_FAST(_BORROW)/RETURN_VALUE` block's instructions onto any predecessor whose fall-through eventually reached it, in addition to the unconditional-jump case CPython handles in `inline_small_or_no_lineno_blocks` (flowgraph.c:1210). CPython only inlines through unconditional jumps, leaving fall-through predecessors to reach the shared return block via the natural CFG layout. The extra fall-through branch duplicated the return tail (e.g. `if/elif/return` emitted two adjacent `LOAD_FAST_BORROW x; RETURN_VALUE` sequences).

Remove the fall-through inlining branch and keep only the unconditional-jump path.

Measured on the full Lib tree: differing files 270 → 239 (-31), no new regressions. Files newly matching include copy.py, argparse.py, dataclasses.py, logging/__init__.py, pathlib/__init__.py, etc.
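The duplication can be reproduced in a toy layout. The sketch below is a hypothetical Python model of the two behaviors, assuming blocks are `(name, instructions)` pairs in layout order and that `JUMP` is the only unconditional jump; none of these names come from the RustPython sources.

```python
def is_small_return(instrs):
    return (len(instrs) == 2
            and instrs[0][0] in ("LOAD_FAST", "LOAD_FAST_BORROW")
            and instrs[1] == ("RETURN_VALUE",))


def inline_small_return_blocks(blocks, inline_fallthrough=False):
    """Copy small return blocks into predecessors; returns a new block list."""
    by_name = dict(blocks)
    result = []
    for i, (name, instrs) in enumerate(blocks):
        instrs = list(instrs)
        last = instrs[-1] if instrs else None
        if last and last[0] == "JUMP" and is_small_return(by_name[last[1]]):
            # Unconditional jump: replace the jump with the return tail.
            instrs[-1:] = by_name[last[1]]
        elif (inline_fallthrough and i + 1 < len(blocks)
                and (last is None
                     or ("JUMP" not in last[0] and last[0] != "RETURN_VALUE"))
                and is_small_return(blocks[i + 1][1])):
            # Removed behavior: also copy the tail into a fall-through predecessor.
            instrs += blocks[i + 1][1]
        result.append((name, instrs))
    return result


def count_returns(blocks):
    return sum(ins == ("RETURN_VALUE",) for _, body in blocks for ins in body)


layout = [
    ("then", [("LOAD_CONST", 0), ("STORE_FAST", "x"), ("JUMP", "ret")]),
    ("elif", [("LOAD_CONST", 1), ("STORE_FAST", "x")]),
    ("ret", [("LOAD_FAST_BORROW", "x"), ("RETURN_VALUE",)]),
]
jump_only = inline_small_return_blocks(layout)
with_fallthrough = inline_small_return_blocks(layout, inline_fallthrough=True)
```

With `inline_fallthrough=True` the `elif` block gains its own copy of the tail directly in front of the shared `ret` block, which is the pair of adjacent return sequences the commit message describes.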
`reorder_conditional_scope_exit_and_jump_back_blocks` previously skipped any reorder where the conditional, scope-exit, or jump-back block had an `except_handler` attached, even when all three shared the same handler. CPython reorders these regardless of try/except context, as long as the blocks stay within the same protected region. The over-conservative guard left patterns like `try: for: if cond: return` with the loop body's scope-exit ahead of the backedge, while CPython places the backedge first and inverts the conditional.

Replace the `block_is_protected` triple-check with a single `mismatched_protection` test: skip only when the three blocks do not share the same `except_handler`. Same-handler reorders preserve the protected range because every instruction's `except_handler` field stays attached as `.next` pointers shift.

Measured on the full Lib tree: differing files 239 → 237; no new regressions.
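The difference between the two guards is easy to state as predicates. The following is a sketch only, assuming each block's handler is represented by a label or `None`; the function `old_guard_skips` is a hypothetical name for the previous triple-check, and neither function is the Rust implementation.

```python
def old_guard_skips(cond_handler, exit_handler, back_handler):
    # Previous behavior: skip the reorder if any block is protected at all.
    return any(h is not None
               for h in (cond_handler, exit_handler, back_handler))


def mismatched_protection(cond_handler, exit_handler, back_handler):
    # New behavior: skip only when the three blocks disagree on their handler.
    return not (cond_handler == exit_handler == back_handler)


# All three blocks inside the same try body, as in `try: for: if cond: return`.
same_try = ("handler_a", "handler_a", "handler_a")
# The scope-exit block sits outside the try that protects the other two.
mixed = ("handler_a", None, "handler_a")
```

For `same_try` the old guard skips the reorder while the new test allows it; for `mixed` both skip.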
reorder_jump_over_exception_cleanup_blocks was swapping a small scope-exit target with a preceding cold cleanup chain even when the target block began a fresh try (SETUP_FINALLY/SETUP_CLEANUP/SETUP_WITH). The swap moved the next try's setup ahead of the prior handler's cleanup_end/next_handler/cleanup_block, making the cleanup_body's JUMP_FORWARD fall through directly to the cleanup_end and get elided as redundant. The bytecode then lacked the JUMP_FORWARD that skips the cleanup blocks and matched the prior handler's borrow tail incorrectly.

Skip the reorder when the target block contains any block-push pseudo op so a new try's setup stays in source order.

Re-enables the four named/typed except-cleanup borrow tests that were marked #[ignore] in commit 7481459.
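The new skip condition amounts to a scan of the target block for block-push pseudo ops. A minimal sketch, assuming instructions are tuples whose first element is the opcode name; the helper name `may_swap_with_cleanup_chain` is hypothetical and the real check lives in the Rust code.

```python
# The three block-push pseudo ops named in the commit message.
BLOCK_PUSH_OPS = {"SETUP_FINALLY", "SETUP_CLEANUP", "SETUP_WITH"}


def may_swap_with_cleanup_chain(target_instrs):
    """Allow the reorder only if the target block does not start a new try."""
    return not any(op in BLOCK_PUSH_OPS for op, *_ in target_instrs)


plain_exit = [("LOAD_FAST_BORROW", "x"), ("RETURN_VALUE",)]
starts_try = [("SETUP_FINALLY", "handler"), ("LOAD_FAST", "x"), ("RETURN_VALUE",)]
```

Here `plain_exit` remains eligible for the swap, while `starts_try` is left in source order.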