◐ Shell
clean mode source ↗

Fix segfault on cyclic or deeply-nested AST in `compile()` by changjoon-park · Pull Request #7630 · RustPython/RustPython

Closes #4862.

Symptom

compile() with a cyclic or pathologically deep AST object (built via the ast module) overflows the native stack and crashes RustPython with SIGSEGV. CPython raises RecursionError: maximum recursion depth exceeded while traversing <kind> node.

Reproduction

import ast
d = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
d.operand = d  # self-reference -> infinite traversal
compile(ast.Expression(d), '<t>', 'eval')
Before After
RustPython SIGSEGV RecursionError: maximum recursion depth exceeded while traversing AST node
CPython 3.14 RecursionError: maximum recursion depth exceeded while traversing 'expr' node (unchanged)

Root cause

The ast_from_object conversion from a Python ast object tree to the internal ruff_python_ast types in crates/vm/src/stdlib/_ast/node.rs recursed unconditionally:

  • Box<T>::ast_from_object called T::ast_from_object once per descent (no guard)
  • Vec<T>::ast_from_object called Node::ast_from_object per element (no guard)

A cycle in the Python-level AST turns either call into unbounded recursion, which overflows the Rust native stack. Rust cannot recover from a stack overflow — the process is killed.

Fix

Wrap both recursive descents in vm.with_recursion(...), which is the existing machinery RustPython uses elsewhere to translate native recursion into a Python-level RecursionError at the configured limit (default 1000). No new infrastructure, no configuration changes.

// Box<T>::ast_from_object — single recursive descent
vm.with_recursion("while traversing AST node", || {
    T::ast_from_object(vm, source_file, object).map(Self::new)
})

// Vec<T>::ast_from_object — per-element descent
vm.extract_elements_with(&object, |obj| {
    vm.with_recursion("while traversing AST node", || {
        Node::ast_from_object(vm, source_file, obj)
    })
})

The "while traversing AST node" label matches CPython's RecursionError message style for this failure mode.

Why both Box<T> and Vec<T> must be guarded

All recursive AST fields use one of these two containers:

  • Box etc. — direct single-child descent (most binary/unary expressions, attribute access, subscripts, etc.)
  • Vec etc. — multi-child descent (list/tuple/set literals, dict values, match patterns, call args, etc.)

An earlier iteration of this patch only guarded Box<T>. Testing showed that List.elts = [self] and Tuple.elts = [self] still crashed because they traverse via Vec<T>. Guarding both closes every cycle path I could construct.

Verification

Direct-cycle cases (were segfaulting; now raise RecursionError)

  • UnaryOp.operand = self
  • BinOp.left = self
  • Call.func = self
  • Attribute.value = self
  • Subscript.value = self
  • IfExp.test = self
  • List.elts = [self]
  • Tuple.elts = [self]
  • Set.elts = [self]
  • Dict.values = [self]
  • ListComp.elt = self

Indirect cycles

  • A.operand = B; B.operand = A — caught on the same counter.

Deep non-cyclic input

Depth Behavior
500 compiles OK
900 compiles OK
999 compiles OK
1500 / 3000 / 5000 RecursionError (safe; no crash)

Regression

$ ./target/release/rustpython -m test test_ast test_compile
All 2 tests OK.
Result: SUCCESS

A new regression test in extra_tests/snippets/stdlib_ast.py exercises 6 representative cycle patterns across both Box and Vec descent paths; it passes on both RustPython (with this patch) and CPython 3.14.

Scope

  • Targeted at the ast_from_object path, which is the only way a cyclic tree can enter RustPython's compilation pipeline (parsed source cannot be cyclic; ruff parser produces trees, not DAGs).
  • ast_to_object (reverse direction) is only called on trees produced by ruff_python_parser, which does not emit cycles; no user-controlled path reaches it.
  • validate_mod runs after ast_from_object. Since ast_from_object now caps depth via the VM's recursion counter, any AST reaching validate_mod is already depth-bounded; local testing at depth 999 confirms validate_mod handles it without crashing.

Why this approach (vs. a dedicated counter in codegen)

The initial attempt placed a depth counter inside Compiler::compile_expression (crates/codegen/src/compile.rs). That turned out to be downstream of the crash site — the process is already dead before reaching codegen. The fix lives where the crash happens: the ast_from_object traversal. Using the VM's existing with_recursion keeps the mechanism consistent with how RustPython already converts native overflows to RecursionError elsewhere (e.g. during __repr__ recursion, __eq__ recursion).

Summary by CodeRabbit

  • Bug Fixes

    • Improved handling of cyclic and pathologically deep AST structures to prevent stack overflow and raise proper recursion errors instead.
  • Tests

    • Added comprehensive regression tests for cyclic AST nodes to ensure proper error handling across various node types.