Implement more ast features#6986
Conversation
📝 WalkthroughWalkthroughThis PR refactors the AST system to add comprehensive validation and representation helpers, changes how builtins are stored from PyDictRef to PyObjectRef throughout the frame and function system, expands documentation entries, and enhances string handling for f-strings and template strings with normalization and escape-sequence warnings. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant VM
participant Frame
participant Module
participant Builtins
Client->>VM: run_code_obj(code_obj, scope)
VM->>Module: resolve builtins from __builtins__
alt Module has PyModule __builtins__
Module->>Builtins: dict()
else Module has direct builtins
Module->>Builtins: use as-is
else No __builtins__
VM->>Builtins: use vm.builtins.dict()
end
VM->>Frame: new(builtins: PyObjectRef)
Frame->>Client: Frame ready (builtins: PyObjectRef)
Client->>Frame: load_global_or_builtin(name)
alt builtins is dict
Frame->>Builtins: fast path dict lookup
else builtins is non-dict
Frame->>Builtins: generic path getattr
end
Builtins->>Client: value or error
sequenceDiagram
participant Compiler
participant Parser
participant Validator
participant AST_Module
Compiler->>Parser: parse(source, mode, feature_version)
Parser->>Parser: apply target_version defaults
Parser->>AST_Module: emit AST nodes
alt mode == "single"
Validator->>AST_Module: wrap_interactive(Module → Interactive)
end
Validator->>Validator: validate_mod(module)
Validator->>Validator: check context consistency, body non-empty, patterns
alt validation fails
Validator->>Compiler: PyResult error
else validation succeeds
Validator->>Compiler: validated AST
end
Compiler->>Compiler: compile to code object
Estimated code review effort🎯 4 (Complex) | ⏱️ ~70 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Sorry, something went wrong.
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin ast |
Sorry, something went wrong.
395cffb to
99862cd
Compare
February 4, 2026 14:45
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] lib: cpython/Lib/ast.py dependencies:
dependent tests: (48 tests)
[x] test: cpython/Lib/test/test_builtin.py (TODO: 25) dependencies: dependent tests: (no tests depend on builtin) [ ] test: cpython/Lib/test/test_funcattrs.py (TODO: 5) dependencies: dependent tests: (no tests depend on funcattrs) [ ] lib: cpython/Lib/traceback.py dependencies:
dependent tests: (32 tests)
Legend:
|
Sorry, something went wrong.
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@crates/vm/src/frame.rs`:
- Around line 1209-1234: The LOAD_BUILD_CLASS handler currently assumes
__builtins__ is a mapping and calls get_item, which causes an uncaught TypeError
when __builtins__ is a module; change the else branch of the
Instruction::LoadBuildClass handling (the branch after
self.builtins.downcast_ref::<PyDict>()) to detect module-style builtins
(downcast_ref::<PyModule>() or equivalent) and call get_attr(identifier!(vm,
__build_class__), vm) for modules, translating AttributeError into a NameError
the same way you handle KeyError for mappings; keep the existing get_item path
for dict/mapping objects. Apply the same fix to the slow-path global/builtin
name lookup routine (the name-lookup code referenced in the review) so it uses
get_attr for module builtins and preserves existing error translation logic for
mapping builtins.
In `@crates/vm/src/stdlib/ast/string.rs`:
- Around line 50-134: The merge logic currently keeps the first constant's
TextRange when collapsing adjacent string literals, so location info becomes the
first-span-only; update normalize_joined_str_parts and
normalize_template_str_parts so that when you append a new constant (in the
branches handling JoinedStrPart::Constant and TemplateStrPart::Constant) you
also extend the pending range to cover both spans (e.g., set pending.2 =
TextRange::new(pending.start(), constant.range.end()) or use the project's
helper like TextRange::cover/join to set the pending range to span from the
original pending start to constant.range.end), and ensure
push_joined_str_literal and push_template_str_literal use that extended range
when building Constant::new_str; apply the identical change for both joined and
template paths so the merged literal range extends to the last fragment.
In `@crates/vm/src/stdlib/ast/validate.rs`:
- Around line 349-354: The NamedExpr handler only checks that named.target is an
ast::Expr::Name but doesn't validate it in Store context; after the existing
type check for ast::Expr::Named (and the matches!(&*named.target,
ast::Expr::Name(_)) check), call validate_expr(vm, &named.target,
ast::ExprContext::Store) to validate the target as an assignment target and
propagate any error, then validate the value with validate_expr(vm,
&named.value, ast::ExprContext::Load) as already done; keep the existing type
error message if the target isn't a Name.
🧹 Nitpick comments (3)
crates/vm/src/stdlib/ast.rs (2)
398-470: Consider handling edge cases in func_type parsing.The
parse_func_typefunction has good structure but a few observations:
Lines 405-406:
optimizeandtarget_versionare unused (explicitly ignored withlet _ =). This is intentional but worth noting.Line 437: When extracting
right, the offsetsplit_at + 2assumes->is always 2 bytes, which is correct for ASCII but the comment/documentation should clarify this.Lines 457-461: The match arm for other expression types (
other => vec![other]) is a catch-all that may accept invalid func_type syntax silently.💡 Optional: Add validation for unexpected expression types
let argtypes: Vec<ast::Expr> = match arg_expr { ast::Expr::Tuple(tup) => tup.elts, ast::Expr::Name(_) | ast::Expr::Subscript(_) | ast::Expr::Attribute(_) => vec![arg_expr], - other => vec![other], + // CPython accepts any expression as a type annotation + other => vec![other], };
472-503: Minor: redundant assignment on Line 487.The line
let tag = if tag.is_empty() { "" } else { tag };is a no-op sincetagalready equals itself in both branches.♻️ Simplify redundant conditional
let tag = rest.trim_start(); - let tag = if tag.is_empty() { "" } else { tag }; let node = NodeAstcrates/vm/src/stdlib/ast/python.rs (1)
433-513: Redundant setup inmodule_exec.The
module_execfunction (lines 440-477) repeats the same setup thatextend_classalready performs inNodeAst. This appears to be setting up theASTclass that's registered in the module, which may be a different type object than the one defined via#[pyattr].If this duplication is intentional (to handle both the pyattr-registered type and the module-registered type), consider adding a comment explaining why.
💡 Add clarifying comment
pub(crate) fn module_exec( vm: &VirtualMachine, module: &Py<crate::builtins::PyModule>, ) -> PyResult<()> { __module_exec(vm, module); super::super::pyast::extend_module_nodes(vm, module); + // The AST class registered in extend_module_nodes needs its attributes + // and methods configured separately from the #[pyattr] NodeAst type. let ast_type = module .get_attr("AST", vm)?
Sorry, something went wrong.
684e880
into
RustPython:main
Feb 5, 2026
Summary by CodeRabbit
New Features
Bug Fixes
Documentation