◐ Shell
reader mode source ↗
Skip to content

Fix test_import: import machinery, circular imports, and script shadowing#7034

Merged
youknowone merged 2 commits into
RustPython:mainfrom
youknowone:test_import
Feb 8, 2026
Merged

Fix test_import: import machinery, circular imports, and script shadowing#7034
youknowone merged 2 commits into
RustPython:mainfrom
youknowone:test_import

Conversation

@youknowone

@youknowone youknowone commented Feb 7, 2026

Copy link
Copy Markdown
Member
  • Emit IMPORT_FROM instead of LOAD_ATTR for import a.b.c as m (compile.rs)
  • Add "partially initialized module" error in module getattr for circular imports
  • Add "cannot access submodule" error for initializing submodules
  • Implement "consider renaming" script shadowing detection (module.rs, frame.rs)
    • Detect when a user script shadows a stdlib or third-party module
    • Compute original sys.path[0] from sys.argv[0] (like CPython's config->sys_path_0)
    • Check sys.stdlib_module_names for stdlib module detection
    • Respect safe_path setting
  • Implement _imp._fix_co_filename to rewrite code object source_path in-place
  • Add data parameter support to _imp.get_frozen_object with marshal deserialization
  • Fix import_from error messages: check spec.has_location before using origin
  • Set ImportError.path attribute on import failures
  • Fix import_star error messages for non-str items in all and dict
  • Always call builtins.import (remove sys.modules cache bypass in import_inner)

Summary by CodeRabbit

  • New Features

    • Better support for importing dotted module paths with aliases.
    • Additional frozen-module aliases so more built-in modules are discoverable.
    • Import system improvements that make more encodings and submodules resolvable at startup.
  • Bug Fixes

    • Richer, more informative import error messages including module origin and stdlib shadowing hints.
    • Improved handling of partially initialized modules and circular-import diagnostics.
    • More robust handling of all during star imports and attribute export resolution.

@coderabbitai

coderabbitai Bot commented Feb 7, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Refactors import machinery and dotted-import codegen: adds importlib wiring and helpers for origin/spec, stdlib-shadow detection, richer ImportError/AttributeError messages, enhanced frozen-module aliasing and frozen-code handling, plus codegen changes to expand dotted imports with aliases.

Changes

Cohort / File(s) Summary
Import core & VM wiring
crates/vm/src/import.rs, crates/vm/src/vm/mod.rs, crates/vm/src/vm/thread.rs
Add import helpers (get_spec_file_origin, is_possibly_shadowing_path, is_stdlib_module_name, import_module_level, calc_package, etc.), add importlib field to VirtualMachine, propagate to thread VMs, and wire importlib into VM init and import flows.
Builtin import / stdlib import helpers
crates/vm/src/stdlib/builtins.rs, crates/vm/src/stdlib/imp.rs
Introduce typed ImportArgs for __import__ and call import_module_level directly; expand frozen-module lookup, add optional deserialization path for frozen code, and add code-filename updating helpers.
Error/messages & module attribute handling
crates/vm/src/frame.rs, crates/vm/src/builtins/module.rs
Compute module origin/spec, detect stdlib-shadowing and uninitialized submodules, attach path metadata to ImportError, and emit context-aware AttributeError/ImportError messages (including circular-import context).
Codegen dotted-import handling
crates/codegen/src/compile.rs
Change handling of dotted imports with aliases to emit per-segment ImportFrom and manage stack with Swap/PopTop so aliasing dotted names preserves evaluation order.
Frozen modules & interpreter aliasing
crates/vm/src/vm/interpreter.rs, crates/vm/src/vm/mod.rs
Add runtime frozen-module alias injections (e.g., __hello_alias__, __phello_alias__, __hello_only__), adjust encoding registration to use import machinery/sys.modules.
Library addition
crates/vm/Lib/python_builtins/__hello_only__.py
Add a small file that references the standard __hello_only__ implementation (acts as a pointer/alias).

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Caller
    participant Builtins as "__import__ (builtins)"
    participant Import as "import_module_level"
    participant Importlib as "vm.importlib"
    participant Sys as "sys.modules / _find_and_load"
    participant Frozen as "frozen lookup"

    Caller->>Builtins: __import__(name, globals, locals, fromlist, level)
    Builtins->>Import: import_module_level(name, globals, fromlist, level)
    Import->>Importlib: consult importlib / __spec__ (origin)
    Import->>Sys: check sys.modules for existing module
    alt found in sys.modules
        Sys-->>Import: module (may be initializing)
        Import->>Caller: return module or raise partial-init ImportError (with origin/path)
    else not found
        Import->>Frozen: check frozen modules & aliases
        Frozen-->>Import: frozen module or not found
        alt module found
            Import->>Importlib: load/execute module, set __spec__/__loader__
            Import->>Caller: return module
        else not found
            Import->>Caller: raise ImportError (with origin/path metadata)
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

skip:ci

Suggested reviewers

  • ShaharNaveh
  • arihant2math
  • fanninpm

Poem

🐰 I hopped through import paths, one by one,
Unwound dotted names till the job was done.
I chased each spec, traced origin and shade,
Tied aliases neat where tangled names were made.
Now modules greet me — the rabbit's work is fun! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly addresses the main changes: import machinery fixes, circular import handling, and script shadowing detection across multiple subsystems.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Feb 7, 2026

Copy link
Copy Markdown
Contributor

Code has been automatically formatted

The code in this PR has been formatted using:

  • cargo fmt --all
    Please pull the latest changes before pushing again:
git pull origin test_import

@youknowone youknowone force-pushed the test_import branch 10 times, most recently from 61d81a1 to 68aa709 Compare February 8, 2026 14:45
@youknowone youknowone marked this pull request as ready for review February 8, 2026 14:45

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/frame.rs (1)

386-393: ⚠️ Potential issue | 🟡 Minor

__dict__ fallback calls require_str which raises TypeError on non-string keys — CPython only issues a DeprecationWarning and skips.

In CPython's import_all_from, when iterating __dict__ (no __all__), non-string keys are skipped with a DeprecationWarning, not a hard TypeError. This code would raise where CPython would continue. If strict CPython compatibility is desired for test_import, consider filtering/skipping non-string keys instead.

🤖 Fix all issues with AI agents
In `@crates/vm/src/import.rs`:
- Around line 347-471: The has_from truthiness check in import_module_level
currently swallows errors from fl.clone().try_to_bool(vm) via .ok(), so any
exception from __bool__ is ignored; change the logic around the fromlist
handling (the has_from computation) to call try_to_bool(vm) directly and
propagate any Err returned (instead of mapping to Option via .ok()), so that
errors from __bool__ bubble up as PyErrs; locate the has_from block around the
fromlist variable and replace the .and_then(|fl|
fl.clone().try_to_bool(vm).ok()) pattern with code that handles try_to_bool(vm)
-> Result<bool, PyErr> and returns Err when it fails.

In `@crates/vm/src/stdlib/imp.rs`:
- Around line 286-309: The unsafe cast in update_code_filenames that writes to
code.code.source_path must be replaced by safe interior mutability: change the
CodeObject field source_path to an interior-mutable type (e.g., Cell<&'static
PyStrInterned> or an AtomicPtr-backed wrapper) and update its initialization
sites accordingly, then modify update_code_filenames to call the safe setter
(e.g., code.code.source_path.set(new_name) or swap the pointer atomically)
instead of performing the raw pointer write; ensure recursive traversal (the
loop calling update_code_filenames on nested PyCode) and any synchronization via
existing VM locks (IMP_LOCK) remain correct.
🧹 Nitpick comments (4)
crates/vm/src/vm/interpreter.rs (1)

466-521: Consider extracting a small helper to reduce repetition in alias injection.

The find-then-push pattern is repeated three times with minor variations. A small closure or helper could reduce the boilerplate while keeping intent clear.

♻️ Suggested refactor
     // Collect and add frozen module aliases for test modules
     let mut entries: Vec<_> = iter.collect();
+
+    let find_code = |entries: &[(&'static str, FrozenModule)], name: &str| {
+        entries.iter().find(|(n, _)| *n == name).map(|(_, m)| m.code)
+    };
+
-    if let Some(hello_code) = entries
-        .iter()
-        .find(|(n, _)| *n == "__hello__")
-        .map(|(_, m)| m.code)
-    {
+    if let Some(hello_code) = find_code(&entries, "__hello__") {
         entries.push(("__hello_alias__", FrozenModule { code: hello_code, package: false }));
         entries.push(("__phello_alias__", FrozenModule { code: hello_code, package: true }));
         entries.push(("__phello_alias__.spam", FrozenModule { code: hello_code, package: false }));
     }
-    if let Some(code) = entries
-        .iter()
-        .find(|(n, _)| *n == "__phello__")
-        .map(|(_, m)| m.code)
-    {
+    if let Some(code) = find_code(&entries, "__phello__") {
         entries.push(("__phello__.__init__", FrozenModule { code, package: false }));
     }
-    if let Some(code) = entries
-        .iter()
-        .find(|(n, _)| *n == "__phello__.ham")
-        .map(|(_, m)| m.code)
-    {
+    if let Some(code) = find_code(&entries, "__phello__.ham") {
         entries.push(("__phello__.ham.__init__", FrozenModule { code, package: false }));
     }
crates/vm/src/import.rs (1)

242-265: Prefer Option<&PyObjectRef> over &Option<PyObjectRef>.

Clippy flags &Option<T> in function signatures (clippy::ref_option). Using Option<&PyObjectRef> is more idiomatic—it avoids an unnecessary indirection and callers can simply pass spec.as_ref().

♻️ Proposed fix
-pub(crate) fn get_spec_file_origin(
-    spec: &Option<PyObjectRef>,
-    vm: &VirtualMachine,
-) -> Option<String> {
-    let spec = spec.as_ref()?;
+pub(crate) fn get_spec_file_origin(
+    spec: Option<&PyObjectRef>,
+    vm: &VirtualMachine,
+) -> Option<String> {
+    let spec = spec?;

As per coding guidelines, "Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints that are introduced by your changes."

crates/vm/src/frame.rs (1)

3298-3309: Unnecessary per-call string allocation — prefer &str or interned identifiers.

vm.ctx.new_str("__spec__") and vm.ctx.new_str("_initializing") allocate new PyStr objects every time is_module_initializing is called. Elsewhere in this file (e.g., line 278), raw &str is passed directly to get_attr. The same pattern works here.

♻️ Proposed fix
 fn is_module_initializing(module: &PyObject, vm: &VirtualMachine) -> bool {
-    let Ok(spec) = module.get_attr(&vm.ctx.new_str("__spec__"), vm) else {
+    let Ok(spec) = module.get_attr("__spec__", vm) else {
         return false;
     };
     if vm.is_none(&spec) {
         return false;
     }
-    let Ok(initializing_attr) = spec.get_attr(&vm.ctx.new_str("_initializing"), vm) else {
+    let Ok(initializing_attr) = spec.get_attr("_initializing", vm) else {
         return false;
     };
     initializing_attr.try_to_bool(vm).unwrap_or(false)
 }
crates/vm/src/stdlib/imp.rs (1)

254-254: Return type readability: consider a type alias for this complex tuple.

Option<(Option<PyRef<PyMemoryView>>, bool, Option<PyStrRef>)> is hard to parse at a glance. A named struct (even file-local) would improve readability, though this is non-blocking.

- Emit IMPORT_FROM instead of LOAD_ATTR for `import a.b.c as m`
- Add "partially initialized module" error for circular imports
- Add "cannot access submodule" error for initializing submodules
- Implement script shadowing detection with "consider renaming" hint
  - Detect user scripts shadowing stdlib/third-party modules
  - Compute original sys.path[0] from sys.argv[0]
  - Check sys.stdlib_module_names for stdlib detection
  - Respect safe_path setting
- Implement _imp._fix_co_filename for code source_path rewriting
- Add data parameter to _imp.get_frozen_object with marshal deser
- Fix import_from: check __spec__.has_location before using origin
- Set ImportError.path attribute on import failures
- Fix import_star error messages for non-str __all__/__dict__ items
- Always call builtins.__import__ in import_inner
Previously, try_to_bool errors were silently swallowed via .ok(),
causing fromlist with broken __bool__ to default to false.
@github-actions

github-actions Bot commented Feb 8, 2026

Copy link
Copy Markdown
Contributor

📦 Library Dependencies

The following Lib/ modules were modified. Here are their dependencies:

(module 'hello_only' not found)

[x] test: cpython/Lib/test/test_frozen.py

dependencies:

dependent tests: (no tests depend on frozen)

[ ] test: cpython/Lib/test/test_import (TODO: 3)

dependencies:

dependent tests: (no tests depend on import)

[x] lib: cpython/Lib/importlib
[ ] test: cpython/Lib/test/test_importlib (TODO: 16)

dependencies:

  • importlib

dependent tests: (67 tests)

  • importlib: test_bdb test_cmd_line_script test_codecs test_compileall test_ctypes test_doctest test_frozen test_hashlib test_importlib test_inspect test_linecache test_multiprocessing_main_handling test_pkgutil test_py_compile test_reprlib test_runpy test_sundry test_support test_tomllib test_unittest test_zipfile test_zipimport test_zoneinfo
    • ensurepip: test_ensurepip test_venv
    • inspect: test_abc test_argparse test_asyncgen test_builtin test_code test_collections test_coroutines test_decimal test_enum test_functools test_generators test_grammar test_ntpath test_operator test_patma test_posixpath test_signal test_traceback test_types test_typing test_unittest test_yield_from
      • asyncio: test_asyncio test_contextlib_async test_logging test_os test_sys_settrace test_unittest
      • dataclasses: test__colorize test_genericalias test_pprint test_regrtest
      • rlcompleter: test_rlcompleter
      • trace: test_trace
      • xmlrpc.server: test_docxmlrpc test_xmlrpc
    • py_compile: test_importlib
    • pyclbr: test_pyclbr
    • zipfile: test_shutil test_zipapp test_zipfile test_zipfile64

Legend:

  • [+] path exists in CPython
  • [x] up-to-date, [ ] outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/vm/src/import.rs`:
- Around line 288-325: Compute and store the original sys.path[0] once at VM
initialization (in init_importlib_base or VM::new) instead of recomputing it
inside import.rs; add a new VM field (e.g., cached_sys_path_0: Option<String> or
sys_path_0) and initialize it by reading sys_module.argv[0] once using the
existing logic, then update the code in import.rs to read vm.cached_sys_path_0
(falling back to previous behavior or returning false if None) rather than
re-deriving from vm.sys_module.get_attr("argv") on every call; ensure the field
name you add matches uses in import-related functions (e.g., replace local
sys_path_0 computation with vm.cached_sys_path_0) and handle empty-string and
current_dir cases the same as before.
🧹 Nitpick comments (1)
crates/vm/src/import.rs (1)

504-542: Silently swallowing rich_compare_bool error on line 519 is acceptable for a warning-only path, but worth a brief comment.

The unwrap_or(false) on the rich_compare_bool result means a broken __ne__ on a package string silently suppresses the deprecation warning rather than propagating the error. This is intentional (we're inside a best-effort warning block), but a one-line // Ignore comparison errors; this is only advisory. would clarify intent for future readers.

Hide details View details @youknowone youknowone merged commit 470bd59 into RustPython:main Feb 8, 2026
14 checks passed
@youknowone youknowone deleted the test_import branch February 8, 2026 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant