Add sys._stdlib_dir by youknowone · Pull Request #6798 · RustPython/RustPython

youknowone

youknowone commented
Jan 19, 2026
•

edited by coderabbitai Bot

Loading

Summary by CodeRabbit

Release Notes

Bug Fixes
- Improved standard library path resolution, particularly on Windows with enhanced lib directory detection for both text-file and symlink-based paths
Refactor
- Centralized standard library directory configuration through interpreter state management

_{✏️ Tip: You can customize this high-level summary in your review settings.}

📝 Walkthrough

Walkthrough

Introduces a new stdlib_dir field to the path configuration system, computed via platform-specific logic in getpath.rs, exposed through the sys module as _stdlib_dir, and wired during interpreter initialization to store the standard library location derived from either frozen or dynamic paths.

Changes

Cohort / File(s)	Summary
Path configuration structure `crates/vm/src/vm/setting.rs`	Added optional `stdlib_dir` field to `Paths` struct to store the standard library directory path
stdlib directory computation `crates/vm/src/getpath.rs`	Introduced private `calculate_stdlib_dir()` helper with platform-specific logic; added Step 9 to `init_path_config()` to assign computed `stdlib_dir` with existence validation
Python API exposure `crates/vm/src/stdlib/sys.rs`	Added `_stdlib_dir` attribute to `sys` module that returns `vm.state.config.paths.stdlib_dir` as a `PyObjectRef`
Interpreter initialization `src/interpreter.rs`	Replaced `vm.sys_module` hack with `PyRc`-based state access; centralized `stdlib_dir` assignment for both frozen and dynamic stdlib path flows
Windows build configuration `crates/pylib/build.rs`	Enhanced Lib path resolution to handle both symlinks and text files (git with/without symlink support); added canonicalization and `\?\` prefix stripping

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Move pylib -> crates/pylib #6225: Modifies build.rs Lib path resolution logic for platform/path handling
nt junction #6407: Strips Windows extended-length prefix (\?\) from canonicalized Lib path in build.rs
Introduce PyConfig to implement Paths correctly #6461: Extends PyConfig/Paths plumbing by adding and wiring the new stdlib_dir field through the path infrastructure

Suggested reviewers

coolreader18
arihant2math

Poem

🐰 A rabbit hops through stdlib ways,
Building paths through winding maze,
From frozen lib to dynamic ground,
stdlib_dir must now be found! 📚✨

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title 'Add sys._stdlib_dir' directly reflects the main change: adding a new Python-accessible attribute _stdlib_dir to the sys module, which is the primary public-facing feature of this changeset.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Code has been automatically formatted

The code in this PR has been formatted using:

cargo fmt --all
Please pull the latest changes before pushing again:

git pull origin stdlib_dir

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@crates/pylib/build.rs`:
- Around line 14-31: The build script currently calls
canonicalize(...).to_str().unwrap(), which can panic on non‑UTF‑8 Windows paths;
change the emission to use a lossy conversion from the canonicalized Path (e.g.,
Path::to_string_lossy on the value returned by std::fs::canonicalize(&lib_path))
and print that into the cargo:rustc-env=win_lib_path value, and if
std::fs::canonicalize(&lib_path) returns an Err emit a visible build warning via
println!("cargo:warning=...") instead of silently skipping so missing
win_lib_path is obvious; update the code paths referencing canonicalized_path
and to_str().unwrap() accordingly.

Comment on lines +14 to +31

		if cfg!(windows) {
		// On Windows, the Lib entry can be either:
		// 1. A text file containing the relative path (git without symlink support)
		// 2. A proper symlink (git with symlink support)
		// We handle both cases to resolve to the actual Lib directory.
		let lib_path = if let Ok(real_path) = std::fs::read_to_string("Lib") {
		// Case 1: Text file containing relative path
		std::path::PathBuf::from(real_path.trim())
		} else {
		// Case 2: Symlink or directory - canonicalize directly
		std::path::PathBuf::from("Lib")
		};

		if let Ok(canonicalized_path) = std::fs::canonicalize(&lib_path) {
		// Strip the extended path prefix (\\?\) that canonicalize adds on Windows
		let path_str = canonicalized_path.to_str().unwrap();
		let path_str = path_str.strip_prefix(r"\\?\").unwrap_or(path_str);
		println!("cargo:rustc-env=win_lib_path={path_str}");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid panic on non‑UTF‑8 Windows paths.
to_str().unwrap() can panic on valid Windows paths containing non‑UTF‑8 sequences, breaking the build. Prefer a lossy conversion and consider warning when canonicalization fails so missing win_lib_path isn’t silent.

🔧 Proposed safer emission

-        if let Ok(canonicalized_path) = std::fs::canonicalize(&lib_path) {
-            // Strip the extended path prefix (\\?\) that canonicalize adds on Windows
-            let path_str = canonicalized_path.to_str().unwrap();
-            let path_str = path_str.strip_prefix(r"\\?\").unwrap_or(path_str);
-            println!("cargo:rustc-env=win_lib_path={path_str}");
-        }
+        if let Ok(canonicalized_path) = std::fs::canonicalize(&lib_path) {
+            // Strip the extended path prefix (\\?\) that canonicalize adds on Windows
+            let path_str = canonicalized_path.to_string_lossy();
+            let path_str = path_str.strip_prefix(r"\\?\").unwrap_or(&path_str);
+            println!("cargo:rustc-env=win_lib_path={path_str}");
+        } else {
+            println!("cargo:warning=failed to canonicalize Lib path: {:?}", lib_path);
+        }

🤖 Prompt for AI Agents

In `@crates/pylib/build.rs` around lines 14 - 31, The build script currently calls
canonicalize(...).to_str().unwrap(), which can panic on non‑UTF‑8 Windows paths;
change the emission to use a lossy conversion from the canonicalized Path (e.g.,
Path::to_string_lossy on the value returned by std::fs::canonicalize(&lib_path))
and print that into the cargo:rustc-env=win_lib_path value, and if
std::fs::canonicalize(&lib_path) returns an Err emit a visible build warning via
println!("cargo:warning=...") instead of silently skipping so missing
win_lib_path is obvious; update the code paths referencing canonicalized_path
and to_str().unwrap() accordingly.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@crates/codegen/src/compile.rs`:
- Around line 6983-6986: The DictMerge path currently uses the first mapping as
the accumulator which can mutate a caller-provided **kwargs dict; change
compile.rs so that when merging multiple mappings (the loop around emit!(self,
Instruction::DictMerge { index: 1 })), you first allocate a fresh empty
accumulator and merge each mapping into that fresh map instead of using the
first input as the target; specifically, before the for-loop that emits
Instruction::DictMerge, emit an instruction to create a new empty dict (so the
accumulator is new) and then merge all incoming `**` groups into that fresh
accumulator (update the code that emits Instruction::DictMerge and the
surrounding logic in the same block to use the new accumulator).

🧹 Nitpick comments (1)

crates/vm/src/frame.rs (1)
962-962: Inconsistent downcast pattern with DictUpdate.

DictUpdate (line 934) uses a safe downcast with .expect():
let dict = dict.downcast_ref::<PyDict>().expect("exact dict expected");
But DictMerge uses an unsafe downcast without a SAFETY comment explaining the invariant. Consider using the same safe pattern as DictUpdate for consistency and defensive coding, or at minimum add a SAFETY comment like the other unsafe downcasts in this file (e.g., ListAppend, ListExtend).
♻️ Suggested fix for consistency
-                let dict: &Py<PyDict> = unsafe { dict_ref.downcast_unchecked_ref() };
+                let dict: &Py<PyDict> = unsafe {
+                    // SAFETY: compiler guarantees correct type
+                    dict_ref.downcast_unchecked_ref()
+                };
Or use the safe pattern:
-                let dict: &Py<PyDict> = unsafe { dict_ref.downcast_unchecked_ref() };
+                let dict = dict_ref.downcast_ref::<PyDict>().expect("exact dict expected");

Comment on lines +6983 to +6986

		// Merge all dicts: first dict is accumulator, merge rest into it
		for _ in 1..size {
		emit!(self, Instruction::DictMerge { index: 1 });
		}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Avoid mutating user-provided **kwargs when merging.
DictMerge uses the first mapping as the accumulator. If the first group is **kwargs, this mutates the caller’s dict when multiple ** groups exist (observable vs CPython, which builds a fresh dict). Please ensure the accumulator is always a new map before merging any ** inputs.

🔧 Possible fix (merge into a fresh accumulator)

-        let mut size = 0;
-        let groupby = keywords.iter().chunk_by(|e| e.arg.is_none());
-        for (is_unpacking, sub_keywords) in &groupby {
-            if is_unpacking {
-                for keyword in sub_keywords {
-                    self.compile_expression(&keyword.value)?;
-                    size += 1;
-                }
-            } else {
-                let mut sub_size = 0;
-                for keyword in sub_keywords {
-                    if let Some(name) = &keyword.arg {
-                        self.emit_load_const(ConstantData::Str {
-                            value: name.as_str().into(),
-                        });
-                        self.compile_expression(&keyword.value)?;
-                        sub_size += 1;
-                    }
-                }
-                emit!(self, Instruction::BuildMap { size: sub_size });
-                size += 1;
-            }
-        }
-        if size > 1 {
-            // Merge all dicts: first dict is accumulator, merge rest into it
-            for _ in 1..size {
-                emit!(self, Instruction::DictMerge { index: 1 });
-            }
-        }
+        let mut have_accumulator = false;
+        let groupby = keywords.iter().chunk_by(|e| e.arg.is_none());
+        for (is_unpacking, sub_keywords) in &groupby {
+            if !have_accumulator {
+                emit!(self, Instruction::BuildMap { size: 0 }); // fresh accumulator
+                have_accumulator = true;
+            }
+            if is_unpacking {
+                for keyword in sub_keywords {
+                    self.compile_expression(&keyword.value)?;
+                    emit!(self, Instruction::DictMerge { index: 1 });
+                }
+            } else {
+                let mut sub_size = 0;
+                for keyword in sub_keywords {
+                    if let Some(name) = &keyword.arg {
+                        self.emit_load_const(ConstantData::Str {
+                            value: name.as_str().into(),
+                        });
+                        self.compile_expression(&keyword.value)?;
+                        sub_size += 1;
+                    }
+                }
+                emit!(self, Instruction::BuildMap { size: sub_size });
+                emit!(self, Instruction::DictMerge { index: 1 });
+            }
+        }

🤖 Prompt for AI Agents

In `@crates/codegen/src/compile.rs` around lines 6983 - 6986, The DictMerge path
currently uses the first mapping as the accumulator which can mutate a
caller-provided **kwargs dict; change compile.rs so that when merging multiple
mappings (the loop around emit!(self, Instruction::DictMerge { index: 1 })), you
first allocate a fresh empty accumulator and merge each mapping into that fresh
map instead of using the first input as the target; specifically, before the
for-loop that emits Instruction::DictMerge, emit an instruction to create a new
empty dict (so the accumulator is new) and then merge all incoming `**` groups
into that fresh accumulator (update the code that emits Instruction::DictMerge
and the surrounding logic in the same block to use the new accumulator).

Add sys._stdlib_dir by youknowone · Pull Request #6798 · RustPython/RustPython

youknowone commented Jan 19, 2026 • edited by coderabbitai Bot Loading

Summary by CodeRabbit

Release Notes

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

youknowone commented
Jan 19, 2026
•

edited by coderabbitai Bot

Loading