fix scandir/lstat for windows by youknowone · Pull Request #6357 · RustPython/RustPython
Walkthrough
Changes to crates/vm/src/stdlib/os.rs optimize Windows directory iteration by pre-caching lstat metadata in DirEntry objects using OnceCell. The public lstat function signature changed from accepting OsPathOrFd<'_> to OsPath, with implementation forwarding to stat with FollowSymlinks(false). Platform-conditional cache initialization handles Windows and non-Windows paths separately.
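The platform-conditional initialization described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it assumes the standard library's `std::cell::OnceCell` rather than whatever cell type RustPython uses, and `StatLike`, `Entry`, `cheap_metadata`, and `make_entry` are hypothetical names invented for the example.

```rust
use std::cell::OnceCell;

/// Hypothetical stand-in for a stat result (not RustPython's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct StatLike {
    pub size: u64,
}

/// Hypothetical, simplified directory entry with an lstat cache.
pub struct Entry {
    pub name: String,
    pub lstat: OnceCell<StatLike>,
}

/// Hypothetical stand-in for the cheap metadata lookup done during iteration.
/// Only compiled on Windows, where this sketch fills the cache eagerly.
#[cfg(windows)]
fn cheap_metadata(name: &str) -> Option<StatLike> {
    Some(StatLike { size: name.len() as u64 })
}

pub fn make_entry(name: &str) -> Entry {
    // Windows: try to fill the cell up front; a failed lookup leaves it empty.
    #[cfg(windows)]
    let lstat = {
        let cell = OnceCell::new();
        if let Some(st) = cheap_metadata(name) {
            // `set` can only fail if the cell is already filled; it is fresh here.
            let _ = cell.set(st);
        }
        cell
    };

    // Other platforms: start with an empty cell, to be filled on first use.
    #[cfg(not(windows))]
    let lstat = OnceCell::new();

    Entry {
        name: name.to_string(),
        lstat,
    }
}
```

Calling `make_entry("a.txt")` yields an entry whose `lstat` cell is already populated on Windows and empty elsewhere, so later code can treat both cases uniformly through the cell.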
Changes
| Cohort / File(s) | Summary |
|---|---|
| Windows lstat pre-caching in directory iteration: crates/vm/src/stdlib/os.rs | Introduced Windows-specific pre-caching of lstat data during scandir iteration via win32_xstat metadata, storing result in OnceCell per entry. DirEntry construction now uses locally computed pathval and assigns pre-cached lstat cell. Non-Windows platforms initialize lstat as empty OnceCell. Public lstat function signature changed from OsPathOrFd<'_> to OsPath, with call forwarding to stat(..., FollowSymlinks(false), vm). |
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Areas requiring extra attention:
- Public API signature change for lstat(): verify all call sites are compatible with the new OsPath-only parameter
- Windows-specific pre-caching logic: ensure win32_xstat integration correctly populates OnceCell and handles edge cases (removed files, permission errors)
- Platform-conditional initialization: confirm non-Windows codepaths maintain original OnceCell behavior and performance characteristics
- DirEntry construction: validate that pre-cached vs. empty lstat assignments don't introduce cache inconsistencies or memory leaks
Possibly related PRs
- Fix os.remove #6352: Modifies the same DirEntry/ScandirIterator and Windows-specific scandir/lstat handling with DirEntry construction and Windows metadata logic changes.
- windows umask, win32_xstat_slow_impl, fake EXT_SUFFIX #6340: Implements the win32_xstat functionality that this PR integrates into lstat pre-caching for directory iteration.
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly describes the main changes: fixing scandir/lstat functionality for Windows platform. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
📜 Recent review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/test/test_os.py is excluded by !Lib/**
📒 Files selected for processing (1)
crates/vm/src/stdlib/os.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/os.rs
🧠 Learnings (3)
📓 Common learnings
Learnt from: moreal
Repo: RustPython/RustPython PR: 5847
File: vm/src/stdlib/stat.rs:547-567
Timestamp: 2025-06-27T14:47:28.810Z
Learning: In RustPython's stat module implementation, platform-specific constants like SF_SUPPORTED and SF_SYNTHETIC should be conditionally declared only for the platforms where they're available (e.g., macOS), following CPython's approach of optional declaration using #ifdef checks rather than providing fallback values for other platforms.
Applied to files:
crates/vm/src/stdlib/os.rs
🧬 Code graph analysis (1)
crates/vm/src/stdlib/os.rs (1)
crates/vm/src/windows.rs (1)
win32_xstat (86-92)
🔇 Additional comments (3)
crates/vm/src/stdlib/os.rs (3)
723-740: Windows lstat pre-caching optimization looks correct.
The approach of pre-caching lstat metadata from the directory entry on Windows is appropriate. Silently ignoring win32_xstat failures is acceptable since DirEntry.stat() will fall back to computing on demand when the cell is empty. This aligns with CPython's optimization strategy for Windows scandir.
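The fallback behaviour noted in this comment (a failed eager lookup leaves the cell empty, and the value is computed on first access) can be sketched as below. This is an illustrative model only: it assumes `std::cell::OnceCell`, and `Entry`, `slow_lstat`, `lstat_size`, and the constant `7` returned by the slow path are hypothetical placeholders, not RustPython's implementation.

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical, simplified entry that caches a single u64 "size" as its lstat data.
pub struct Entry {
    lstat: OnceCell<u64>,
    /// Counts how often the slow path ran, for demonstration only.
    slow_calls: Cell<u32>,
}

impl Entry {
    /// `precached` models the outcome of the eager lookup during iteration.
    /// An `Err` is dropped silently, which leaves the cell empty.
    pub fn new(precached: Result<u64, String>) -> Self {
        let lstat = OnceCell::new();
        if let Ok(size) = precached {
            let _ = lstat.set(size);
        }
        Entry {
            lstat,
            slow_calls: Cell::new(0),
        }
    }

    /// Hypothetical slow path standing in for an on-demand lstat call.
    fn slow_lstat(&self) -> u64 {
        self.slow_calls.set(self.slow_calls.get() + 1);
        7
    }

    /// Return the cached value, computing and storing it once if the cell is empty.
    pub fn lstat_size(&self) -> u64 {
        *self.lstat.get_or_init(|| self.slow_lstat())
    }

    pub fn slow_calls(&self) -> u32 {
        self.slow_calls.get()
    }
}
```

With `Entry::new(Ok(123))` the slow path never runs; with `Entry::new(Err(..))` it runs once on the first `lstat_size()` call and the result is reused afterwards.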
742-754: DirEntry construction properly uses pre-cached lstat.
The pathval extraction moved earlier to share with the pre-caching logic, and the lstat field correctly uses the pre-cached OnceCell. The stat field remains separately cached since it may differ when following symlinks.
994-1001: Signature change correctly aligns with CPython's os.lstat() specification.
Python's os.lstat() accepts only path-like objects as the primary argument; it does not support file descriptors as the path. For file descriptor access, Python provides os.fstat(fd). The previous OsPathOrFd<'_> parameter was overly permissive and inconsistent with CPython's API. This change correctly restricts lstat() to paths only, matching the Python specification.
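The forwarding shape discussed here (a path-only lstat that delegates to stat with symlink following disabled) might look roughly like the sketch below. It is written against `std::fs` rather than RustPython's internals: the `FollowSymlinks` newtype only mirrors the name used in the review, and these `stat` and `lstat` functions are hypothetical simplifications that take a plain `&Path` and omit the `vm` argument.

```rust
use std::fs::{self, Metadata};
use std::io;
use std::path::Path;

/// Hypothetical newtype mirroring the `FollowSymlinks(false)` call in the review.
pub struct FollowSymlinks(pub bool);

/// Simplified stat: follows symlinks or not, depending on the flag.
pub fn stat(path: &Path, follow: FollowSymlinks) -> io::Result<Metadata> {
    if follow.0 {
        fs::metadata(path)
    } else {
        // Reports on the link itself instead of its target.
        fs::symlink_metadata(path)
    }
}

/// Path-only lstat: no file-descriptor variant, just a forward to `stat`.
pub fn lstat(path: &Path) -> io::Result<Metadata> {
    stat(path, FollowSymlinks(false))
}
```

Because `lstat` takes only a path type, passing a file descriptor is rejected at the signature level instead of inside the function body.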