initial sandbox #7035
Conversation
📝 WalkthroughWalkthroughAdds a new Cargo feature flag Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI / Binary
participant VM as VirtualMachine
participant FS as Filesystem
participant Loader as Importlib/Loader
participant StdLib as stdlib modules
Note over CLI,VM: host_env enabled?
CLI->>VM: run_any_file(path) (host_env)
VM->>FS: stat/read path
FS-->>VM: path type (.py / dir / .pyc)
VM->>Loader: choose loader / set_main_loader
Loader-->>VM: code object
VM->>StdLib: initialize modules (host_env gated)
VM->>VM: execute code
VM-->>CLI: exit status
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Sorry, something went wrong.
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin sandbox |
Sorry, something went wrong.
c1c187b to
5069248
Compare
February 8, 2026 01:44
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/exceptions.rs (1)
2148-2158: ⚠️ Potential issue | 🟡 MinorSetter allows storing sentinel value
-1, causing subsequent access to raiseAttributeError.If Python code sets
exc.characters_written = -1, the setter stores it without validation. The getter then treats-1as a sentinel for "not set" and raisesAttributeError. This diverges from CPython, where any explicitly set integer value (including-1) is retrievable as-is. The setter should validate that input values differ from the sentinel before storing.
🤖 Fix all issues with AI agents
In `@crates/vm/src/lib.rs`:
- Around line 70-71: The import of the ospath module in
crates/vm/src/stdlib/io.rs must be conditionally compiled to match the module
declaration (pub mod ospath) which is #[cfg(feature = "host_env")] in lib.rs;
locate the import statement that references ospath (the use/mod of ospath around
line ~5059 in stdlib/io.rs) and prefix it with #[cfg(feature = "host_env")] so
the import is only compiled when the host_env feature is enabled.
In `@crates/vm/src/stdlib/mod.rs`:
- Around line 101-104: The two cfg attributes both enable posix::module_def(ctx)
on WASI causing a duplicate; update the second attribute so it excludes WASI
(e.g. change #[cfg(all(feature = "host_env", not(any(unix, windows)))] to
exclude target_os = "wasi", such as #[cfg(all(feature = "host_env",
not(any(unix, windows, target_os = "wasi"))))]) so posix::module_def(ctx) is
only added once.
In `@crates/vm/src/stdlib/sys.rs`:
- Around line 122-132: The readline implementation currently ignores the
OptionalArg<isize> _size; update TextIOBase::readline (function readline in
sys.rs) to explicitly reject non-default size values instead of ignoring them:
before reading from stdin, check if _size is Present (i.e., Some) and if so
return an appropriate VM error (e.g., a NotImplementedError or ValueError
created via VirtualMachine like vm.new_not_implemented or vm.new_value_error)
with a clear message such as "readline with size not supported in sandbox";
otherwise proceed with the existing fd check and stdin.read_line flow.
- Around line 108-120: The write method currently swallows IO errors; change the
write implementation in the Sys* struct's #[pymethod] fn write(&self, s:
PyStrRef, vm: &VirtualMachine) -> PyResult<usize> to propagate errors from
stdout/stderr by replacing the `let _ = ...` ignores with
`.write_all(bytes).map_err(|e| vm.new_os_error(e.to_string()))?` so any write
failures become Py OSError; likewise implement a #[pymethod] fn flush(&self, vm:
&VirtualMachine) -> PyResult<()> (add the vm parameter) and call the appropriate
stream's `flush().map_err(|e| vm.new_os_error(e.to_string()))?` instead of
ignoring errors, returning Ok(()) on success.
🧹 Nitpick comments (8)
src/lib.rs (1)
176-189: Sandbox fallback design is sound but loses OS error detail.The sandbox path correctly keeps filesystem I/O in the binary (trusted host) while the VM only sees source strings. Two minor observations:
std::fs::read_to_stringassumes UTF-8, which is the Python 3 default source encoding — fine for the common case.- Line 187:
err.to_string()discards the originalio::ErrorKind/errno. Consider preserving it if the VM'snew_os_errorsupports it, though this is low priority for a sandbox fallback.Optional: preserve errno in sandbox error path
- Err(err) => Err(vm.new_os_error(err.to_string())), + Err(err) => { + let msg = format!("Can't open file '{}': {}", path, err); + Err(vm.new_os_error(msg)) + }extra_tests/snippets/sandbox_smoke.py (1)
19-21: Minor:sys.stderris not tested for writability.stdin/stdout are tested but stderr is mentioned in the docstring (line 4) and available via
SandboxStdio. Consider addingassert sys.stderr.writable()for completeness.Optional: add stderr assertion
assert sys.stdout.writable() assert sys.stdin.readable() +assert sys.stderr.writable() assert sys.stdout.fileno() == 1crates/vm/src/vm/python_run.rs (2)
149-155:Read::readmay short-read even on regular files.
file.read(&mut buf)?is not guaranteed to fill the 2-byte buffer in a single call. While exceedingly rare for regular files,Read::read_exactwould be the robust choice here.Suggested fix
- use std::io::Read; - if file.read(&mut buf)? != 2 { + use std::io::Read; + if file.read_exact(&mut buf).is_err() {
109-112: Commented-out deprecation onrun_script.The
#[deprecated]attribute is commented out. If this is intentional for now, consider adding a brief// TODOexplaining when the deprecation should be activated. Otherwise it's easy for this to be forgotten.crates/vm/src/exceptions.rs (4)
1307-1345:build()relies onexpect()for infallible paths — verify they truly can't fail.Lines 1338, 1341, and 1343 each call
.expect("new_os_error usage error"). Whilepy_newalways returnsOk(...)for the current implementation andinto_ref_with_typeshould succeed when using the sameexc_type,slot_initis more complex and could theoretically fail (e.g., on a subclass that overrides__init__). If any of these paths fail, the panic will be hard to diagnose.Consider whether
slot_initcould fail for exotic subclasses and, if so, whether returning aPyResultfrombuildwould be more appropriate. If these are truly infallible, a brief comment on why would help future maintainers.
1320-1335: On non-Windows, a dummyNoneis injected at position 3 (winerror slot) into the args vec.When
errnoisSome, the args vector is always 5 elements:[errno, strerror, filename, winerror_or_none, filename2]. On non-Windows,winerrorisvm.ctx.none(). This meansslot_initreceives 5 args on all platforms — which works becauseslot_initonly processesargs[3]inside a#[cfg(windows)]block. However, this couples the builder's arg layout toslot_init's conditional parsing.This is fine as-is, but a brief comment noting that the 4th position is always
Noneon non-Windows (and ignored byslot_init) would prevent confusion.
1375-1385:libc::strerroris not guaranteed thread-safe.On Unix,
libc::strerrormay return a pointer to a shared static buffer (for unknown errno values). While most modern libc implementations use static string literals for known errno values, the general case is not thread-safe. Since the result is immediately copied viato_string_lossy().into_owned(), the risk is low, but it's worth noting.If thread-safety is a concern,
strerror_r(POSIX) orstrerror_lwould be safer alternatives, though they come with their own portability complications.
1926-1926: Minor: Cloning the atomic ref just to check forNone.
exc.filename.to_owned()clones thePyObjectRefbehind the atomic ref only to checkis_none. Consider using a read/deref approach if available to avoid the unnecessary clone:-let has_filename = exc.filename.to_owned().filter(|f| !vm.is_none(f)).is_some(); +let has_filename = exc.filename.deref().is_some_and(|f| !vm.is_none(f));(Depends on whether
deref()returns a suitable reference — adjust as needed for thePyAtomicRefAPI.)
Sorry, something went wrong.
ShaharNaveh
left a comment
There was a problem hiding this comment.
I like the idea a lot
Sorry, something went wrong.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/vm/mod.rs (1)
326-394: ⚠️ Potential issue | 🔴 CriticalFix conflicting cfg conditions for
make_stdiobranches.The
host_envandstdiofeatures are independent in Cargo.toml (neither depends on the other), but the three cfg branches are not mutually exclusive. Whenhost_envis enabled andstdiois disabled, both the first branch (#[cfg(feature = "host_env")]) and the third branch (#[cfg(not(feature = "stdio"))]) match simultaneously, causing a duplicatemake_stdiobinding and a compile error.Change the third condition from
#[cfg(not(feature = "stdio"))]to#[cfg(all(not(feature = "host_env"), not(feature = "stdio")))]to ensure all three branches are truly mutually exclusive.
🤖 Fix all issues with AI agents
In `@crates/vm/src/stdlib/sys.rs`:
- Around line 126-141: The readline implementation in SandboxStdio::readline
treats the size parameter as a byte index and calls line.truncate(size as
usize), which can panic on UTF‑8 boundaries for non‑ASCII input; change the
truncation to operate on character count instead of raw bytes—compute the byte
boundary for the first N characters (size) or rebuild the string with
line.chars().take(size as usize).collect(), then return that safely truncated
String while keeping the existing checks (handle size <= 0 and OptionalArg
unwrap as before).
🧹 Nitpick comments (1)
crates/vm/src/exceptions.rs (1)
1375-1385:libc::strerroris not guaranteed thread-safe on all platforms.On some platforms,
strerrorreturns a pointer to a shared static buffer. While glibc uses thread-local storage, POSIX doesn't guarantee thread-safety. The immediate.to_string_lossy().into_owned()conversion minimizes the risk window, butstrerror_r(or Rust'sstd::io::Error::from_raw_os_error(errno).to_string()) would be a safer alternative.
Sorry, something went wrong.
Introduce a `host_env` feature flag that gates all host environment access (filesystem, network, signals, processes). When disabled, the VM operates in sandbox mode: - _io module always available; FileIO gated by host_env - SandboxStdio provides lightweight stdin/stdout/stderr via Rust std::io - BytesIO/StringIO/BufferedIO/TextIOWrapper work without host_env - open() returns UnsupportedOperation in sandbox - stdlib modules (os, socket, signal, etc.) gated by host_env - CI checks both host_env ON and OFF builds
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @.github/workflows/ci.yaml:
- Around line 156-167: Update the CI job to build with the compiler feature and
run sandbox snippets using cargo run instead of invoking the binary directly:
add the compiler feature to the build line (the step that currently runs cargo
build --no-default-features --features
stdlib,importlib,stdio,encodings,freeze-stdlib) so it becomes consistent with
the check steps, and replace the two direct binary executions
(target/debug/rustpython extra_tests/snippets/sandbox_smoke.py and
target/debug/rustpython extra_tests/snippets/stdlib_re.py) in the "sandbox smoke
test" step with cargo run --bin rustpython -- <snippet-path> (run in debug mode)
to follow the extra_tests/snippets guideline and ensure faster, reproducible
builds.
In `@crates/vm/src/exceptions.rs`:
- Around line 1354-1398: The impl of ToOSErrorBuilder for std::io::Error
currently always calls OSErrorBuilder::with_errno(errno, msg, vm) even when
posix_errno() returns 0; change to check the errno returned by posix_errno() and
only call with_errno when errno > 0, otherwise call
OSErrorBuilder::with_subtype(py_exception_type_for_os_error, None, msg, vm)
(preserving the msg) so we don't emit OSError(errno=0); update the
to_os_error_builder function around the creation of builder (the let mut builder
= ... and subsequent #[cfg(windows)] winerror handling) to branch on errno
accordingly while keeping the winerror attachment when available.
🧹 Nitpick comments (1)
crates/vm/src/signal.rs (1)
82-83: Nit: inconsistent#[cfg]style withset_interrupt_exbelow.Lines 82–83 use two stacked
#[cfg(...)]attributes, while line 103 combines them into a single#[cfg(all(...))]. Both are semantically equivalent, but using a consistent style improves readability.♻️ Suggested consolidation
-#[cfg(unix)] -#[cfg(feature = "host_env")] +#[cfg(all(unix, feature = "host_env"))] pub(crate) fn clear_after_fork() {
Sorry, something went wrong.
f777416
into
RustPython:main
Feb 8, 2026
#4210
Summary by CodeRabbit
New Features
Tests / Chores