◐ Shell
reader mode source ↗
Skip to content

initial sandbox #7035

Merged
youknowone merged 2 commits into
RustPython:mainfrom
youknowone:sandbox
Feb 8, 2026
Merged

initial sandbox #7035
youknowone merged 2 commits into
RustPython:mainfrom
youknowone:sandbox

Conversation

@youknowone

@youknowone youknowone commented Feb 7, 2026

Copy link
Copy Markdown
Member

#4210

Summary by CodeRabbit

  • New Features

    • Introduced a host vs sandbox runtime mode: many stdlib modules (os, socket, signal, file I/O, ssl, etc.) are now available only in host mode; sandbox mode is restricted.
    • Added a lightweight sandbox stdio implementation for non-host environments.
  • Tests / Chores

    • CI now validates sandbox builds and runs a new sandbox smoke test to verify runtime behavior.

@coderabbitai

coderabbitai Bot commented Feb 7, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Adds a new Cargo feature flag host_env; gates many host-specific stdlib modules, file-execution, signal and stdio behavior behind it; centralizes OSError construction via an OSErrorBuilder path; introduces SandboxStdio for non-host builds; adds CI sandbox checks and a sandbox smoke test.

Changes

Cohort / File(s) Summary
Workspace & Crate Features
Cargo.toml, crates/vm/Cargo.toml, crates/stdlib/Cargo.toml
Introduce host_env feature and add it to default feature sets.
Stdlib Module Gating
crates/stdlib/src/lib.rs, crates/vm/src/stdlib/mod.rs, crates/vm/src/stdlib/os.rs, crates/vm/src/stdlib/io.rs, crates/vm/src/stdlib/sys.rs, crates/vm/src/stdlib/thread.rs
Replace many platform-only cfgs with feature = "host_env" (often combined with OS checks); add/remove module gates and adjust exported module_def entries; add SandboxStdio-related sys changes.
VM Core & Initialization
crates/vm/src/vm/mod.rs, crates/vm/src/vm/python_run.rs, crates/vm/src/lib.rs, src/lib.rs
Gate file-run and stdio construction behind host_env; provide sandbox fallback stdio; move file-execution helpers into a host-gated file_run module; adjust importlib initialization gating.
Exceptions & Conversions
crates/vm/src/exceptions.rs, crates/vm/src/stdlib/os.rs, crates/vm/src/stdlib/io.rs
Centralize OS error → Python exception construction using a new OSErrorBuilder and ToOSErrorBuilder paths; add IntoPyException/ToPyException impls funneling through builder; remove duplicate/older impls from stdlib modules.
Import System
crates/vm/src/import.rs
Add #[cfg(feature = "host_env")] init_importlib_package and move TryFromObject/PyListRef imports/usage into the host-gated function; make OS builtin import unconditional.
Signal & Interrupts
crates/vm/src/signal.rs
Tighten cfgs for clear_after_fork and set_interrupt_ex to require feature = "host_env" alongside existing platform gates.
Testing & CI & Misc
extra_tests/snippets/sandbox_smoke.py, .github/workflows/ci.yaml, .cspell.json
Add a sandbox smoke test, CI sandbox compilation and smoke-test steps (Linux-only), and add "rustix" to cspell dictionary.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as CLI / Binary
  participant VM as VirtualMachine
  participant FS as Filesystem
  participant Loader as Importlib/Loader
  participant StdLib as stdlib modules

  Note over CLI,VM: host_env enabled?
  CLI->>VM: run_any_file(path) (host_env)
  VM->>FS: stat/read path
  FS-->>VM: path type (.py / dir / .pyc)
  VM->>Loader: choose loader / set_main_loader
  Loader-->>VM: code object
  VM->>StdLib: initialize modules (host_env gated)
  VM->>VM: execute code
  VM-->>CLI: exit status
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • fanninpm
  • coolreader18

Poem

🐇 I hopped into code with a curious grin,
gates named host_env let host modules in.
Builders craft OSErrors, sandbox sleeps yet sings,
CI runs the smoke test, and tiny joy it brings.
✨ — a rabbit’s thump for neat, gated things.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The PR title 'initial sandbox' is vague and generic. While sandbox functionality is implemented throughout, the title lacks specificity about what 'initial sandbox' means—it could refer to a feature gate, a new sandbox mode, or foundational infrastructure. Consider revising to a more descriptive title such as 'Add host_env feature gate for sandbox mode' or 'Introduce sandbox execution environment' to clarify the primary change.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions Bot commented Feb 7, 2026

Copy link
Copy Markdown
Contributor

Code has been automatically formatted

The code in this PR has been formatted using:

  • ruff check --select I --fix
    Please pull the latest changes before pushing again:
git pull origin sandbox

@youknowone youknowone force-pushed the sandbox branch 2 times, most recently from c1c187b to 5069248 Compare February 8, 2026 01:44
@youknowone youknowone marked this pull request as ready for review February 8, 2026 07:11
@youknowone youknowone changed the title sandboxing Feb 8, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/exceptions.rs (1)

2148-2158: ⚠️ Potential issue | 🟡 Minor

Setter allows storing sentinel value -1, causing subsequent access to raise AttributeError.

If Python code sets exc.characters_written = -1, the setter stores it without validation. The getter then treats -1 as a sentinel for "not set" and raises AttributeError. This diverges from CPython, where any explicitly set integer value (including -1) is retrievable as-is. The setter should validate that input values differ from the sentinel before storing.

🤖 Fix all issues with AI agents
In `@crates/vm/src/lib.rs`:
- Around line 70-71: The import of the ospath module in
crates/vm/src/stdlib/io.rs must be conditionally compiled to match the module
declaration (pub mod ospath) which is #[cfg(feature = "host_env")] in lib.rs;
locate the import statement that references ospath (the use/mod of ospath around
line ~5059 in stdlib/io.rs) and prefix it with #[cfg(feature = "host_env")] so
the import is only compiled when the host_env feature is enabled.

In `@crates/vm/src/stdlib/mod.rs`:
- Around line 101-104: The two cfg attributes both enable posix::module_def(ctx)
on WASI causing a duplicate; update the second attribute so it excludes WASI
(e.g. change #[cfg(all(feature = "host_env", not(any(unix, windows)))] to
exclude target_os = "wasi", such as #[cfg(all(feature = "host_env",
not(any(unix, windows, target_os = "wasi"))))]) so posix::module_def(ctx) is
only added once.

In `@crates/vm/src/stdlib/sys.rs`:
- Around line 122-132: The readline implementation currently ignores the
OptionalArg<isize> _size; update TextIOBase::readline (function readline in
sys.rs) to explicitly reject non-default size values instead of ignoring them:
before reading from stdin, check if _size is Present (i.e., Some) and if so
return an appropriate VM error (e.g., a NotImplementedError or ValueError
created via VirtualMachine like vm.new_not_implemented or vm.new_value_error)
with a clear message such as "readline with size not supported in sandbox";
otherwise proceed with the existing fd check and stdin.read_line flow.
- Around line 108-120: The write method currently swallows IO errors; change the
write implementation in the Sys* struct's #[pymethod] fn write(&self, s:
PyStrRef, vm: &VirtualMachine) -> PyResult<usize> to propagate errors from
stdout/stderr by replacing the `let _ = ...` ignores with
`.write_all(bytes).map_err(|e| vm.new_os_error(e.to_string()))?` so any write
failures become Py OSError; likewise implement a #[pymethod] fn flush(&self, vm:
&VirtualMachine) -> PyResult<()> (add the vm parameter) and call the appropriate
stream's `flush().map_err(|e| vm.new_os_error(e.to_string()))?` instead of
ignoring errors, returning Ok(()) on success.
🧹 Nitpick comments (8)
src/lib.rs (1)

176-189: Sandbox fallback design is sound but loses OS error detail.

The sandbox path correctly keeps filesystem I/O in the binary (trusted host) while the VM only sees source strings. Two minor observations:

  1. std::fs::read_to_string assumes UTF-8, which is the Python 3 default source encoding — fine for the common case.
  2. Line 187: err.to_string() discards the original io::ErrorKind/errno. Consider preserving it if the VM's new_os_error supports it, though this is low priority for a sandbox fallback.
Optional: preserve errno in sandbox error path
-            Err(err) => Err(vm.new_os_error(err.to_string())),
+            Err(err) => {
+                let msg = format!("Can't open file '{}': {}", path, err);
+                Err(vm.new_os_error(msg))
+            }
extra_tests/snippets/sandbox_smoke.py (1)

19-21: Minor: sys.stderr is not tested for writability.

stdin/stdout are tested but stderr is mentioned in the docstring (line 4) and available via SandboxStdio. Consider adding assert sys.stderr.writable() for completeness.

Optional: add stderr assertion
 assert sys.stdout.writable()
 assert sys.stdin.readable()
+assert sys.stderr.writable()
 assert sys.stdout.fileno() == 1
crates/vm/src/vm/python_run.rs (2)

149-155: Read::read may short-read even on regular files.

file.read(&mut buf)? is not guaranteed to fill the 2-byte buffer in a single call. While exceedingly rare for regular files, Read::read_exact would be the robust choice here.

Suggested fix
-        use std::io::Read;
-        if file.read(&mut buf)? != 2 {
+        use std::io::Read;
+        if file.read_exact(&mut buf).is_err() {

109-112: Commented-out deprecation on run_script.

The #[deprecated] attribute is commented out. If this is intentional for now, consider adding a brief // TODO explaining when the deprecation should be activated. Otherwise it's easy for this to be forgotten.

crates/vm/src/exceptions.rs (4)

1307-1345: build() relies on expect() for infallible paths — verify they truly can't fail.

Lines 1338, 1341, and 1343 each call .expect("new_os_error usage error"). While py_new always returns Ok(...) for the current implementation and into_ref_with_type should succeed when using the same exc_type, slot_init is more complex and could theoretically fail (e.g., on a subclass that overrides __init__). If any of these paths fail, the panic will be hard to diagnose.

Consider whether slot_init could fail for exotic subclasses and, if so, whether returning a PyResult from build would be more appropriate. If these are truly infallible, a brief comment on why would help future maintainers.


1320-1335: On non-Windows, a dummy None is injected at position 3 (winerror slot) into the args vec.

When errno is Some, the args vector is always 5 elements: [errno, strerror, filename, winerror_or_none, filename2]. On non-Windows, winerror is vm.ctx.none(). This means slot_init receives 5 args on all platforms — which works because slot_init only processes args[3] inside a #[cfg(windows)] block. However, this couples the builder's arg layout to slot_init's conditional parsing.

This is fine as-is, but a brief comment noting that the 4th position is always None on non-Windows (and ignored by slot_init) would prevent confusion.


1375-1385: libc::strerror is not guaranteed thread-safe.

On Unix, libc::strerror may return a pointer to a shared static buffer (for unknown errno values). While most modern libc implementations use static string literals for known errno values, the general case is not thread-safe. Since the result is immediately copied via to_string_lossy().into_owned(), the risk is low, but it's worth noting.

If thread-safety is a concern, strerror_r (POSIX) or strerror_l would be safer alternatives, though they come with their own portability complications.


1926-1926: Minor: Cloning the atomic ref just to check for None.

exc.filename.to_owned() clones the PyObjectRef behind the atomic ref only to check is_none. Consider using a read/deref approach if available to avoid the unnecessary clone:

-let has_filename = exc.filename.to_owned().filter(|f| !vm.is_none(f)).is_some();
+let has_filename = exc.filename.deref().is_some_and(|f| !vm.is_none(f));

(Depends on whether deref() returns a suitable reference — adjust as needed for the PyAtomicRef API.)

@ShaharNaveh ShaharNaveh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

I like the idea a lot

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/vm/src/vm/mod.rs (1)

326-394: ⚠️ Potential issue | 🔴 Critical

Fix conflicting cfg conditions for make_stdio branches.

The host_env and stdio features are independent in Cargo.toml (neither depends on the other), but the three cfg branches are not mutually exclusive. When host_env is enabled and stdio is disabled, both the first branch (#[cfg(feature = "host_env")]) and the third branch (#[cfg(not(feature = "stdio"))]) match simultaneously, causing a duplicate make_stdio binding and a compile error.

Change the third condition from #[cfg(not(feature = "stdio"))] to #[cfg(all(not(feature = "host_env"), not(feature = "stdio")))] to ensure all three branches are truly mutually exclusive.

🤖 Fix all issues with AI agents
In `@crates/vm/src/stdlib/sys.rs`:
- Around line 126-141: The readline implementation in SandboxStdio::readline
treats the size parameter as a byte index and calls line.truncate(size as
usize), which can panic on UTF‑8 boundaries for non‑ASCII input; change the
truncation to operate on character count instead of raw bytes—compute the byte
boundary for the first N characters (size) or rebuild the string with
line.chars().take(size as usize).collect(), then return that safely truncated
String while keeping the existing checks (handle size <= 0 and OptionalArg
unwrap as before).
🧹 Nitpick comments (1)
crates/vm/src/exceptions.rs (1)

1375-1385: libc::strerror is not guaranteed thread-safe on all platforms.

On some platforms, strerror returns a pointer to a shared static buffer. While glibc uses thread-local storage, POSIX doesn't guarantee thread-safety. The immediate .to_string_lossy().into_owned() conversion minimizes the risk window, but strerror_r (or Rust's std::io::Error::from_raw_os_error(errno).to_string()) would be a safer alternative.

Introduce a `host_env` feature flag that gates all host environment
access (filesystem, network, signals, processes). When disabled,
the VM operates in sandbox mode:

- _io module always available; FileIO gated by host_env
- SandboxStdio provides lightweight stdin/stdout/stderr via Rust std::io
- BytesIO/StringIO/BufferedIO/TextIOWrapper work without host_env
- open() returns UnsupportedOperation in sandbox
- stdlib modules (os, socket, signal, etc.) gated by host_env
- CI checks both host_env ON and OFF builds

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.github/workflows/ci.yaml:
- Around line 156-167: Update the CI job to build with the compiler feature and
run sandbox snippets using cargo run instead of invoking the binary directly:
add the compiler feature to the build line (the step that currently runs cargo
build --no-default-features --features
stdlib,importlib,stdio,encodings,freeze-stdlib) so it becomes consistent with
the check steps, and replace the two direct binary executions
(target/debug/rustpython extra_tests/snippets/sandbox_smoke.py and
target/debug/rustpython extra_tests/snippets/stdlib_re.py) in the "sandbox smoke
test" step with cargo run --bin rustpython -- <snippet-path> (run in debug mode)
to follow the extra_tests/snippets guideline and ensure faster, reproducible
builds.

In `@crates/vm/src/exceptions.rs`:
- Around line 1354-1398: The impl of ToOSErrorBuilder for std::io::Error
currently always calls OSErrorBuilder::with_errno(errno, msg, vm) even when
posix_errno() returns 0; change to check the errno returned by posix_errno() and
only call with_errno when errno > 0, otherwise call
OSErrorBuilder::with_subtype(py_exception_type_for_os_error, None, msg, vm)
(preserving the msg) so we don't emit OSError(errno=0); update the
to_os_error_builder function around the creation of builder (the let mut builder
= ... and subsequent #[cfg(windows)] winerror handling) to branch on errno
accordingly while keeping the winerror attachment when available.
🧹 Nitpick comments (1)
crates/vm/src/signal.rs (1)

82-83: Nit: inconsistent #[cfg] style with set_interrupt_ex below.

Lines 82–83 use two stacked #[cfg(...)] attributes, while line 103 combines them into a single #[cfg(all(...))]. Both are semantically equivalent, but using a consistent style improves readability.

♻️ Suggested consolidation
-#[cfg(unix)]
-#[cfg(feature = "host_env")]
+#[cfg(all(unix, feature = "host_env"))]
 pub(crate) fn clear_after_fork() {

Hide details View details @youknowone youknowone merged commit f777416 into RustPython:main Feb 8, 2026
13 checks passed
@youknowone youknowone deleted the sandbox branch February 8, 2026 13:58
This was referenced Apr 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants