◐ Shell
reader mode source ↗
Skip to content

Py new#6402

Merged
youknowone merged 2 commits into
RustPython:mainfrom
youknowone:py-new
Dec 11, 2025
Merged

Py new#6402
youknowone merged 2 commits into
RustPython:mainfrom
youknowone:py-new

Conversation

@youknowone

@youknowone youknowone commented Dec 11, 2025

Copy link
Copy Markdown
Member

Summary by CodeRabbit

  • Performance
    • Enhanced constructor implementation for built-in bytes, frozenset, string, and tuple types to improve object creation performance. Optimized code paths now handle exact type instances with direct returns, efficient singleton handling for empty collections, and streamlined construction flows that reduce overhead in common usage scenarios.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai

coderabbitai Bot commented Dec 11, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

This PR refactors constructor implementations for built-in Python types (bytes, frozenset, str, tuple) in RustPython, changing how constructor arguments are represented and processed. It moves construction logic from shared helpers into type-specific optimized paths, removes the ByteInnerNewOptions.get_bytes() method, and updates the bytes protocol to use a new FuncArgs-based approach.

Changes

Cohort / File(s) Summary
Constructor argument type refactoring
crates/vm/src/builtins/bytes.rs, crates/vm/src/builtins/set.rs, crates/vm/src/builtins/tuple.rs
Changed type Args from wrapper types (ByteInnerNewOptions, OptionalArg<PyObjectRef>) to direct container types (Vec<u8>, Vec<PyObjectRef>). Added early-return optimizations in slot_new for exact types (empty singletons, direct returns for existing objects). Updated py_new signatures to accept the new Args types and construct payloads directly.
String constructor optimization
crates/vm/src/builtins/str.rs
Introduced early-return optimization for exact str type with single str argument. Refactored slot_new to delegate payload creation to new internal py_new helper, separating constructor logic from type-wrapping logic.
Shared method removal
crates/vm/src/bytes_inner.rs
Removed public get_bytes() method from ByteInnerNewOptions impl; construction logic now dispersed into type-specific paths. Incidental import cleanup.
Protocol bytes construction update
crates/vm/src/protocol/object.rs
Replaced ByteInnerNewOptions-based bytes construction with new FuncArgs-based Constructor::slot_new path. Updated imports to use FuncArgs and Constructor.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Multiple interconnected files with public API changes: Constructor type Args changes in bytes, frozenset, and tuple affect both signatures and internal logic across multiple files.
  • Optimization paths require validation: Early-return logic for empty singletons, exact type checks, and __bytes__/object-identity short-circuits must be verified for correctness.
  • Removal of shared helper: The deleted get_bytes() method shifts responsibility to individual constructors; ensure no other code paths depend on it.
  • Pattern consistency: Similar changes applied across multiple types (repetition reduces complexity somewhat), but each type has unique logic requiring individual review.

Specific areas requiring extra attention:

  • Validation of singleton and exact-type short-circuit logic in slot_new methods (bytes, frozenset, tuple)
  • Correctness of py_new implementations accepting Vec<u8> and Vec<PyObjectRef> and properly constructing payloads
  • Verification that the new FuncArgs-based bytes() flow in protocol/object.rs maintains parity with the old ByteInnerNewOptions path
  • Confirmation that removal of get_bytes() does not break any call sites outside this diff

Possibly related PRs

Suggested reviewers

  • ShaharNaveh
  • coolreader18

Poem

🐰 Hoppy constructor paths now shine so bright,
With Vec and singletons taking flight,
Exact types zoom past, no detours in sight,
While shared get_bytes fades into the night.
RustPython's types dance lean and light!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Title check ❓ Inconclusive The title 'Py new' is too vague and generic; it does not convey meaningful information about the substantive changes across multiple builtins (bytes, frozenset, str, tuple) and their constructor signatures. Use a more descriptive title that captures the main objective, such as 'Refactor constructor argument types for builtin types' or 'Introduce py_new optimization paths for bytes, frozenset, str, and tuple'.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@youknowone youknowone marked this pull request as ready for review December 11, 2025 14:09

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f496c9 and c5abc44.

📒 Files selected for processing (6)
  • crates/vm/src/builtins/bytes.rs (1 hunks)
  • crates/vm/src/builtins/set.rs (1 hunks)
  • crates/vm/src/builtins/str.rs (1 hunks)
  • crates/vm/src/builtins/tuple.rs (1 hunks)
  • crates/vm/src/bytes_inner.rs (1 hunks)
  • crates/vm/src/protocol/object.rs (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.rs: Follow the default rustfmt code style by running cargo fmt to format Rust code
Always run clippy to lint Rust code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass, pymodule, pyfunction, etc.) when implementing Python functionality in Rust

Files:

  • crates/vm/src/builtins/set.rs
  • crates/vm/src/builtins/tuple.rs
  • crates/vm/src/builtins/bytes.rs
  • crates/vm/src/builtins/str.rs
  • crates/vm/src/bytes_inner.rs
  • crates/vm/src/protocol/object.rs
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to Lib/**/*.py : Add a `# TODO: RUSTPYTHON` comment when modifications are made to files in the `Lib/` directory
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In the RustPython codebase, only certain builtin types should be marked with the SEQUENCE flag for pattern matching. List and tuple are sequences, but bytes, bytearray, and range are not considered sequences in this context, even though they may implement sequence-like protocols.
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6110
File: vm/src/frame.rs:1311-1316
Timestamp: 2025-08-26T05:20:54.540Z
Learning: In RustPython's pattern matching implementation, only certain builtin types should have the SEQUENCE flag: list and tuple are confirmed sequences. The user youknowone indicated that bytes, bytearray are not considered sequences in this context, even though they implement sequence-like protocols.
📚 Learning: 2025-11-29T12:17:28.606Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-29T12:17:28.606Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust

Applied to files:

  • crates/vm/src/protocol/object.rs
🧬 Code graph analysis (4)
crates/vm/src/builtins/set.rs (5)
crates/vm/src/builtins/bytes.rs (2)
  • slot_new (97-149)
  • py_new (151-153)
crates/vm/src/builtins/tuple.rs (3)
  • slot_new (115-146)
  • input (122-122)
  • py_new (148-152)
crates/vm/src/builtins/complex.rs (1)
  • slot_new (154-167)
crates/vm/src/builtins/float.rs (1)
  • slot_new (134-147)
crates/vm/src/builtins/list.rs (1)
  • from_iter (47-49)
crates/vm/src/builtins/bytes.rs (3)
crates/vm/src/builtins/str.rs (5)
  • slot_new (353-366)
  • py_new (368-386)
  • elements (2118-2120)
  • elements (2229-2231)
  • elements (2351-2353)
crates/vm/src/builtins/tuple.rs (2)
  • slot_new (115-146)
  • py_new (148-152)
crates/vm/src/builtins/float.rs (1)
  • slot_new (134-147)
crates/vm/src/builtins/str.rs (1)
crates/vm/src/stdlib/typing.rs (1)
  • args (157-159)
crates/vm/src/protocol/object.rs (2)
crates/vm/src/builtins/bytes.rs (4)
  • from (44-48)
  • from (52-54)
  • obj (114-114)
  • slot_new (97-149)
crates/vm/src/builtins/object.rs (1)
  • slot_new (35-111)
🔇 Additional comments (9)
crates/vm/src/bytes_inner.rs (1)

3-7: LGTM - Import cleanup aligns with removed functionality.

The import changes (removing identifier, PyTypeRef, and simplifying pystr import) are consistent with the removal of the get_bytes() method that previously depended on these. The remaining imports support the get_bytearray_inner() method which is still used by the bytes constructor.

crates/vm/src/builtins/str.rs (2)

353-366: LGTM - Optimization for exact str type is correct.

The early return optimization correctly:

  1. Checks for exact str type (not subclasses)
  2. Ensures single positional argument with no kwargs
  3. Verifies the argument is already an exact str

This matches CPython's behavior and avoids unnecessary string copying.


368-386: LGTM - Clean separation of payload creation logic.

The py_new helper correctly handles all construction cases:

  • Decoding with encoding parameter
  • Converting via str() without encoding
  • Empty string for missing input

The separation from slot_new improves code clarity and follows the pattern established in other builtins.

crates/vm/src/builtins/set.rs (1)

921-958: LGTM - Constructor refactor follows the established pattern.

The implementation correctly:

  1. Returns exact frozenset input as-is (line 929-933)
  2. Returns empty frozenset singleton for missing iterable (line 936-938)
  3. Returns empty frozenset singleton for empty result (line 947-950)
  4. Delegates payload creation to py_new which uses from_iter

The type Args = Vec<PyObjectRef> declaration aligns with py_new's signature, while slot_new handles the conversion from FuncArgs to the expected elements vector.

crates/vm/src/protocol/object.rs (2)

7-16: LGTM - Import updates align with new construction pathway.

The import changes correctly add:

  • FuncArgs for constructing function arguments
  • Constructor trait for invoking the slot_new pathway

These support the refactored bytes() method below.


34-43: LGTM - Cleaner construction pathway via Constructor trait.

The refactored bytes() method correctly:

  1. Maintains the PyInt error handling (bytes shouldn't accept int directly via this path)
  2. Creates FuncArgs from the single object argument
  3. Delegates to PyBytes::slot_new which handles all construction logic including __bytes__ method invocation and optimizations

This centralizes the bytes construction logic and eliminates the now-removed get_bytes() pathway.

crates/vm/src/builtins/tuple.rs (1)

112-152: LGTM - Constructor follows the unified pattern across builtins.

The implementation correctly handles all optimization cases:

  1. Returns exact input tuple as-is (lines 121-125)
  2. Returns empty tuple singleton for missing iterable (lines 128-130)
  3. Returns empty tuple singleton for empty result (lines 139-142)
  4. Properly separates slot binding (slot_new) from payload creation (py_new)

The type Args = Vec<PyObjectRef> with py_new accepting elements: Self::Args provides a clean interface while slot_new handles the FuncArgs to Vec conversion.

crates/vm/src/builtins/bytes.rs (2)

94-118: Constructor optimizations look correct and consistent with other builtin types.

The optimization pattern correctly mirrors other builtin constructors (tuple, str, float):

  • Empty singleton returned for exact type with no arguments
  • Exact input bytes returned as-is when no encoding/errors specified
  • Use of downcast_exact properly ensures only exact PyBytes instances bypass construction

139-153: Fallback path and simplified py_new look good.

The structure cleanly separates concerns:

  • slot_new handles all optimization logic and protocol dispatch
  • py_new is reduced to simple construction from Vec<u8>
  • Empty singleton optimization in the fallback path matches the pattern in tuple.rs

Hide details View details @youknowone youknowone merged commit a477835 into RustPython:main Dec 11, 2025
32 of 35 checks passed
@youknowone youknowone deleted the py-new branch December 11, 2025 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant