Support pickle better with __getnewargs_ex__ by youknowone · Pull Request #6749 · RustPython/RustPython
📝 Walkthrough
Walkthrough
Refactors object pickling/reduction to add a new reduce_newobj path and helpers for extracting getnewargs/getstate/slot and item data, adds getstate/setstate to IO wrapper classes for pickling, interns getnewargs_ex in VM context, and removes a legacy Python reducelib module.
Changes
| Cohort / File(s) | Summary |
|---|---|
Object pickling & reduction crates/vm/src/builtins/object.rs |
Adds get_new_arguments, is_getstate_overridden, object_getstate, get_items_iter, reduce_newobj; refactors protocol>=2 reduction to use reduce_newobj; tighter itemsize and basicsize validation; clearer pickle error messages. |
IO wrapper pickling crates/vm/src/stdlib/io.rs |
Adds __getstate__ / __setstate__ implementations to IO wrapper classes (BytesIO, TextIOWrapper, StringIO and other wrappers), exports PyDict, and enforces state-shape validation and closed-state checks. |
VM context constants crates/vm/src/vm/context.rs |
Adds __getnewargs_ex__ field to ConstName for interned magic method name. |
Removed legacy module crates/vm/Lib/python_builtins/__reducelib.py |
Deletes the Python-side reducelib module that previously provided reduce_2/slotnames/_abstract_method_error helpers. |
Thread cleanup minor refactor crates/vm/src/stdlib/thread.rs |
Replaced map_or(false, ...) with is_some_and(...) in a retain filter; no semantic change. |
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Suggested reviewers
- fanninpm
Poem
🐰
I nibbled bytes and stored a state,
Gathered args to reconstruct fate,
IO trunks now pack and send,
Basicsize checked from end to end,
Hopping home with pickles great! 🥕
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and specifically describes the main change: adding support for getnewargs_ex to improve pickle functionality, which aligns with the primary modifications across object.rs, io.rs, and context.rs. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 81.25% which is sufficient. The required threshold is 80.00%. |
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing touches
- 📝 Generate docstrings
📜 Recent review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (11)
Lib/test/test_copy.pyis excluded by!Lib/**Lib/test/test_csv.pyis excluded by!Lib/**Lib/test/test_descr.pyis excluded by!Lib/**Lib/test/test_enum.pyis excluded by!Lib/**Lib/test/test_lzma.pyis excluded by!Lib/**Lib/test/test_memoryio.pyis excluded by!Lib/**Lib/test/test_pickle.pyis excluded by!Lib/**Lib/test/test_pickletools.pyis excluded by!Lib/**Lib/test/test_typing.pyis excluded by!Lib/**Lib/test/test_unittest/test_runner.pyis excluded by!Lib/**Lib/test/test_zlib.pyis excluded by!Lib/**
📒 Files selected for processing (5)
crates/vm/Lib/python_builtins/__reducelib.pycrates/vm/src/builtins/object.rscrates/vm/src/stdlib/io.rscrates/vm/src/stdlib/thread.rscrates/vm/src/vm/context.rs
💤 Files with no reviewable changes (1)
- crates/vm/Lib/python_builtins/__reducelib.py
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/vm/src/vm/context.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style usingcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/vm/src/stdlib/thread.rscrates/vm/src/builtins/object.rscrates/vm/src/stdlib/io.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (windows-2025)
- GitHub Check: Check Rust code with clippy
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check the WASM package and demo
🔇 Additional comments (12)
crates/vm/src/stdlib/io.rs (5)
161-162: LGTM!The
PyDictimport is correctly added to support the__setstate__implementations that need to restore the__dict__attribute.
4081-4103: LGTM!The
__getstate__implementation correctly captures the buffer content, position, and optional__dict__. The TODO comment appropriately documents that the actual newline setting will be stored once implemented.
4105-4144: LGTM! Previous feedback addressed.The
__setstate__implementation now includes the closed-file guard at line 4108, which was the concern raised in the previous review. The implementation correctly validates the tuple length, restores buffer content and position, and handles the__dict__restoration properly.
4294-4313: LGTM!The
__getstate__implementation correctly captures bytes content, position, and optional__dict__. Using a 3-tuple (without newline) is appropriate for binary I/O.
4315-4351: LGTM!The
__setstate__implementation correctly:
- Checks closed state before modification
- Uses
try_resizableto ensure no active buffer exports exist (line 4332)- Restores content and position
- Handles
__dict__restoration properlycrates/vm/src/stdlib/thread.rs (1)
518-523: LGTM! Cleaner idiomatic Rust.Using
is_some_andinstead ofmap_or(false, ...)is the preferred pattern for this check. Functionally equivalent with improved readability.crates/vm/src/builtins/object.rs (6)
564-624: LGTM - Proper implementation of new arguments extraction.The function correctly handles both
__getnewargs_ex__and__getnewargs__with appropriate type validation and error messages matching CPython's behavior.
626-644: LGTM!The identity comparison correctly detects whether
__getstate__is overridden by comparing method objects from the MRO.
646-656: LGTM!Correct delegation: uses the default implementation with the
requiredflag when not overridden, otherwise invokes the user-defined__getstate__.
658-674: LGTM!Correctly provides item iterators for list/dict subclasses as required by the pickle protocol for proper content restoration.
736-743: LGTM!Clean routing: protocol >= 2 uses the new
reduce_newobjpath while older protocols fall back to the copyreg helper.
187-239: LGTM - pickle validation logic is correct.The itemsize check (line 190) and basicsize calculation correctly implement pickle validation. The logic properly accounts for dict size (if HAS_DICT flag), weakref support detection (with separate handling for heap vs built-in types), and slot sizes before comparing against the actual class basicsize.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.