fix pyexpat#6582
Conversation
|
Warning Rate limit exceeded@youknowone has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 5 minutes and 48 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ⛔ Files ignored due to path filters (4)
📒 Files selected for processing (1)
📝 WalkthroughWalkthroughParse now accepts either string or bytes and converts to bytes before parsing. A new optional namespace_separator is stored and used by a make_name helper to produce names; intern is initialized from the provided argument. do_parse now returns Results, Parse/ParseFile surface integer status, and ParserCreate validates namespace_separator and forwards intern/namespace_separator to the parser constructor. Changes
Sequence Diagram(s)(omitted) Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Sorry, something went wrong.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
crates/stdlib/src/pyexpat.rs (1)
362-368: ParserCreate arguments are currently unused.The
encoding,namespace_separator, andinternparameters defined inParserCreateArgsare not being used in theparser_createfunction. This means XML encoding declarations and namespace handling are not currently supported. While this appears to be existing technical debt (indicated by the_argsprefix), users may expect these parameters to have an effect based on CPython's pyexpat API.Would you like me to open an issue to track implementing support for these ParserCreate parameters?
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/xmlrpc/client.pyis excluded by!Lib/**
📒 Files selected for processing (1)
crates/stdlib/src/pyexpat.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/stdlib/src/pyexpat.rs
🧠 Learnings (1)
📚 Learning: 2025-12-27T14:03:49.011Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.011Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/stdlib/src/pyexpat.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
🔇 Additional comments (2)
crates/stdlib/src/pyexpat.rs (2)
52-52: LGTM: Import additions support the new functionality.The addition of
PyBytesRefandEitherimports correctly enables the Parse method to accept both string and bytes inputs.Also applies to: 54-54
354-360: Option with #[pyarg(any, optional)] is a valid pattern in RustPython.The codebase uses both
OptionalArg<T>andOption<T>with#[pyarg(any, optional)]. WhileOptionalArgis more commonly used,Optionis also a supported and legitimate pattern, as evidenced by its use in multiple stdlib modules (codecs.rs,sys.rs,lzma.rs,openssl.rs) and elsewhere. The change is consistent with established patterns in the codebase.
Sorry, something went wrong.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/stdlib/src/pyexpat.rs (1)
114-152: Validate the intern parameter type.The
internparameter is accepted asPyObjectRefwithout type validation. If a non-dict object is passed, it could cause runtime errors when dict operations are attempted later. Consider validating thatinternis a dict type whenSomeis provided.🔎 Proposed validation
fn new( namespace_separator: Option<String>, intern: Option<PyObjectRef>, vm: &VirtualMachine, ) -> PyResult<PyExpatLikeXmlParserRef> { - let intern_dict = intern.unwrap_or_else(|| vm.ctx.new_dict().into()); + let intern_dict = match intern { + Some(obj) => { + // Verify it's a dict + if !obj.is(&vm.ctx.types.dict_type) { + return Err(vm.new_type_error("intern must be a dictionary".to_owned())); + } + obj + } + None => vm.ctx.new_dict().into(), + }; Ok(Self {
♻️ Duplicate comments (2)
crates/stdlib/src/pyexpat.rs (2)
342-362: Silently ignoring all parsing errors may hide legitimate issues.While the comment indicates this is for compatibility with existing Python code, silently ignoring all errors from
do_parse(line 360) means legitimate parsing errors are never surfaced to the caller. Consider at least logging errors or handling specific error types that should be propagated.This concern was previously raised in past review comments.
365-377: Same error handling concern as in Parse.Like the
parsemethod, this silently ignores all parsing errors (line 375). The same concerns about hiding legitimate issues apply here.
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
Lib/test/test_pyexpat.pyis excluded by!Lib/**Lib/test/test_xml_dom_xmlbuilder.pyis excluded by!Lib/**Lib/test/test_xmlrpc.pyis excluded by!Lib/**Lib/xmlrpc/client.pyis excluded by!Lib/**
📒 Files selected for processing (1)
crates/stdlib/src/pyexpat.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.rs: Follow the default rustfmt code style by runningcargo fmtto format Rust code
Always run clippy to lint code (cargo clippy) before completing tasks and fix any warnings or lints introduced by changes
Follow Rust best practices for error handling and memory management
Use the macro system (pyclass,pymodule,pyfunction, etc.) when implementing Python functionality in Rust
Files:
crates/stdlib/src/pyexpat.rs
🧠 Learnings (4)
📚 Learning: 2025-12-24T23:47:38.444Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 0
File: :0-0
Timestamp: 2025-12-24T23:47:38.444Z
Learning: CPython's xml.parsers.expat xmlparser exposes a namespace_prefixes boolean attribute (implemented in the pyexpat C module) that controls namespace prefix reporting via XML_SetReturnNSTriplet. This attribute is used internally by xml.dom.expatbuilder and xml.etree.ElementTree.
Applied to files:
crates/stdlib/src/pyexpat.rs
📚 Learning: 2025-11-10T06:27:41.954Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 6243
File: vm/src/function/buffer.rs:123-129
Timestamp: 2025-11-10T06:27:41.954Z
Learning: In vm/src/function/buffer.rs, line 36 in the try_rw_bytes_like method intentionally uses TypeError (not BufferError) for the "buffer is not a read-write bytes-like object" error case, even though a similar error message in ArgMemoryBuffer::try_from_borrowed_object uses BufferError. The TypeError is the intended exception type for this specific code path.
Applied to files:
crates/stdlib/src/pyexpat.rs
📚 Learning: 2025-12-27T14:03:49.011Z
Learnt from: CR
Repo: RustPython/RustPython PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-12-27T14:03:49.011Z
Learning: Applies to **/*.rs : Use the macro system (`pyclass`, `pymodule`, `pyfunction`, etc.) when implementing Python functionality in Rust
Applied to files:
crates/stdlib/src/pyexpat.rs
📚 Learning: 2025-12-24T23:47:38.444Z
Learnt from: youknowone
Repo: RustPython/RustPython PR: 0
File: :0-0
Timestamp: 2025-12-24T23:47:38.444Z
Learning: CPython's xml.parsers.expat xmlparser does not expose an intern attribute. String interning for XML parsing in CPython is handled through SAX's expatreader feature_string_interning mechanism, not as a direct xmlparser.intern attribute.
Applied to files:
crates/stdlib/src/pyexpat.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests on wasm-wasi
- GitHub Check: Run tests under miri
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Check Rust code with clippy
🔇 Additional comments (5)
crates/stdlib/src/pyexpat.rs (5)
52-54: LGTM!The new imports are necessary for accepting both string and bytes input in the parse method.
69-70: LGTM!The
namespace_separatorfield is properly added with the correct#[pytraverse(skip)]annotation for non-Python object data.
291-297: LGTM!The
make_namehelper correctly constructs element names with namespace prefixes when both separator and namespace are present.
299-339: LGTM!The error handling is now properly structured with
Resultreturn type, andmake_nameis consistently used for namespace-aware naming.
381-388: LGTM!The change from
OptionalArg<T>toOption<T>is appropriate here since these are parsed arguments with#[pyarg(any, optional)]. Theinterntype change toPyObjectRefis also correct since it accepts dict objects.
Sorry, something went wrong.
f91ffe3
into
RustPython:main
Dec 29, 2025
Summary by CodeRabbit
New Features
Bug Fixes / Behavior Changes
✏️ Tip: You can customize this high-level summary in your review settings.