Add more unicode functions to c-api by bschoenmaeckers · Pull Request #8044 · RustPython/RustPython
No actionable comments were generated in the recent review. 🎉
📒 Files selected for processing (1)
crates/capi/src/unicodeobject.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/capi/src/unicodeobject.rs
📝 Walkthrough
This PR extends the RustPython C-API layer by adding four new Unicode FFI functions that expose UTF-8 encoding, filesystem-default codec operations, and bytes-to-string decoding. Import statements are consolidated, and disabled test cases cover the new functionality for interning, UTF-8 wrapping, bytes decoding, and filesystem codec round-trips.
Changes
Unicode C-API FFI Extensions
| Layer / File(s) | Summary |
|---|---|
| Unicode FFI functions and imports: crates/capi/src/unicodeobject.rs | Four new public C-API functions added: PyUnicode_AsUTF8String encodes to UTF-8, PyUnicode_DecodeFSDefaultAndSize and PyUnicode_EncodeFSDefault handle filesystem-default codec operations, and PyUnicode_FromEncodedObject decodes from bytes-like objects. All functions handle null inputs, validate parameters, downcast to PyStr where needed, and delegate encoding/decoding to the codec registry. Imports are consolidated at the file top. |
| Unicode encoding and decoding tests: crates/capi/src/unicodeobject.rs | Test suite (currently disabled) validates string interning, UTF-8 encoding via the wrapper, decoding from encoded bytes objects, and round-trip filesystem-default encoding/decoding on Unix for non-UTF-8 and UTF-8 filenames. |
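
For readers unfamiliar with why a non-UTF-8 filename can round-trip at all: in CPython on Unix, the filesystem-default codec is typically UTF-8 with the `surrogateescape` error handler (PEP 383), which maps each undecodable byte `b` to the lone surrogate `U+DC00 + b` and reverses that mapping on encode. The following is a minimal standalone sketch of that scheme only. It is not the code from this PR, the function names `decode_surrogateescape` and `encode_surrogateescape` are hypothetical, and it assumes (without confirmation from the summary above) that the RustPython codec follows CPython's behaviour here.

```rust
/// Decode bytes as UTF-8, mapping each undecodable byte `b` to the lone
/// surrogate U+DC00 + b (the "surrogateescape" scheme from PEP 383).
/// Code points are returned as `u32` because a Rust `String` cannot hold
/// lone surrogates. Hypothetical helper, for illustration only.
fn decode_surrogateescape(mut bytes: &[u8]) -> Vec<u32> {
    let mut out = Vec::new();
    loop {
        match std::str::from_utf8(bytes) {
            Ok(s) => {
                out.extend(s.chars().map(|c| c as u32));
                return out;
            }
            Err(e) => {
                let (valid, rest) = bytes.split_at(e.valid_up_to());
                // `valid` is well-formed UTF-8 by definition of `valid_up_to`.
                out.extend(std::str::from_utf8(valid).unwrap().chars().map(|c| c as u32));
                // Length of the invalid run; `None` means the input ended mid-sequence.
                let bad = e.error_len().unwrap_or(rest.len());
                out.extend(rest[..bad].iter().map(|&b| 0xDC00 + b as u32));
                bytes = &rest[bad..];
            }
        }
    }
}

/// Inverse of `decode_surrogateescape`: escaped surrogates become raw bytes,
/// everything else is encoded as UTF-8. Returns `None` for code points that
/// are neither valid scalar values nor escaped bytes (e.g. U+D800).
fn encode_surrogateescape(code_points: &[u32]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for &cp in code_points {
        if (0xDC80..=0xDCFF).contains(&cp) {
            out.push((cp - 0xDC00) as u8);
        } else {
            let c = char::from_u32(cp)?;
            let mut buf = [0u8; 4];
            out.extend_from_slice(c.encode_utf8(&mut buf).as_bytes());
        }
    }
    Some(out)
}
```

With these helpers, the Latin-1 filename bytes `b"caf\xe9.txt"` decode to a sequence whose fourth code point is `0xDCE9`, and encoding that sequence yields the original bytes again, which is the property a filesystem round-trip test would check.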
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
- RustPython/RustPython#7904: Modifies the same Unicode FFI module to add PyUnicode_AsEncodedString and other C-API encoding functions with similar null-handling and codec-registry delegation patterns.
Suggested reviewers
- youknowone
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Add more unicode functions to c-api' accurately describes the main change: adding new Unicode-related FFI entrypoints (PyUnicode_AsUTF8String, PyUnicode_DecodeFSDefaultAndSize, PyUnicode_EncodeFSDefault, PyUnicode_FromEncodedObject) to the C-API module. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |