marshal by youknowone · Pull Request #7467 · RustPython/RustPython
Unify marshal to a single CPython-compatible format. No separate "cpython_marshal" reader: one format for frozen modules, .pyc files, and the Python-level marshal module.

- ComparisonOperator: `(cmp_index << 5) | mask` matching COMPARE_OP
- MakeFunctionFlag: bit-position matching SET_FUNCTION_ATTRIBUTE
- Exception table varint: big-endian (matching Python/assemble.c)
- Linetable varint: little-endian (unchanged)
- Integer: TYPE_INT (i32) / TYPE_LONG (base-2^15 digits)
- Code objects: CPython field order (argcount, posonlyargcount, ..., co_localsplusnames, co_localspluskinds, ..., co_exceptiontable)
- FLAG_REF / TYPE_REF for object deduplication (version >= 3)
- allow_code keyword argument on dumps/loads/dump/load
- Subclass rejection (int/float/complex/tuple/list/dict/set/frozenset)
- Slice serialization (version >= 5)
- Buffer protocol fallback for memoryview/array
- Recursion depth limit (2000) for both reads and writes
- Streaming load (reads one object, seeks file position)
- TYPE_INT64, TYPE_FLOAT (text), TYPE_COMPLEX (text) for compat

serialize_code writes co_localsplusnames/co_localspluskinds from split varnames/cellvars/freevars. deserialize_code splits them back. Cell variable DEREF indices are translated between flat (wire) and cell-relative (internal) representations in both directions.

Replace bitwise trick with match for new ComparisonOperator values.

21 -> 3 expected failures. Remaining: test_bad_reader (IO layer), test_deterministic_sets (PYTHONHASHSEED), testIntern (string interning).
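To illustrate the integer item in the description, here is a small Python sketch of the TYPE_INT / TYPE_LONG split. Values that fit a signed 32-bit field are written as TYPE_INT. Larger values are written as TYPE_LONG: a signed digit count (negative for negative numbers) followed by base-2^15 digits, least significant first, each stored as a 16-bit little-endian value. `dump_int` is a hypothetical helper, not code from this PR. The layout is assumed from CPython's marshal output, which is why the sketch can be compared against `marshal.dumps(n, 2)`. Version 2 is used so that no FLAG_REF bit appears on the type byte.

```python
import marshal
import struct

TYPE_INT = b"i"    # 32-bit signed integer
TYPE_LONG = b"l"   # arbitrary-precision integer
SHIFT = 15         # TYPE_LONG digits are base 2**15


def dump_int(n: int) -> bytes:
    """Hypothetical helper: encode an int like marshal version 2 would."""
    if -(1 << 31) <= n < (1 << 31):
        # Fits in an i32: type byte followed by 4 little-endian bytes.
        return TYPE_INT + struct.pack("<i", n)

    # Split the magnitude into 15-bit digits, least significant first.
    digits = []
    magnitude = abs(n)
    while magnitude:
        digits.append(magnitude & ((1 << SHIFT) - 1))
        magnitude >>= SHIFT

    # The digit count carries the sign of the number.
    count = len(digits) if n > 0 else -len(digits)
    body = b"".join(struct.pack("<H", d) for d in digits)
    return TYPE_LONG + struct.pack("<i", count) + body


if __name__ == "__main__":
    for value in (1, -5, 2**31 - 1, 2**31, -(2**70)):
        print(value, dump_int(value) == marshal.dumps(value, 2))
```

For example, `2**31` no longer fits in an i32 and becomes three digits `[0, 0, 2]`, since 2^31 = 2 * (2^15)^2.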
- Use original localspluskinds from marshal data instead of rebuilding, preserving CO_FAST_HIDDEN and other flags
- Fix write_varint_be to handle values >= 2^30 (add 6th chunk)
- Remove unused build_localspluskinds_from_split
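The varint fix comes down to chunk arithmetic: each byte carries 6 payload bits, so five chunks cover values below 2^30 and a sixth chunk is needed beyond that. Here is a minimal Python sketch, assuming the layout used by CPython's exception table (most significant chunk first, bit 0x40 set on every byte except the last). The 0x80 entry-start marker is left out, the 32-bit upper bound is an assumption, and `read_varint_be` is a hypothetical counterpart added only for the round trip. This is not the Rust implementation from the PR.

```python
CONTINUATION_BIT = 0x40  # set on every chunk except the last
PAYLOAD_MASK = 0x3F      # 6 payload bits per byte


def write_varint_be(value: int) -> bytes:
    """Encode value as 6-bit chunks, most significant chunk first."""
    if not 0 <= value < (1 << 32):  # assumed range: unsigned 32-bit
        raise ValueError("value out of range")

    # Collect chunks least significant first, then reverse.
    chunks = [value & PAYLOAD_MASK]
    value >>= 6
    while value:
        chunks.append(value & PAYLOAD_MASK)
        value >>= 6
    chunks.reverse()

    out = bytearray(c | CONTINUATION_BIT for c in chunks[:-1])
    out.append(chunks[-1])  # last chunk has no continuation bit
    return bytes(out)


def read_varint_be(data: bytes) -> int:
    """Hypothetical inverse of write_varint_be, for round-trip checks."""
    value = 0
    for byte in data:
        value = (value << 6) | (byte & PAYLOAD_MASK)
        if not byte & CONTINUATION_BIT:
            break
    return value


if __name__ == "__main__":
    for v in (0, 63, 64, (1 << 30) - 1, 1 << 30, (1 << 32) - 1):
        encoded = write_varint_be(v)
        print(v, len(encoded), read_varint_be(encoded) == v)
```

With this layout, `(1 << 30) - 1` is the largest value that fits in five bytes, and `1 << 30` is the first to need six. A writer that stops at five chunks would silently drop the top bits of such values.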
Copilot AI pushed a commit that referenced this pull request
* CPython-compatible marshal format

* Address code review: preserve CO_FAST_HIDDEN, fix varint overflow

* Add depth guard to deserialize_value_typed

  Prevents usize underflow when the dict key deserialization path calls deserialize_value_typed with depth=0 on composite types.
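The depth guard can be illustrated with a toy reader that handles only TYPE_INT and TYPE_TUPLE items. The remaining depth budget is checked before it is decremented, so a composite value reached with a budget of 0 is rejected instead of recursing with `depth - 1`. In Rust that subtraction is where a `usize` would underflow. Python integers do not underflow, so the sketch only mirrors the check itself. `read_value` and `MAX_DEPTH` are hypothetical names, and the byte layout is assumed from CPython's marshal version 2 output.

```python
import marshal
import struct

MAX_DEPTH = 2000  # recursion limit named in the PR description


def read_value(buf: bytes, pos: int = 0, depth: int = MAX_DEPTH):
    """Toy reader: returns (value, next_pos) for ints and tuples only."""
    code = chr(buf[pos] & 0x7F)  # mask off a possible FLAG_REF bit
    pos += 1

    if code == "i":  # TYPE_INT: 4-byte little-endian signed integer
        (value,) = struct.unpack_from("<i", buf, pos)
        return value, pos + 4

    if code == "(":  # TYPE_TUPLE: i32 length followed by the items
        # Guard first: never compute depth - 1 when depth is already 0.
        if depth == 0:
            raise ValueError("recursion limit exceeded")
        (length,) = struct.unpack_from("<i", buf, pos)
        pos += 4
        items = []
        for _ in range(length):
            item, pos = read_value(buf, pos, depth - 1)
            items.append(item)
        return tuple(items), pos

    raise ValueError(f"unsupported type code {code!r}")


if __name__ == "__main__":
    data = marshal.dumps((1, (2, 3)), 2)
    print(read_value(data)[0])
```

Scalars are still readable with a budget of 0; only composite types, which need to recurse, trigger the guard.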