Fix stack overflow on deeply-nested JSON in json.loads() by changjoon-park · Pull Request #7632 · RustPython/RustPython
json.loads() on a deeply-nested array or object payload (e.g.
'[' * 50000 + ']' * 50000) overflowed the native Rust stack and
crashed the interpreter process with SIGSEGV. CPython raises
RecursionError on the same input via _Py_EnterRecursiveCall in
Modules/_json.c.
The recursion lives in the mutual call chain:
    JsonScanner::parse_object / parse_array
      -> JsonScanner::call_scan_once
        -> JsonScanner::parse_object / parse_array
Every descent funnels through call_scan_once, so wrapping its body
with vm.with_recursion covers both '{' and '[' paths (and their
mixed nesting) with a single guard.
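
To illustrate why one guard at the choke point is enough, here is a minimal toy model in Python. It is not the Rust code from this PR: `ToyScanner`, `scan`, the `max_depth` value, and the bracket-only grammar are all hypothetical, and `with_recursion` here is only a stand-in that borrows the name of the VM helper. The two container parsers never call each other directly, so counting depth in `call_scan_once` bounds every nesting pattern.

```python
from contextlib import contextmanager


class ToyScanner:
    """Toy scanner for bare brackets such as "[{}[]]" (no keys, commas, or scalars)."""

    def __init__(self, text, max_depth=64):
        self.text = text
        self.max_depth = max_depth  # hypothetical limit, not RustPython's
        self.depth = 0

    @contextmanager
    def with_recursion(self, what):
        # Hypothetical stand-in for the guard: refuse to descend past the limit.
        if self.depth >= self.max_depth:
            raise RecursionError(f"maximum recursion depth exceeded {what}")
        self.depth += 1
        try:
            yield
        finally:
            self.depth -= 1  # always unwind, even when an error propagates

    def call_scan_once(self, idx):
        # Single choke point: every nested value is entered through here.
        with self.with_recursion("while decoding a JSON object from a string"):
            if idx >= len(self.text):
                raise ValueError("unexpected end of input")
            ch = self.text[idx]
            if ch == "[":
                return self.parse_array(idx + 1)
            if ch == "{":
                return self.parse_object(idx + 1)
            raise ValueError(f"unexpected character {ch!r} at index {idx}")

    def parse_array(self, idx):
        return self._parse_children(idx, "]")

    def parse_object(self, idx):
        return self._parse_children(idx, "}")

    def _parse_children(self, idx, closer):
        # Both containers are modelled as plain lists of child containers.
        children = []
        while idx < len(self.text) and self.text[idx] != closer:
            child, idx = self.call_scan_once(idx)
            children.append(child)
        if idx >= len(self.text):
            raise ValueError(f"missing {closer!r}")
        return children, idx + 1


def scan(text, max_depth=64):
    """Parse `text` and return the nested-list structure."""
    value, _end = ToyScanner(text, max_depth).call_scan_once(0)
    return value
```

In this model, `scan("[" * 64 + "]" * 64)` succeeds while one more level raises RecursionError, and the same holds for `{` nesting and for alternating `[{` nesting, because each level costs exactly one pass through `call_scan_once`.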
Before:
    ./rustpython -c "import json; json.loads('[' * 50000 + ']' * 50000)"
    -> SIGSEGV (exit 139)
After (same command):
    -> RecursionError: maximum recursion depth exceeded while
       decoding a JSON object from a string
Verified:
- extra_tests/snippets/stdlib_json.py: all assertions pass
(includes 3 new regression cases: array, object, alternating
nesting at depth 100000)
- cargo run -- -m test test_json: 214 passed, 0 regressed
(9 skipped, 13 expected failures, all pre-existing)
- depth 500000 no longer crashes (RecursionError)
- shallow parsing unchanged
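
For reference, the three regression shapes named above (array, object, alternating nesting) could be exercised roughly as follows. This is a sketch, not the actual contents of extra_tests/snippets/stdlib_json.py: the helper names `raises_recursion_error` and `nested_payloads` and the exact payload strings are assumptions made for illustration.

```python
import json


def raises_recursion_error(payload, loads=json.loads):
    """Return True if decoding `payload` raises RecursionError, False if it decodes."""
    try:
        loads(payload)
    except RecursionError:
        return True
    return False


def nested_payloads(depth):
    """Build one payload per nesting shape; each is valid JSON at any depth >= 1."""
    return {
        "array": "[" * depth + "]" * depth,
        "object": '{"a":' * depth + "1" + "}" * depth,
        "alternating": '[{"a":' * depth + "1" + "}]" * depth,
    }


if __name__ == "__main__":
    # Depth taken from the verification notes; the result depends on the interpreter.
    for name, payload in nested_payloads(100_000).items():
        print(name, raises_recursion_error(payload))
```

With the guard in place, each shape is expected to report a RecursionError instead of killing the process, while shallow payloads built by the same helper still decode normally.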
Per @ShaharNaveh's review on RustPython#7632: this test was previously
marked `@unittest.skip("TODO: RUSTPYTHON; crashes")` because json.loads
would SIGSEGV on the 500_000-deep input. The recursion-guard added in
this PR makes it raise RecursionError like CPython, so the skip
decorator can be removed.

$ cargo run -- -m unittest \
    test.test_json.test_recursion.TestCRecursion.test_highly_nested_objects_decoding \
    test.test_json.test_recursion.TestPyRecursion.test_highly_nested_objects_decoding
...
Ran 2 tests in 0.825s
OK

$ cargo run -- -m test test_json
Ran 214 tests (7 skipped, 13 expected failures), all pass.