fix: avoid memory leak when decoding invalid nested arrays #671
9f24141 to 59fe6b8 (April 29, 2026 09:23)
Seems like Windows GHA runners are having a rough day...
Don't mind, it's a known problem. Python 3.14t on Windows arm64 is broken at the moment.
Pull request overview
Fixes a ref-leak in the C-backed unpacker when decoding invalid data that errors out after creating nested container objects, ensuring intermediate stack containers get freed instead of being lost.
Changes:
- Update `unpack_clear()` to clear all live container objects on the unpacker stack (and any pending `map_key` reference when waiting for a map value).
- Ensure `Unpacker` releases any retained unpacking stack objects during destruction by calling `unpack_clear()` in `__dealloc__`.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| msgpack/unpack_template.h | Frees all live stack frames (and pending map key refs) in unpack_clear() to prevent leaks on invalid-input error paths. |
| msgpack/_unpacker.pyx | Calls unpack_clear() during Unpacker deallocation to release any partially-decoded objects retained in the context. |
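
To make the ownership problem concrete, here is a toy Python model of a stack-based decoder. It is a sketch under stated assumptions, not msgpack's C code: `ToyUnpacker`, its `owned` counter, and the `clear_all` switch are hypothetical names invented for illustration, and the pending `map_key` case is not modelled. Each stack frame owns one list until that list is handed to its parent or to the caller, so an error path that releases only the outermost frame leaves the rest unaccounted for.

```python
class FormatError(ValueError):
    """Stand-in for the error raised on an undefined format byte."""


class ToyUnpacker:
    """Toy decoder: 0x91 opens a one-element list, 0xc1 is invalid,
    any other byte is a leaf value. Hypothetical model only."""

    def __init__(self):
        self.stack = []  # one frame per container still being built
        self.owned = 0   # containers the stack currently owns

    def clear(self, all_frames):
        # all_frames=False models releasing only the outermost frame.
        released = len(self.stack) if all_frames else min(1, len(self.stack))
        self.owned -= released
        self.stack.clear()

    def decode(self, data, clear_all=True):
        try:
            for byte in data:
                if byte == 0x91:
                    self.stack.append([])
                    self.owned += 1
                elif byte == 0xC1:
                    raise FormatError("undefined format byte 0xc1")
                else:
                    value = byte
                    # Close containers from innermost to outermost;
                    # ownership moves to the parent, then to the caller.
                    while self.stack:
                        frame = self.stack.pop()
                        frame.append(value)
                        value = frame
                        self.owned -= 1
                    return value
            raise FormatError("truncated input")
        except FormatError:
            self.clear(all_frames=clear_all)
            raise


def owned_after_error(depth, clear_all):
    """Containers still counted as owned after decoding invalid nested input."""
    unpacker = ToyUnpacker()
    try:
        unpacker.decode(b"\x91" * depth + b"\xc1", clear_all=clear_all)
    except FormatError:
        pass
    return unpacker.owned
```

In this model, `owned_after_error(depth, clear_all=False)` reports `depth - 1` containers that nothing will ever release, while `clear_all=True` brings the count back to zero, which mirrors the behaviour change summarized in the table above.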
59fe6b8 to ff1e3ee (April 29, 2026 19:29)
ff1e3ee to 1b54a1e (April 29, 2026 19:30)
Merged 4a07745 into msgpack:main on May 27, 2026
What is this PR?
This PR fixes a memory leak that was detected (accidentally) through fuzzing.
The leak happens in some cases when decoding invalid data. When decoding an array, the unpacker uses a stack and pushes a new list onto that stack for every nested array element. If it eventually reaches an element it cannot decode, it returns `-2`, which results in `FormatError` being raised. However, `unpack_clear` lacks some of the freeing logic needed on this error path: the outermost list is freed but the other ones are not, so those objects are lost forever.

The leak can be observed directly by running the following reproducer, which pushes several nested lists and then one undefined format byte. Decoding returns an error without freeing the intermediate list objects. The logic is run under `tracemalloc`, with explicit GC calls so that transient objects that would eventually be freed are not counted. The deeper the nesting, the more objects are leaked.
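
Separately from the PR's own reproducer, the measurement approach described above can be sketched as follows. This is a hypothetical helper, not code from the PR: `nested_invalid_payload` and `net_growth` are names made up here, and the payload assumes the MessagePack encoding in which `0x91` is a one-element fixarray header and `0xc1` is the never-used format byte.

```python
import gc
import tracemalloc


def nested_invalid_payload(depth):
    """Build `depth` nested one-element array headers followed by 0xc1.

    Assumption: 0x91 is a fixarray header of length 1 and 0xc1 is the
    never-used byte in the MessagePack format.
    """
    return b"\x91" * depth + b"\xc1"


def net_growth(func, iterations=1000):
    """Return bytes still traced after calling `func` repeatedly.

    gc.collect() runs before and after the loop so that objects which
    are merely awaiting collection are not counted as leaked.
    """
    gc.collect()
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            func()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

With msgpack installed, `func` could be a small wrapper that calls `msgpack.unpackb(nested_invalid_payload(100))` and catches `msgpack.FormatError`. Growth that scales with both the iteration count and the nesting depth would be consistent with the leak described here, while a roughly flat result would be expected once the intermediate lists are released.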