gh-149816: Fix UAF in Modules/_pickle.c by alexkats · Pull Request #150024 · python/cpython
Get a strong reference to the list item atomically, instead of using two separate operations (borrowed read, then incref).
Original description of the problem from 91.md:
Vulnerability #91
Title: Racy list item borrow causes UAF
Category: Memory Safety Violations
Tags: write,race,env,dos
CWEs: CWE-416, CWE-367
CVSS: CVSS:4.0/AV:L/AC:L/AT:P/PR:L/UI:N/VC:L/VI:L/VA:H/SC:N/SI:N/SA:N
Severity: Medium (5.8)
Location: cpython/Modules/_pickle.c:3213-3214 in function batch_list_exact
Description
batch_list_exact reads list elements using PyList_GET_ITEM and only then increments the reference count (Py_INCREF) without holding the list lock or using a safe strong-ref getter (cpython/Modules/_pickle.c:3213-3214, also cpython/Modules/_pickle.c:3197-3198). In free-threaded mode, _pickle runs without the GIL (cpython/Modules/_pickle.c:8242), so concurrent list mutation is possible while pickling. A mutator thread can remove/replace the same element and decref it to zero (cpython/Objects/listobject.c:1145-1156) between size check/access in batch_list_exact (cpython/Modules/_pickle.c:3212-3214), leading to Py_INCREF on freed memory (use-after-free write).
Trigger Conditions
Pre-conditions:
- CPython is built/run in free-threaded mode (no global interpreter lock).
- The C accelerator _pickle is used (it declares no-GIL operation at cpython/Modules/_pickle.c:8242).
- Target object is an exact list and protocol is greater than 0, so save_list uses batch_list_exact (cpython/Modules/_pickle.c:3266-3269).
- The same list object is shared across threads.
Data flow:
- Thread A calls pickle.dumps(shared_list, protocol=5); save_list dispatches to batch_list_exact (cpython/Modules/_pickle.c:3266-3269).
- In batch_list_exact, Thread A evaluates loop/size and then fetches a borrowed element via PyList_GET_ITEM (cpython/Modules/_pickle.c:3212-3213).
- Concurrently, Thread B mutates that list index (e.g., del shared_list[i] or shared_list[i] = other), and list code decrefs the old element (cpython/Objects/listobject.c:1145-1156), potentially freeing it.
- Thread A performs Py_INCREF(item) on the stale pointer (cpython/Modules/_pickle.c:3214), triggering UAF memory corruption/crash.
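The mutation step in the data flow can be illustrated from Python. The following is a minimal single-threaded sketch, assuming CPython's reference-counting behavior: it shows that replacing a list slot releases the list's reference to the old element, and that the element is freed immediately when that was the last reference. The Item class and the replaced_item_is_freed helper are hypothetical names introduced only for this illustration; the sketch does not reproduce the race itself.

```python
import weakref


class Item:
    """Plain object that supports weak references (hypothetical helper)."""


def replaced_item_is_freed() -> bool:
    """Return True if replacing a list slot frees the old element at once."""
    shared_list = [Item()]
    # A weak reference lets us observe the element's lifetime without
    # keeping it alive ourselves.
    probe = weakref.ref(shared_list[0])
    # Same kind of mutation as Thread B in the data flow: the list drops
    # its reference to the old element, which was the only strong one.
    shared_list[0] = Item()
    return probe() is None


if __name__ == "__main__":
    print("old element freed on replacement:", replaced_item_is_freed())
```

A borrowed pointer read by PyList_GET_ITEM just before such a replacement would refer to exactly this freed object, which is why the subsequent Py_INCREF is unsafe.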
Impact
This can cause native memory corruption in the interpreter (use-after-free with refcount write), typically leading to process crash (denial of service) and potentially enabling arbitrary code execution in the worst case under favorable heap/layout conditions. Exploitability is constrained by requiring a free-threaded build and a concurrent mutation race on the same list object, but the code path is reachable from normal Python APIs (pickle.dumps) and does not enforce safety against such concurrent access.
Remediation
- In batch_list_exact() (cpython/Modules/_pickle.c), replace both unsafe borrowed-ref reads (PyList_GET_ITEM + Py_INCREF) with a strong-reference API (PyList_GetItemRef / internal equivalent) so element lifetime is acquired atomically for free-threaded builds.
- Remove dependence on repeatedly reading live list size inside the loop; instead iterate against a stable expected length captured in save_list() and passed into batch_list_exact().
- Add explicit mutation detection/error handling in the exact-list fast path (similar to dict/set batching): if size/index validity changes during traversal, raise a deterministic RuntimeError (e.g., “list changed size during iteration”) instead of continuing.
- Add/extend free-threaded regression tests for concurrent pickle.dumps() + list mutation to ensure no UAF/crash, and verify behavior is clean exception-only under race.
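The control flow proposed in the second and third remediation items can be modelled in pure Python. This is only a sketch of the idea (stable length captured once, strong reference per item, deterministic error on size change), not the C change made in this PR; iter_batches and its batch_size parameter are hypothetical names for illustration.

```python
from typing import Any, Iterator, List


def iter_batches(items: List[Any], batch_size: int = 1000) -> Iterator[List[Any]]:
    """Yield batches of items, failing cleanly if the list changes size.

    Pure-Python model of the proposed logic; the real code would live in C.
    """
    expected = len(items)  # stable length captured once, before the loop
    start = 0
    while start < expected:
        stop = min(start + batch_size, expected)
        batch = []
        for index in range(start, stop):
            try:
                # Python-level indexing returns a strong reference, the
                # analogue of a strong-ref getter in C.
                item = items[index]
            except IndexError:
                raise RuntimeError("list changed size during iteration") from None
            if len(items) != expected:
                raise RuntimeError("list changed size during iteration")
            batch.append(item)
        yield batch
        start = stop


if __name__ == "__main__":
    for batch in iter_batches(list(range(5)), batch_size=2):
        print(batch)
```

The size check here only detects length changes; a same-size replacement is tolerated because each item is already held by a strong reference when it is used.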
Reproduction
- Build a free-threaded CPython (the no-GIL configuration) with debug aids enabled if possible (ASAN/UBSAN build, or at least --with-pydebug).
- In that interpreter, create a test scenario with a shared exact list used by two concurrent threads:
  - one thread repeatedly pickles the same list with protocol > 0 (so _pickle takes the fast exact-list path),
  - the other thread continuously mutates that same list (delete/replace items, especially around low indexes).
- Make the list elements objects with short lifetimes (frequent replacement with fresh temporary objects) and run both threads in tight loops for many iterations to maximize race frequency.
- Repeat the run a few times if it does not trigger immediately; race timing is non-deterministic. Increasing core count, reducing sleeps, and extending runtime should increase hit rate.
- Confirm the issue by observing any of the following:
  - interpreter crash/segfault during pickling,
  - debug build fatal error related to refcount/object validity,
  - sanitizer report showing a use-after-free or invalid memory access with frames near _pickle list batching / Py_INCREF on a list item.
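The reproduction steps above might be sketched as the following stress harness. It is an illustrative assumption of what such a script could look like, not the reproducer used for the report: the stress function, its parameters, and the choice of fresh strings as short-lived elements are all hypothetical. It only replaces items (the list size stays constant); deletes can be added to widen the race. On a GIL-enabled build, or a build with the fix, it is expected to finish normally; on an affected free-threaded build it may crash, which is the signal being looked for.

```python
import pickle
import sys
import threading
import time


def stress(duration: float = 1.0, size: int = 64) -> int:
    """Pickle a shared list while another thread replaces its items.

    Returns the number of pickle.dumps() calls that completed.
    """
    shared = [f"item-{i}" for i in range(size)]
    stop = threading.Event()
    completed = 0

    def pickler() -> None:
        nonlocal completed
        while True:
            # protocol > 0 on an exact list takes the fast exact-list path
            pickle.dumps(shared, protocol=5)
            completed += 1
            if stop.is_set():
                break

    def mutator() -> None:
        counter = 0
        while not stop.is_set():
            # Replace low indexes with fresh temporaries so the old
            # elements lose their last reference right away.
            shared[counter % 4] = f"fresh-{counter}"
            counter += 1

    threads = [threading.Thread(target=pickler), threading.Thread(target=mutator)]
    for thread in threads:
        thread.start()
    time.sleep(duration)
    stop.set()
    for thread in threads:
        thread.join()
    return completed


if __name__ == "__main__":
    # sys._is_gil_enabled() exists on Python 3.13+; assume the GIL otherwise.
    gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
    print(f"GIL enabled: {gil_enabled}")
    print(f"completed dumps: {stress(duration=5.0)}")
```

Running it several times with a longer duration follows the advice above about non-deterministic race timing.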
Code Context
item = PyList_GET_ITEM(obj, total);
Py_INCREF(item);