GH-91048: Add utils for capturing async call stack for asyncio programs and enable profiling#124640
GH-91048: Add utils for capturing async call stack for asyncio programs and enable profiling#124640ambv merged 88 commits into
Conversation
|
I've implemented proof of concept out of process async stack reconstruction here: 1st1@7185f8d |
Sorry, something went wrong.
760ad58 to
4f7bf44
Compare
September 30, 2024 11:28
I haven't measured, but I don't expect the overhead to be detectable at all. In the most common scenario the tracking is just assigning / resetting a pointer in Task/Future C structs. |
Sorry, something went wrong.
I ran some async benchmarks from pyperformance and a small echo tcp server and the Perf impact is below the noise. |
Sorry, something went wrong.
`f_generator` returns the generator / coroutine / async generator object that owns the frame. For all other kinds of frames it will return `None`. This is useful to reconstruct call stack for async/await code.
|
There have been some changes in asyncio which are relevant for this:
I haven't followed on this PR so not sure how many of those points matter here, but still I wrote it incase anyone else missed them. |
Sorry, something went wrong.
Then we should find some solution that doesn't break the external introspection in the future ;) |
Sorry, something went wrong.
This is slightly tricky and if you recall we removed the locks we added at your request. This PR is already unwieldy complicated so I would prefer if we work together in a separate PR to ensure the lock safety is addressed as you would like it |
Sorry, something went wrong.
Yes, I do remember but it was long time ago 2-3 months, I have made significant changes related to free-threading since that time. It's fine by me to work on free-threading later and merge this first "as-is". |
Sorry, something went wrong.
|
Thank you, Kumar. I landed this feature. We've received a ton of review here, definitely more than most changes. We addressed all feedback on design, correctness, and performance. The long-lived branch is hard to keep alive for longer. Thank you to everyone involved, let's evolve this forward from the main branch. We'll help with #128869 and future free-threading compatibility to make sure external introspection keeps working. As part of the free-threading compatibility of this particular feature, we'll address this in a subsequent PR soon, where we will also add support for eager task external introspection. The new tests added by Kumar in January show some interesting behavior of eager tasks, I'm looking into that. Will provide more information on the subsequent PR. |
Sorry, something went wrong.
…r tasks This was missing from pythongh-124640. It's already covered by the new test_asyncio/test_free_threading.py in combination with the runtime assertion in set_ts_asyncio_running_task.
|
Not sure if this is directly related, but we're seeing build bot failures now: https://buildbot.python.org/#/builders/125/builds/6894 |
Sorry, something went wrong.
On it |
Sorry, something went wrong.
|
@pablogsal @ambv, this broke several tier-1 buildbots for more than a day so, by the book, it should be reverted. But, I also know there are issues with a buildbot update currently, and reverting & reapplying would be a lot churn in this case. I'll leave it to you to handle this one. FWIW, each working day I check there are no other buildbot failures, but I'll be offline during the weekend. |
Sorry, something went wrong.
This PR introduces low overhead utils to enable async stack reconstruction for running and suspended tasks and futures. The runtime overhead this PR adds is making tasks set & reset a reference to future/tasks they await on. Control flow primitives like
asyncio.TaskGroup,asyncio.gather(),asyncio.shield()are also updated to perform this simple bookkeeping.Remaining to-do for @pablogsal, @ambv, and myself
swap_current_taskinasynciomodule.cneeds to be updated to take care of the newts->asyncio_running_tasktest_running_loop_within_a_loopseg faults if the test is ran twice (I think), a repro is running./python.exe -m test test_asyncio -R3:3-R3:3reports refleaks, might be nothing, but needs to be checked(OK, definitely looks like there's a leak "test_asyncio.test_unix_events leaked [492, 492, 492] references, sum=1476", repro./python.exe -m test test_asyncio.test_subprocess test_asyncio.test_taskgroups -R3:3)gather()New APIs:
asyncio.capture_call_graph(*, future=None, depth=1)to capture the call stack for the current task or for the specified task or futureasyncio.print_call_graph(*, future=None, file=None, depth=1)to print the call stack for the current task or for the specified task or futureasyncio.future_add_to_awaited_by()andasyncio.future_discard_from_awaited_by()to enable "stitching" tasks and futures that are awaiting other tasks and futures.frame.f_generatora getter to return the generator object associated with the frame orNone. The implementation is maximally straightforward and does not require any new fields in any of the internal interpreter structs.New C APIs:
Coroutine and generator structs gain a new pointercr_task. It is a borrowed pointer to the task that runs the coroutine. This is meant for external profilers to quickly find the associated task (an alternative to this would be a rather costly traversal of interpreter state to find the module state of asyncio, requiring to read many Pythondictstructs).A "private" C API functionvoid _PyCoro_SetTask(PyObject *coro, PyObject *task)to set thecr_taskfield from_asynciomodule.c.We have a complete functional test for out of process introspection in
test_external_inspection.py(sampling profilers and debuggers will follow the same approach).Example
This example program:
will print:
📚 Documentation preview 📚: https://cpython-previews--124640.org.readthedocs.build/