bpo-44525: Specialize CALL_FUNCTION for C function calls#26934
Conversation
|
Initially I also planned to specialize things like |
Sorry, something went wrong.
|
🤖 New build scheduled with the buildbot fleet by @Fidget-Spinner for commit 0e0a3a4 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
Sorry, something went wrong.
pablogsal
left a comment
There was a problem hiding this comment.
I think this is unfortunate far too complex for the benefit this is giving. I think at the very least this needs some macro benchmarks but if this already is just 20% on a micro benchmark it doesn't look very promising, unfortunately
Sorry, something went wrong.
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Sorry, something went wrong.
|
Oh, I just saw the macro benchmarks in https://gist.github.com/Fidget-Spinner/b2f05e0f6b8851d41e09412f5189ce8f. Indeed, it shows almost no improvement. I propose to close this because this is far too complex for no global benefit, and even the micro benchmark is too small in my humble opinion. Maybe other core Devs have another view on this, thought. |
Sorry, something went wrong.
Sorry, something went wrong.
|
I agree with @pablogsal that as is, it is too much of complexity with no real gains. Though if this is most of an infrastructure PR, then I think the optimizations (6 of them?) @Fidget-Spinner spoke of should also be implemented in this as well, and if they show promising returns on macro benchmarks we could re-evaluate the complexity/performance pair. |
Sorry, something went wrong.
|
Notice that this also has a memory cost for caching the pointers, which should also be taken into account when evaluating the benefits. |
Sorry, something went wrong.
|
@pablogsal and @isidentical your concerns are valid and shared by me. I was worried about the code maintenance while writing this PR too. Roughly 1/4th of it is boilerplate infrastructure for adaptive instructions. The entire diff by I'm hesitant to implement the other specializations too as they require a lot more time (and complexity!). So I sent this PR first to get a sanity check on what I'm doing. Ultimately, I'll let the core devs decide on this and whether y'all find it useful. I don't mind too much if this gets rejected -- I already had a lot of fun writing this :). But I'm interested to hear what the others have to say. |
Sorry, something went wrong.
|
Oh I forgot to mention that the pyperformance numbers are out of date. They were taken when I only optimized for In the meantime, I'll explore the other specializations and see if there's a difference. Thanks everyone! |
Sorry, something went wrong.
markshannon
left a comment
There was a problem hiding this comment.
Specialized instructions need to be quite specialized 🙂
Looking forward to seeing benchmark results once the specialization is more refined and #26677 is merged.
Sorry, something went wrong.
markshannon
left a comment
There was a problem hiding this comment.
Looks good. A few minor tweaks needed.
Sorry, something went wrong.
|
Currently, just a little bit faster but that's good as this PR takes the hit of the adaptive machinery for all other calls. |
Sorry, something went wrong.
|
Please ignore a message I posted earlier, I just realized it was a different bytecode altogether 🤦♂️ . |
Sorry, something went wrong.
markshannon
left a comment
There was a problem hiding this comment.
A few minor things I missed earlier
Sorry, something went wrong.
|
🤖 New build scheduled with the buildbot fleet by @markshannon for commit f191720 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again. |
Sorry, something went wrong.
|
Buildbot failures are the usual trio plus a timeout. |
Sorry, something went wrong.
⚠️⚠️⚠️ Buildbot failure ⚠️⚠️⚠️Hi! The buildbot s390x RHEL7 LTO 3.x has failed when building commit 3163e68. What do you need to do:
You can take a look at the buildbot page here: https://buildbot.python.org/all/#builders/402/builds/1002 Failed tests:
Summary of the results of the build (if available): == Click to see traceback logsTraceback (most recent call last):
File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto/build/Lib/multiprocessing/resource_tracker.py", line 209, in main
cache[rtype].remove(name)
^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: '/psm_52734b07'
Traceback (most recent call last):
File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto/build/Lib/multiprocessing/resource_tracker.py", line 209, in main
cache[rtype].remove(name)
^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: '/psm_b71e5584'
Traceback (most recent call last):
File "/home/dje/cpython-buildarea/3.x.edelsohn-rhel-z.lto/build/Lib/multiprocessing/resource_tracker.py", line 209, in main
cache[rtype].remove(name)
^^^^^^^^^^^^^^^^^^^^^^^^^
KeyError: '/psm_d663bdb0'
|
Sorry, something went wrong.
|
This commit has broken thes 390x RHEL7 LTO 3.x buildbot as you can see before. As per the buildbot maintenance procedures, we will need to revert this PR unless is fixed in 24 hours. |
Sorry, something went wrong.
|
@pablogsal indeed it seems this PR is guilty. I'll send a PR to revert this tomorrow. Unfortunately I don't have the bandwidth to investigate this right now. Sorry. |
Sorry, something went wrong.
|
@pablogsal it seems that the buildbot is now green thanks to Dennis' patch #29048. For clarity, the C function optimizations assume all C functions called already do their own recursion checking. This should already be the case for all vectorcall-abiding functions, but it seems that Unfortunately the rule doesn't apply to |
Sorry, something went wrong.
Right, but this is not true in all functions as we saw. Doesn't also need to be done in the function call site? What is the contact exactly here? Because if this means that all the C functions need to do the recursion check on the function body, then the change is backwards incompatible |
Sorry, something went wrong.
The contract is that for vectorcall functions, the callee must do their own recursion checking https://docs.python.org/3/c-api/call.html#recursion-control. For tp_call functions, CPython will check for them.
Right. Like I mentioned above, the only incompatible opcode is |
Sorry, something went wrong.
Fantastic! Thanks for the explanation 👍 |
Sorry, something went wrong.
Initial infrastructure code for specializing
CALL_FUNCTION. Also added specialization for callingMETH_OPyCFunctions because it's the easiest to implement. Along withMETH_FASTCALL.Measured up to 20% faster calls for
METH_Oon microbenchmarks.https://bugs.python.org/issue44525