GH-96421: Don't keep local copies of `first_instr`, `names` and `consts` in the interpreter. by markshannon · Pull Request #96847 · python/cpython
I'm struggling to get good performance numbers on this. An earlier run gave me a 4% speedup, which is clearly wrong.
However, we can look at the stats and deduce that this should speed things up.
`first_instr` is never hot and can be computed cheaply as `frame->f_code + OFFSET`. Caching it is pointless.
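A minimal sketch of the "compute it on demand" idea, assuming the bytecode is stored inline in the code object at a fixed offset. `MockCode`, `MockFrame`, `CodeUnit` and `CODE_OFFSET` are hypothetical stand-ins for illustration, not CPython's real definitions:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for CPython's code and frame objects. */
typedef uint16_t CodeUnit;

typedef struct {
    int header;        /* placeholder for the fields that precede the bytecode */
    CodeUnit code[8];  /* bytecode stored inline in the code object (assumption) */
} MockCode;

typedef struct {
    MockCode *f_code;
} MockFrame;

/* Fixed offset of the bytecode within the code object ("OFFSET" above). */
#define CODE_OFFSET offsetof(MockCode, code)

/* Recompute the first instruction when needed:
 * one load of f_code plus a constant add, no cached local. */
static inline CodeUnit *first_instr(const MockFrame *frame) {
    return (CodeUnit *)((char *)frame->f_code + CODE_OFFSET);
}
```

Because the offset is a compile-time constant, the only memory access is the load of `f_code`, which is why keeping a separate cached copy buys little on a path that is not hot.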
The reasoning for names and consts is a bit more complex:
- Approx 7% of instructions change the frame; for these, this PR saves four loads (and, if `names` and `consts` are spilled, two stores).
- About 2% of instructions use `names`; for these, this PR costs two loads, or one load if `names` is spilled.
- About 8% of instructions use `consts`; for these, this PR costs two loads, or one load if `consts` is spilled.
The saving is small if names and consts are not spilled, but a compiler would be stupid not to spill names and consts on x86-64.
So on RISC we would expect a very small speedup, but on x86-64 we would expect a larger one. I would estimate in the 0.3-0.6% range.
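The trade-off above can be worked through as back-of-the-envelope arithmetic. This is only a sketch: the function name is hypothetical, a store is counted the same as a load, and memory operations per instruction do not translate directly into wall-clock time.

```c
/* Fractions of executed instructions, taken from the estimates above. */
#define FRAME_CHANGE 0.07
#define USES_NAMES   0.02
#define USES_CONSTS  0.08

/* Net memory operations saved per executed instruction.
 * spilled != 0 models a compiler that keeps names/consts on the C stack. */
static double net_memory_ops_saved(int spilled) {
    /* Saved on each frame change: four loads, plus two stores if spilled. */
    double saved_per_frame_change = spilled ? 4.0 + 2.0 : 4.0;
    /* Extra loads on each use of names or consts. */
    double cost_per_use = spilled ? 1.0 : 2.0;
    return FRAME_CHANGE * saved_per_frame_change
         - (USES_NAMES + USES_CONSTS) * cost_per_use;
}
```

With these figures, the unspilled case nets about 0.08 memory operations saved per instruction (0.28 saved, 0.20 added), while the spilled case nets about 0.32 (0.42 saved, 0.10 added), which matches the expectation of a small gain on RISC and a larger one on x86-64.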
Even if this results in a small slowdown, it should speed up #96319 and increase the potential benefit of adding `LOAD_CONST_IMMORTAL`.
Skipping news as this change is undetectable, even for an intrusive C extension.
Chesterton's Fence, a.k.a. why were these values stored in local variables before?
These variables used to speed up CPython, even if they no longer do:
- `first_instr`: Many jumps were absolute, so `first_instr` was needed to make these jumps.
- `names`: `LOAD_ATTR` and `LOAD_GLOBAL` need `names`. Thanks to PEP 659, these instructions (and their adaptive forms) are much less common.
- `consts`: The existence of `POP_JUMP_IF_NONE` reduced the number of `LOAD_CONST`s a bit, and `RETURN_GENERATOR` has increased the number of frame entry and exits. This may have tipped the balance, or maybe caching of `consts` was misguided before.
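To illustrate the `first_instr` point, here is a rough sketch of the difference between an absolute and a relative jump. The helper names and the `CodeUnit` type are hypothetical and this is not the interpreter's actual code:

```c
#include <stdint.h>

typedef uint16_t CodeUnit;  /* hypothetical stand-in for one bytecode unit */

/* Absolute jump: the target is an index from the start of the bytecode,
 * so the interpreter needs first_instr to resolve it. */
static inline const CodeUnit *jump_absolute(const CodeUnit *first_instr, int target) {
    return first_instr + target;
}

/* Relative jump: the target is an offset from the current position,
 * so only next_instr is needed. */
static inline const CodeUnit *jump_relative(const CodeUnit *next_instr, int offset) {
    return next_instr + offset;
}
```

When most jumps take the relative form, the start-of-code pointer is rarely consulted, so there is little reason to keep it in a local variable.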