gh-87729: improve hit rate of LOAD_SUPER_ATTR specialization by carljm · Pull Request #104270 · python/cpython
We observed that in stats collection on pyperformance (+ pyston macro-benchmarks), the hit rate of the LOAD_SUPER_ATTR specialization to LOAD_SUPER_ATTR_METHOD was not as good as expected. Investigation revealed a few causes for this.
1. It is not uncommon to have super method calls that use argument unpacking (e.g. super().method(*args, **kwargs)) and so need to use e.g. CALL_FUNCTION_EX. This fails the checks for the load-method optimization in the compiler, so it emits a LOAD_SUPER_ATTR without the load-method optimization.
2. Loading things that are not actually method descriptors via a call that looks like super().method() is also not uncommon. E.g. super().__new__() where the base class is e.g. builtin str.
3. The type of self at a particular call-site of super() is commonly polymorphic, whenever the inheritance chain is longer than length 2. E.g. if C inherits B inherits A, then a call to super() in a method of B may commonly see self of type either B or C.
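For illustration, the first two cases can be reproduced in plain Python. This is only a sketch: the class names (`Base`, `Child`, `Name`) are hypothetical and not taken from pyperformance, and it shows the source-level shape of the calls, not the bytecode the compiler emits.

```python
import types


class Base:
    def method(self, *args, **kwargs):
        return (args, kwargs)


class Child(Base):
    def method(self, *args, **kwargs):
        # Case (1): the super() call uses argument unpacking, so it is
        # not a plain positional method call.
        return super().method(*args, **kwargs)


class Name(str):
    def __new__(cls, value):
        # Case (2): this looks like a method call, but str.__new__ is a
        # builtin function, not a method descriptor.
        return super().__new__(cls, value.title())


# str.upper is a method descriptor; str.__new__ is not.
assert isinstance(str.upper, types.MethodDescriptorType)
assert not isinstance(str.__new__, types.MethodDescriptorType)
assert isinstance(str.__new__, types.BuiltinFunctionType)
```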
The fixes to (1) and (2) are simple enough: we should also specialize for LOAD_SUPER_ATTR_ATTR.
(3) is a much trickier problem, since it means we can't inline-cache the results of a super() lookup at all. The results of a super() lookup are dependent on the type of self, and we can't effectively specialize for the type of self (barring polymorphic inline caching, which we don't generally do and which would make for a much larger inline cache).
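A small sketch of why the lookup result depends on the type of self: with multiple inheritance, the same super() call site can resolve to different functions for different instances. The class names here (`Root`, `Left`, `Right`, `Leaf`) are hypothetical and chosen only for this example.

```python
class Root:
    def who(self):
        return "Root"


class Left(Root):
    def who(self):
        # A single call site; the attribute it finds is determined by
        # the MRO of type(self), not by Left alone.
        return "Left>" + super().who()


class Right(Root):
    def who(self):
        return "Right"


class Leaf(Left, Right):
    pass


# For a Left instance, the class after Left in the MRO is Root.
# For a Leaf instance, the class after Left in the MRO is Right.
results = (Left().who(), Leaf().who())
```

Here `results` is `("Left>Root", "Left>Right")`, so a cache keyed only on the call site would return the wrong target for one of the two instances.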
This means that specialization of LOAD_SUPER_ATTR is just splitting out the "shadowed global super" case from the load-method and attr cases. This still preserves most of the performance improvement from LOAD_SUPER_ATTR, which always came from avoiding allocation of single-use super objects.
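As a rough illustration of the "shadowed global super" case, the sketch below compiles a class in a namespace where the name super is bound to something other than the builtin. `FakeSuper`, `SOURCE`, and `namespace` are hypothetical names used only for this example; it shows the Python-level behavior that must be preserved, not how the interpreter detects the shadowing.

```python
SOURCE = '''
class Base:
    def greet(self):
        return "base"

class Child(Base):
    def greet(self):
        return super().greet()
'''


class FakeSuper:
    """Stand-in bound to the global name 'super' (hypothetical)."""

    def greet(self):
        return "fake"


# Normal case: 'super' resolves to the builtin.
normal = {}
exec(SOURCE, normal)

# Shadowed case: the module-level name 'super' is rebound, so the call
# in Child.greet must invoke FakeSuper instead of the builtin.
shadowed = {"super": FakeSuper}
exec(SOURCE, shadowed)

normal_result = normal["Child"]().greet()
shadowed_result = shadowed["Child"]().greet()
```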
With this change, in my tests the hit rate of LOAD_SUPER_ATTR specialization in pyperformance is 100%. (Effectively this just means we don't have a case in pyperformance of shadowing the name super.)