
bpo-45565: Specialize LOAD_ATTR_CLASS by Fidget-Spinner · Pull Request #29146 · python/cpython

Do you have stats for the standard benchmark suite (or something of similar scale), and have you benchmarked this?

No, I'll get some soon (pyperformance is a pain on Windows :().

  • For the stats you give, the hits increase by ~10k and the misses increase by ~2k. The cost of a miss can be higher than the benefit of a hit so this might not be a win. Is this an artifact of the test scripts? Would you expect better numbers on "real" programs?

The only way a deopt can happen is for tp_version_tag to change, and that requires the class variable to be written to. So things like:
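A minimal sketch of such a pattern, with made-up class and attribute names. It only shows a class variable being read and rebound in the same loop; the deopt itself is not observable from Python code.

```python
class Counter:
    # Class variable that is read through the class object itself.
    total = 0


def bump(n):
    """Read and rebind Counter.total n times (hypothetical example)."""
    for _ in range(n):
        # The read is a class-attribute load; the assignment modifies the
        # type, which in CPython invalidates its cached version tag.
        Counter.total = Counter.total + 1
    return Counter.total
```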

Unfortunately, I have no clue how common something like this is in the real world. An alternative approach (with far fewer invalidations):

  1. Store owner.tp_mro tuple ID.
  2. Store the index of the real type we need to look into and where it belongs in owner.tp_mro.
  3. Store the dict hint for the real type's __dict__ and the dk version.

At runtime, look into owner.tp_mro[mro_index].__dict__ + hint to get our attribute.

The benefit: tp_version_tag gets invalidated every time a write occurs, but with this approach we don't care, since the attribute's actual index in the dict doesn't change.
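The idea can be modelled roughly in pure Python. This is only a toy sketch under several assumptions: `MroCache` is a hypothetical name, the MRO tuple object stands in for the "tp_mro tuple ID", and step 3 (dict hint and dk version) is left out because neither is reachable from Python. It also ignores descriptors, metaclass attributes, and the case where a class earlier in the MRO later gains the same name.

```python
class MroCache:
    """Toy cache for a single `owner.name` load site (illustrative only)."""

    def __init__(self, owner, name):
        self.name = name
        # Step 1: remember the MRO tuple itself; holding a reference keeps
        # the identity check below safe from object-id reuse.
        self.mro = owner.__mro__
        # Step 2: index of the first class in the MRO that defines the name.
        self.mro_index = next(
            i for i, klass in enumerate(owner.__mro__) if name in vars(klass)
        )
        # Step 3 (dict hint + dk version) has no Python-level equivalent.

    def load(self, owner):
        if owner.__mro__ is not self.mro:
            # Guard failed: fall back to the generic lookup.
            return getattr(owner, self.name)
        # Fast path: go straight to the defining class's __dict__.
        return vars(owner.__mro__[self.mro_index])[self.name]


class Base:
    x = 1


class Child(Base):
    pass


cache = MroCache(Child, "x")
first = cache.load(Child)
Base.x = 2  # a write to the class variable; the MRO tuple is unchanged
second = cache.load(Child)
```

In this model the write to `Base.x` leaves the guard intact, so the second load still takes the fast path and sees the new value.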

  • The cache use for LOAD_ATTR has increased from 2 to 3. Only 4 bytes of each of the first two cache entries are being used, so this could be reduced back to 2 entries.

Indeed, I failed to see that _PyAdaptiveEntry still had 4 bytes of unused space, so we can pack tp_version_tag into there if we go with the old approach.