Issue 43563: Use dedicated opcodes to speed up calls/attribute lookups with super() as receiver
Created on 2021-03-19 20:57 by v2m, last changed 2022-04-11 14:59 by admin.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 24936 | open | v2m, 2021-03-19 20:59 | |
| Messages (7) | |||
|---|---|---|---|
| msg389114 - (view) | Author: Vladimir Matveev (v2m) * | Date: 2021-03-19 20:57 | |
Calling methods and lookup up attributes when receiver is `super()` has extra cost comparing to regular attribute lookup. It mainly comes from the need to allocate and initialize the instance of the `super` which for zero argument case also include peeking into frame/code object for the `__class__` cell and first argument. In addition because `PySuper_Type` has custom implementation of tp_getattro - `_PyObject_GetMethod` would always return bound method.
```
import timeit
setup = """
class A:
def f(self): pass
class B(A):
def f(self): super().f()
def g(self): A.f(self)
b = B()
"""
print(timeit.timeit("b.f()", setup=setup, number=20000000))
print(timeit.timeit("b.g()", setup=setup, number=20000000))
7.329449548968114
3.892987059080042
```
One option to improve it could be to make compiler/interpreter aware of super calls so they can be treated specially. Attached patch introduces two new opcodes LOAD_METHOD_SUPER and LOAD_ATTR_SUPER that are intended to be counterparts for LOAD_METHOD and LOAD_ATTR for cases when receiver is super with either zero or two arguments.
Immediate argument for both LOAD_METHOD_SUPER and LOAD_ATTR_SUPER is a pair that consist of:
0: index of method/attribute in co_names
1: Py_True if super was originally called with 0 arguments and Py_False otherwise.
Both LOAD_METHOD_SUPER and LOAD_ATTR_SUPER expect 3 elements on the stack:
TOS3: global_super
TOS2: type
TOS1: self/cls
Result of LOAD_METHOD_SUPER is the same as LOAD_METHOD.
Result of LOAD_ATTR_SUPER is the same as LOAD_ATTR
In runtime both LOAD_METHOD_SUPER and LOAD_ATTR_SUPER will check if `global_super` is `PySuper_Type` to handle situations when `super` is patched. If `global_super` is `PySuper_Type` then it can use dedicated routine to perform the lookup for provided `__class__` and `cls/self` without allocating new `super` instance. If `global_super` is different from `PySuper_Type` then runtime will fallback to the original logic using `global_super` and original number of arguments that was captured in immediate.
Benchmark results with patch:
4.381768501014449
3.9492998640052974
|
|||
| msg389156 - (view) | Author: Mark Shannon (Mark.Shannon) * ![]() |
Date: 2021-03-20 12:21 | |
Why? Do you have any evidence that the overhead of super() is significant in real programs, or that the proposed change actually speeds up anything beyond your micro-benchmark? |
|||
| msg389168 - (view) | Author: Raymond Hettinger (rhettinger) * ![]() |
Date: 2021-03-20 16:06 | |
Currently, super() is decoupled from the core language. It is just a builtin that provides customized attribute lookup. This PR makes super() more tightly integrated with the core language, treating it as if it were a keyword and part of the grammar. Also note, users can currently create their own versions of super(), shadowing the builtin super(). |
|||
| msg389337 - (view) | Author: Vladimir Matveev (v2m) * | Date: 2021-03-22 19:25 | |
>Currently, super() is decoupled from the core language. It is just a builtin that provides customized attribute lookup. This PR makes super() more tightly integrated with the core language, treating it as if it were a keyword and part of the grammar. Also note, users can currently create their own versions of super(), shadowing the builtin super(). This is true however: - this patch does not block people from introducing custom version of `super` so this scenario still work. The idea was to streamline the common case - based on digging into Instagram codebase and its transitive dependencies (which is reasonably large amount of code) all spots where `super()` appear in sources assume `super` to be builtin and for a pretty common use-case its cost is noticeable in profiler. - zero-argument `super()` still a bit magical since it requires compiler support to create cell for `__class__` and assumes certain shape of the frame object so this patch is a step forward with a better compiler support and removing runtime dependency on the frame > Do you have any evidence that the overhead of super() is significant in real programs I do see the non-negligible cost of allocation/initialization of `super` object in IG profiling data. |
|||
| msg389340 - (view) | Author: Guido van Rossum (gvanrossum) * ![]() |
Date: 2021-03-22 20:49 | |
This looks like a sensible idea to me. The safeguards to ensure that customized 'super' still works seem reasonable to me. I have to admit that I sometimes refrain from using super() where I should because of the expense, so this would be welcome. I do wonder -- is two-arg super() important enough to support it at all? Maybe the code would be somewhat simpler if the special opcodes were only generated for zero-arg super() calls. |
|||
| msg389344 - (view) | Author: Mark Shannon (Mark.Shannon) * ![]() |
Date: 2021-03-22 21:27 | |
Numbers please. What is "non-negligible cost of allocation/initialization" mean as a fraction of runtime? What sort of speed up are you seeing on whole programs? |
|||
| msg392116 - (view) | Author: Vladimir Matveev (v2m) * | Date: 2021-04-27 18:23 | |
Apologies for the delay in reply: in more concrete numbers for IG codebase enabling this optimization resulted in 0.2% CPU win. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:43 | admin | set | github: 87729 |
| 2021-04-27 18:23:03 | v2m | set | messages: + msg392116 |
| 2021-03-22 21:27:59 | Mark.Shannon | set | messages: + msg389344 |
| 2021-03-22 20:49:58 | gvanrossum | set | messages: + msg389340 |
| 2021-03-22 19:26:00 | v2m | set | messages: + msg389337 |
| 2021-03-22 16:59:15 | gvanrossum | set | nosy:
+ gvanrossum |
| 2021-03-20 16:06:01 | rhettinger | set | nosy:
+ rhettinger messages: + msg389168 |
| 2021-03-20 12:21:52 | Mark.Shannon | set | nosy:
+ Mark.Shannon messages: + msg389156 |
| 2021-03-19 21:10:20 | BTaskaya | set | nosy:
+ BTaskaya |
| 2021-03-19 20:59:46 | v2m | set | keywords:
+ patch stage: patch review pull_requests: + pull_request23696 |
| 2021-03-19 20:57:57 | v2m | create | |
