gh-106529: Make FOR_ITER a viable uop#112134
Conversation
|
|
Sorry, something went wrong.
e909a83 to
b28effa
Compare
November 15, 2023 23:58
|
@brandtbucher Let me know how bad this interferes with the JIT branch. The stuff I put in |
Sorry, something went wrong.
|
Benchmark results: |
Sorry, something went wrong.
So, this doesn't actually look too bad. I think with a tiny bit of reworking, this could be merged into the JIT branch with no pain:
|
Sorry, something went wrong.
|
Okay, I think I managed to do that. I can now merge into the justin branch with only a single trivial merge conflict (we both added something to the end of ceval_macros.h). Next up of course, how would I redefine CURRENT_TARGET() in templace.c? |
Sorry, something went wrong.
|
|
Sorry, something went wrong.
Yup, and I also needed to move the It seems to work, but unsure how to prove it (it compiles and passes tests, but it would too if the JIT was never invoked). Anyway, we should probably wait until we've decided that making Tier 2 5% slower is a good idea. |
Sorry, something went wrong.
|
There seem to be a few unrelated changes in this PR. |
Sorry, something went wrong.
Sure, see gh-112214. I've merged that into main, since the tests pass, next I'll merge it into this PR and do the other thing you requested. |
Sorry, something went wrong.
|
Okay, here's the new version. @markshannon Want to review one more time? |
Sorry, something went wrong.
|
Benchmark says 4% slower (with Tier 2):
|
Sorry, something went wrong.
|
FWIW, I have a branch on top of this that makes the uop interpreter 3% faster. The approach is to let the code generator generate code that extracts |
Sorry, something went wrong.
|
Honestly, as long as it’s done in macros, it shouldn’t be too bad. Macros are a great escape hatch in the short term before we start generating bespoke “JIT cases”. |
Sorry, something went wrong.
So it could generate something like where the default definitions for those (in ceval_macros.h) would be |
Sorry, something went wrong.
|
Yup! |
Sorry, something went wrong.
- Double max trace size to 256 - Add a dependency on executor_cases.c.h for ceval.o - Mark `_SPECIALIZE_UNPACK_SEQUENCE` as `TIER_ONE_ONLY` - Add debug output back showing the optimized trace - Bunch of cleanups to Tools/cases_generator/
This uses the new mechanism whereby certain uops are replaced by others during translation, using the `_PyUop_Replacements` table. We further special-case the `_FOR_ITER_TIER_TWO` uop to update the deoptimization target to point just past the corresponding `END_FOR` opcode. Two tiny code cleanups are also part of this PR.
- Double max trace size to 256 - Add a dependency on executor_cases.c.h for ceval.o - Mark `_SPECIALIZE_UNPACK_SEQUENCE` as `TIER_ONE_ONLY` - Add debug output back showing the optimized trace - Bunch of cleanups to Tools/cases_generator/
This uses the new mechanism whereby certain uops are replaced by others during translation, using the `_PyUop_Replacements` table. We further special-case the `_FOR_ITER_TIER_TWO` uop to update the deoptimization target to point just past the corresponding `END_FOR` opcode. Two tiny code cleanups are also part of this PR.
Also clean up a few nits in the code generator, and add back an important Make dependency for ceval.o.