bpo-17611: Move unwinding of stack from interpreter to compiler#4682
bpo-17611: Move unwinding of stack from interpreter to compiler#4682nascheme wants to merge 34 commits into
Conversation
|
+1 Overall, this looks like a net win. |
Sorry, something went wrong.
ncoghlan
left a comment
There was a problem hiding this comment.
Reviewing the eval loop changes, my main comment is that we can't safely make assertions about the expected eval stack state when a particular bytecode is executed, since synthetically constructed bytecode may fail to meet those expectations in weird and exciting ways.
Instead, we should use explicit expectation checks, and raise SystemError when they're not met.
(Note: I haven't reviewed the compiler changes yet)
Sorry, something went wrong.
|
When you're done making the requested changes, leave the comment: |
Sorry, something went wrong.
Behavior is still the same after the bytecode and compiler changes.
The bytecode changes cause the number of expected returns to go from 6 to 5. The last return in the function now gets eliminated, as it is unreachable. I believe the SETUP_LOOP opcode was defeating the peephole optimizer previously. That's just a guess though, I don't understand how the peephole optimizer works.
|
I did a fast benchmark using pyperformance. Results are here. Roughly 2% faster so I would conclude that performance did not get worse. |
Sorry, something went wrong.
We have other assert() checks already, so a few more are fine for now
I wouldn't say "a lot", since bytecode parsing isn't the most complex part of a Python JIT, but I definitely think this is a welcome change (fixing Numba's handling of bytecode stack effect took a bit too much time for my taste :-)). |
Sorry, something went wrong.
|
See also other continuation of Antoine's PR: https://github.com/serhiy-storchaka/cpython/commits/unwind_stack . I'm working on slightly different approach to this issue (not published yet). |
Sorry, something went wrong.
|
This following function (simplified from gevent threadpool.py) causes the compiler to crash:: I'm investigating. Seems to infinite recursion in compiler_unwind_fblock(). |
Sorry, something went wrong.
|
I have an even simpler example:: The fb_unwind function for the HANDLER_CLEANUP fblocktype seems to be buggy. |
Sorry, something went wrong.
The argument is not actually used for anything but it makes more sense to pass 'final' rather than passing 'body'.
If the finally body contains an exit (break or return), don't unwind the finally body a second time. That would cause an infinite loop in the compiler. We temporarily pop the top block, emit code for the finally clause and then push the block back on. At some (hopefully correct) comments explaining what's going on.
|
The fblock unwind logic in finally bodies is still buggy. It unwinds all blocks, not just the blocks enclosing the finally body. I think the fix is not too difficult: allocate fblockinfo on the C stack, have an enclosing block pointer in the fblockinfo struct. When unwind happens, walk the enclosing block pointers. fblock_unwind_finally_try() will have to set the current fblockinfo in the 'c' struct so that unwinds coming from VISIT_SEQ() will unwind from the correct place. As I mentioned in the issue tracker, I think the finally body code duplication is the correct approach. I spent a good part of Sunday studying the patch. The ceval/bytecode changes look pretty solid. The compiler needs some fixing yet. |
Sorry, something went wrong.
When there is a 'return' inside the finally body of a try/finally, we need to clear the current exception. If the final body was entered due to an exception, we also need to pop the EXCEPT_HANDLER fbock.
|
All my tests are passing now. Still could use some polish before merging. |
Sorry, something went wrong.
This is slightly more efficient. For computing the stack effect of opcodes, consider PUSH_NO_EXCEPT to use six stack slots so that the stack will be large enough to handle an exception. Update dis.rst (with content from PR 5006 by Serhiy).
Fixes problem pointed out by Serhiy:
There is a gap between calling the __enter__ method and the
SETUP_FINALLY instruction. If the exception is raised in the gap,
the __exit__ method will be never called. For example:
a = []
with CM() as a[0]: # IndexError
...
|
This PR still has issues with frame.set_lineno() and they don't appear to be trivial to fix. I'm closing the PR. Mark Shannon, who was the original author of this patch, has expressed interested in trying to resolve the issues. So, I think it best to leave it to him. |
Sorry, something went wrong.
This is a re-base and slightly more polished version of PR #2827. I think this change is a good idea as it reduces the size of the stacked used by Python frames.
Also, it makes the bytecode have an "ahead of time" computable effect on the stack. See the changes in PyCompile_OpcodeStackEffect(). E.g. before WITH_CLEANUP_FINISH had a "XXX" comment saying it dropped at least one stack item but at runtime maybe more. After this change, WITH_CLEANUP_FINISH always drops 6 items.
I understand that is important for tools that try to JIT compile Python bytecode. Having an ahead of time computable stack effect makes the JIT a lot simpler.
https://bugs.python.org/issue17611