◐ Shell
reader mode source ↗
Skip to content

bpo-17611: Move unwinding of stack from interpreter to compiler#4682

Closed
nascheme wants to merge 34 commits into
python:masterfrom
nascheme:unwind_stack
Closed

bpo-17611: Move unwinding of stack from interpreter to compiler#4682
nascheme wants to merge 34 commits into
python:masterfrom
nascheme:unwind_stack

Conversation

@nascheme

@nascheme nascheme commented Dec 2, 2017

Copy link
Copy Markdown
Member

This is a re-base and slightly more polished version of PR #2827. I think this change is a good idea as it reduces the size of the stacked used by Python frames.

Also, it makes the bytecode have an "ahead of time" computable effect on the stack. See the changes in PyCompile_OpcodeStackEffect(). E.g. before WITH_CLEANUP_FINISH had a "XXX" comment saying it dropped at least one stack item but at runtime maybe more. After this change, WITH_CLEANUP_FINISH always drops 6 items.

I understand that is important for tools that try to JIT compile Python bytecode. Having an ahead of time computable stack effect makes the JIT a lot simpler.

https://bugs.python.org/issue17611

@rhettinger

Copy link
Copy Markdown
Contributor

+1 Overall, this looks like a net win.

@ncoghlan ncoghlan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Reviewing the eval loop changes, my main comment is that we can't safely make assertions about the expected eval stack state when a particular bytecode is executed, since synthetically constructed bytecode may fail to meet those expectations in weird and exciting ways.

Instead, we should use explicit expectation checks, and raise SystemError when they're not met.

(Note: I haven't reviewed the compiler changes yet)

@bedevere-bot

Copy link
Copy Markdown

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

12 hidden items Load more…
Behavior is still the same after the bytecode and compiler
changes.
The bytecode changes cause the number of expected returns
to go from 6 to 5.  The last return in the function now gets
eliminated, as it is unreachable.  I believe the SETUP_LOOP
opcode was defeating the peephole optimizer previously.  That's just
a guess though, I don't understand how the peephole optimizer works.
@nascheme

nascheme commented Dec 3, 2017

Copy link
Copy Markdown
Member Author

I did a fast benchmark using pyperformance. Results are here. Roughly 2% faster so I would conclude that performance did not get worse.

@ncoghlan ncoghlan dismissed their stale review December 3, 2017 03:28

We have other assert() checks already, so a few more are fine for now

@pitrou

pitrou commented Dec 3, 2017

Copy link
Copy Markdown
Member

Having an ahead of time computable stack effect makes the JIT a lot simpler.

I wouldn't say "a lot", since bytecode parsing isn't the most complex part of a Python JIT, but I definitely think this is a welcome change (fixing Numba's handling of bytecode stack effect took a bit too much time for my taste :-)).

@serhiy-storchaka

Copy link
Copy Markdown
Member

See also other continuation of Antoine's PR: https://github.com/serhiy-storchaka/cpython/commits/unwind_stack .

I'm working on slightly different approach to this issue (not published yet).

@nascheme

nascheme commented Dec 3, 2017

Copy link
Copy Markdown
Member Author

This following function (simplified from gevent threadpool.py) causes the compiler to crash::

def func():
    try:
	try:
	    func()
	except:
	    return
	finally:
	    pass
    finally:
	return

I'm investigating. Seems to infinite recursion in compiler_unwind_fblock().

@nascheme

nascheme commented Dec 3, 2017

Copy link
Copy Markdown
Member Author

I have an even simpler example::

def func():
    try:
	try:
	    True
	except:
	    return
    finally:
	return

The fb_unwind function for the HANDLER_CLEANUP fblocktype seems to be buggy.
Still trying to figure out how this all works.

The argument is not actually used for anything but it makes
more sense to pass 'final' rather than passing 'body'.
If the finally body contains an exit (break or return), don't unwind
the finally body a second time.  That would cause an infinite loop
in the compiler.  We temporarily pop the top block, emit code for
the finally clause and then push the block back on.

At some (hopefully correct) comments explaining what's going on.
@nascheme

nascheme commented Dec 4, 2017

Copy link
Copy Markdown
Member Author

The fblock unwind logic in finally bodies is still buggy. It unwinds all blocks, not just the blocks enclosing the finally body. I think the fix is not too difficult: allocate fblockinfo on the C stack, have an enclosing block pointer in the fblockinfo struct. When unwind happens, walk the enclosing block pointers. fblock_unwind_finally_try() will have to set the current fblockinfo in the 'c' struct so that unwinds coming from VISIT_SEQ() will unwind from the correct place.

As I mentioned in the issue tracker, I think the finally body code duplication is the correct approach. I spent a good part of Sunday studying the patch. The ceval/bytecode changes look pretty solid. The compiler needs some fixing yet.

When there is a 'return' inside the finally body of a try/finally,
we need to clear the current exception.  If the final body was
entered due to an exception, we also need to pop the EXCEPT_HANDLER
fbock.
@nascheme

Copy link
Copy Markdown
Member Author

All my tests are passing now. Still could use some polish before merging.

This is slightly more efficient.  For computing the stack effect
of opcodes, consider PUSH_NO_EXCEPT to use six stack slots so that
the stack will be large enough to handle an exception.

Update dis.rst (with content from PR 5006 by Serhiy).
Fixes problem pointed out by Serhiy:

There is a gap between calling the __enter__ method and the
SETUP_FINALLY instruction. If the exception is raised in the gap,
the __exit__ method will be never called. For example:

    a = []
    with CM() as a[0]: # IndexError
        ...
@nascheme

nascheme commented Jan 2, 2018

Copy link
Copy Markdown
Member Author

This PR still has issues with frame.set_lineno() and they don't appear to be trivial to fix. I'm closing the PR. Mark Shannon, who was the original author of this patch, has expressed interested in trying to resolve the issues. So, I think it best to leave it to him.

@nascheme nascheme closed this Jan 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants