bpo-30039: Don't run signal handlers while resuming a yield from stack by njsmith · Pull Request #1081 · python/cpython

the-knights-who-say-ni

1st1 approved these changes Apr 20, 2017

If we have a chain of generators/coroutines that are 'yield from'ing
each other, then resuming the stack works like:

- call send() on the outermost generator
- this enters _PyEval_EvalFrameDefault, which re-executes the
  YIELD_FROM opcode
- which calls send() on the next generator
- which enters _PyEval_EvalFrameDefault, which re-executes the
  YIELD_FROM opcode
- ...etc.

However, every time we enter _PyEval_EvalFrameDefault, the first thing
we do is to check for pending signals, and if there are any then we
run the signal handler. And if it raises an exception, then we
immediately propagate that exception *instead* of starting to execute
bytecode. This means that e.g. a SIGINT at the wrong moment can "break
the chain" – it can be raised in the middle of our yield from chain,
with the bottom part of the stack abandoned for the garbage collector.

The fix is pretty simple: there's already a special case in
_PyEval_EvalFrameEx where it skips running signal handlers if the next
opcode is SETUP_FINALLY. (I don't see how this accomplishes anything
useful, but that's another story.) If we extend this check to also
skip running signal handlers when the next opcode is YIELD_FROM, then
that closes the hole – now the exception can only be raised at the
innermost stack frame.

This shouldn't have any performance implications, because the opcode
check happens inside the "slow path" after we've already determined
that there's a pending signal or something similar for us to process;
the vast majority of the time this isn't true and the new check
doesn't run at all.

The included test fails before this patch, but passes afterwards.

1st1 and others added 2 commits

May 1, 2017 14:08

1st1 approved these changes May 17, 2017

1st1 pushed a commit that referenced this pull request

May 17, 2017

…m stack (GH-1081)

If we have a chain of generators/coroutines that are 'yield from'ing
each other, then resuming the stack works like:

- call send() on the outermost generator
- this enters _PyEval_EvalFrameDefault, which re-executes the
  YIELD_FROM opcode
- which calls send() on the next generator
- which enters _PyEval_EvalFrameDefault, which re-executes the
  YIELD_FROM opcode
- ...etc.

However, every time we enter _PyEval_EvalFrameDefault, the first thing
we do is to check for pending signals, and if there are any then we
run the signal handler. And if it raises an exception, then we
immediately propagate that exception *instead* of starting to execute
bytecode. This means that e.g. a SIGINT at the wrong moment can "break
the chain" – it can be raised in the middle of our yield from chain,
with the bottom part of the stack abandoned for the garbage collector.

The fix is pretty simple: there's already a special case in
_PyEval_EvalFrameEx where it skips running signal handlers if the next
opcode is SETUP_FINALLY. (I don't see how this accomplishes anything
useful, but that's another story.) If we extend this check to also
skip running signal handlers when the next opcode is YIELD_FROM, then
that closes the hole – now the exception can only be raised at the
innermost stack frame.

This shouldn't have any performance implications, because the opcode
check happens inside the "slow path" after we've already determined
that there's a pending signal or something similar for us to process;
the vast majority of the time this isn't true and the new check
doesn't run at all..
(cherry picked from commit ab4413a)

1st1 mentioned this pull request

May 17, 2017

1st1 pushed a commit that referenced this pull request

Jun 9, 2017

…m stack (GH-1081)

If we have a chain of generators/coroutines that are 'yield from'ing
each other, then resuming the stack works like:

- call send() on the outermost generator
- this enters _PyEval_EvalFrameDefault, which re-executes the
  YIELD_FROM opcode
- which calls send() on the next generator
- which enters _PyEval_EvalFrameDefault, which re-executes the
  YIELD_FROM opcode
- ...etc.

However, every time we enter _PyEval_EvalFrameDefault, the first thing
we do is to check for pending signals, and if there are any then we
run the signal handler. And if it raises an exception, then we
immediately propagate that exception *instead* of starting to execute
bytecode. This means that e.g. a SIGINT at the wrong moment can "break
the chain" – it can be raised in the middle of our yield from chain,
with the bottom part of the stack abandoned for the garbage collector.

The fix is pretty simple: there's already a special case in
_PyEval_EvalFrameEx where it skips running signal handlers if the next
opcode is SETUP_FINALLY. (I don't see how this accomplishes anything
useful, but that's another story.) If we extend this check to also
skip running signal handlers when the next opcode is YIELD_FROM, then
that closes the hole – now the exception can only be raised at the
innermost stack frame.

This shouldn't have any performance implications, because the opcode
check happens inside the "slow path" after we've already determined
that there's a pending signal or something similar for us to process;
the vast majority of the time this isn't true and the new check
doesn't run at all..
(cherry picked from commit ab4413a)

1st1 pushed a commit that referenced this pull request

Jun 9, 2017

…m stack (GH-1081) (#1640)

If we have a chain of generators/coroutines that are 'yield from'ing
each other, then resuming the stack works like:

- call send() on the outermost generator
- this enters _PyEval_EvalFrameDefault, which re-executes the
  YIELD_FROM opcode
- which calls send() on the next generator
- which enters _PyEval_EvalFrameDefault, which re-executes the
  YIELD_FROM opcode
- ...etc.

However, every time we enter _PyEval_EvalFrameDefault, the first thing
we do is to check for pending signals, and if there are any then we
run the signal handler. And if it raises an exception, then we
immediately propagate that exception *instead* of starting to execute
bytecode. This means that e.g. a SIGINT at the wrong moment can "break
the chain" – it can be raised in the middle of our yield from chain,
with the bottom part of the stack abandoned for the garbage collector.

The fix is pretty simple: there's already a special case in
_PyEval_EvalFrameEx where it skips running signal handlers if the next
opcode is SETUP_FINALLY. (I don't see how this accomplishes anything
useful, but that's another story.) If we extend this check to also
skip running signal handlers when the next opcode is YIELD_FROM, then
that closes the hole – now the exception can only be raised at the
innermost stack frame.

This shouldn't have any performance implications, because the opcode
check happens inside the "slow path" after we've already determined
that there's a pending signal or something similar for us to process;
the vast majority of the time this isn't true and the new check
doesn't run at all..
(cherry picked from commit ab4413a)

ma8ma added a commit to ma8ma/cpython that referenced this pull request

Jun 13, 2017

Resolve conflcts:
ab4413a bpo-30039: Don't run signal handlers while resuming a yield from stack (python#1081)