GH-98831: Simple input-output stack effects by gvanrossum · Pull Request #99120 · python/cpython
| // stack effect: (__0 -- ) | ||
| inst(BINARY_OP_MULTIPLY_INT) { | ||
| instr(BINARY_OP_MULTIPLY_INT, (left, right -- prod)) { | ||
| // TODO: Don't pop from the stack before DEOPF_IF() calls. |
Could you generate peeks at the beginning of the instruction and pops at the end just before the pushes?
Yeah, that's why this is still a draft PR. :-)
@brandtbucher @markshannon: In bytecodes.c I now get red wiggles on every use of a variable defined through a stack effect (since the PyObject *value; (etc.) is not in the instruction body any more). It's particularly annoying when it occurs in a macro like Py_DECREF(value) -- the wiggle shows on Py_DECREF. Any suggestions?
@markshannon I have some questions about how ERROR_IF() should work. Your spec says "If an ERROR_IF occurs, all values will be removed from the stack." It's easy enough to add a STACK_SHRINK() call (see latest code in generated_cases.c.h), but I'm not sure about whose responsibility it should be to DECREF() those variables. In practice, in the dozen or so instructions I've converted so far, when ERROR_IF() is called the code has already called DECREF(). Example definition:
instr(UNARY_POSITIVE, (value -- res)) {
res = PyNumber_Positive(value);
Py_DECREF(value);
ERROR_IF(res == NULL, error);
}
This expands to the following:
TARGET(UNARY_POSITIVE) {
PyObject *value = PEEK(1);
PyObject *res;
res = PyNumber_Positive(value);
Py_DECREF(value);
if (res == NULL) { STACK_SHRINK(1); goto error; }
POKE(1, res);
DISPATCH();
}
Shall we make this part of the spec for ERROR_IF(), that you must call it after "consuming" all the inputs?
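To make the expansion above concrete, here is a minimal sketch of how a generator could rewrite a single ERROR_IF() line. The regular expression is the one quoted from the code generator in this PR; the helper name `expand_error_if` and the plain `goto` branch for instructions with no inputs are assumptions for illustration, not the PR's actual code.

```python
import re

# Same pattern as the one used by the code generator in this PR.
ERROR_IF_RE = re.compile(r"(\s*)ERROR_IF\(([^,]+), (\w+)\);\s*$")


def expand_error_if(line: str, ninputs: int) -> str:
    """Hypothetical helper: rewrite one ERROR_IF() line of an instruction body.

    Lines that are not ERROR_IF() calls are returned unchanged.
    """
    m = ERROR_IF_RE.match(line)
    if m is None:
        return line
    space, cond, label = m.groups()
    if ninputs:
        # All inputs are removed from the stack before jumping to the label.
        return f"{space}if ({cond}) {{ STACK_SHRINK({ninputs}); goto {label}; }}\n"
    # Assumption: with no inputs there is nothing to shrink.
    return f"{space}if ({cond}) goto {label};\n"
```

Note that this only adjusts the stack pointer. Whether the inputs have already been DECREF'ed at that point is exactly the open question above.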
Perhaps we could make inst a variadic macro that turns a list of stack items into a declaration list? Downside is that you need some marker to separate in/out parameters, and handling braces might get trickier.
On my phone now, but something like:
typedef PyObject *_dummy_stack_item;
#define inst_begin(NAME, ...) \
    case (NAME): { \
        _dummy_stack_item __VA_ARGS__;
#define inst_end }

inst_begin(BINARY_OP, lhs, rhs, _, res)
    // Implementation goes here...
inst_end
| if (TOP() == NULL) { | ||
| goto error; | ||
| } | ||
| ERROR_IF(TOP() == NULL, error); |
TOP() -> res
Some notes after converting a few basic instructions (and failing to convert a few outliers).
- Converting "raw" instruction definitions (with PUSH()/POP() or custom stack operations) to streamlined DSL with input and output stack effects is a slow manual process that requires careful review (e.g. the mistake that Irit found).
- So far, every few instructions I converted required changes to the code generator.
- The code generator needs to be refactored to make future changes easier.
- I haven't even started to think about how to implement array and conditional stack effects or cache streams.
- The families are currently not read by the code generator. We can address this once we need them.
- For some reason I no longer see red wiggly underlines in bytecodes.c. (EDIT: After closing and reopening the file they are back.)
- There are some instructions that don't seem to fit in the DSL:
- PUSH_NULL must wait until I've implemented types, since it pushes a NULL.
- LIST_APPEND and SET_ADD dig up a stack entry that occurs 'oparg' deep.
- Opcodes like BINARY_SUBSCR_ADAPTIVE are problematic since they have special exits (DISPATCH_SAME_OPARG and GO_TO_INSTRUCTION). We may have to rethink such exits.
At this point I think the way forward is to merge this and then iterate, leaving the hardest cases for last.
No need to justify not adding more features; smaller PRs are better. There is no need to convert all the instructions at once.
Let's add features to the code generator as we actually need them.
Have you benchmarked this?
Some of the instructions, particularly the BINARY_OP ones, have been quite sensitive to minor code reorderings.
Wiggly lines can be fixed by adding dummy static definitions to the top of bytecodes.c.
| // stack effect: (__0 -- ) | ||
| inst(BINARY_OP_INPLACE_ADD_UNICODE) { | ||
| // This is a weird one. It's a super-instruction for |
Maybe drop the "This is a weird one."
It is unusual, but it's there for a good reason, which is to maintain the historical behavior that s += ... in a loop is not quadratic.
| predictions = set() | ||
| for inst in instrs: | ||
| for target in re.findall(r"(?:PREDICT|GO_TO_INSTRUCTION)\((\w+)\)", inst.block.text): | ||
| def write_instr(instr: InstDef, predictions: set[str], indent: str, f: TextIO, dedent: int = 0): |
Note for future PRs.
We need to factor out the three parts:
- analysis
- translation
- output
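As a rough illustration of that split, the sketch below separates the three phases for the simple input-output case only. All names here (`StackEffect`, `analyze`, `translate`, `output`) are hypothetical and not taken from the PR. The PEEK/POKE layout follows the UNARY_POSITIVE expansion shown earlier, and the use of STACK_GROW/STACK_SHRINK for the net stack change is an assumption. ERROR_IF rewriting is deliberately left out.

```python
import re
from dataclasses import dataclass
from typing import TextIO


@dataclass
class StackEffect:
    inputs: list[str]
    outputs: list[str]


def analyze(header: str) -> tuple[str, StackEffect]:
    """Analysis: parse 'instr(NAME, (a, b -- c))' into a name and a stack effect."""
    m = re.match(r"\s*instr\((\w+),\s*\(([^)]*)--([^)]*)\)\)", header)
    if m is None:
        raise ValueError(f"not an instruction header: {header!r}")
    name, ins, outs = m.groups()

    def names(text: str) -> list[str]:
        return [part.strip() for part in text.split(",") if part.strip()]

    return name, StackEffect(names(ins), names(outs))


def translate(name: str, effect: StackEffect, body: list[str]) -> list[str]:
    """Translation: turn a stack effect plus body lines into C source lines."""
    n_in, n_out = len(effect.inputs), len(effect.outputs)
    lines = [f"TARGET({name}) {{"]
    # The last input is on top of the stack, so it gets PEEK(1).
    for i, var in enumerate(effect.inputs):
        lines.append(f"    PyObject *{var} = PEEK({n_in - i});")
    for var in effect.outputs:
        lines.append(f"    PyObject *{var};")
    lines.extend(f"    {line.strip()}" for line in body)
    # Assumption: adjust the stack pointer by the net effect before the pokes.
    if n_out > n_in:
        lines.append(f"    STACK_GROW({n_out - n_in});")
    elif n_out < n_in:
        lines.append(f"    STACK_SHRINK({n_in - n_out});")
    for i, var in enumerate(effect.outputs):
        lines.append(f"    POKE({n_out - i}, {var});")
    lines.append("    DISPATCH();")
    lines.append("}")
    return lines


def output(lines: list[str], f: TextIO) -> None:
    """Output: write the generated lines to a file-like object."""
    for line in lines:
        f.write(line + "\n")
```

Keeping the phases apart like this means a new kind of stack effect would mostly touch `analyze` and `translate`, while `output` stays trivial.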
| # Write the body | ||
| ninputs = len(instr.inputs or ()) | ||
| for line in blocklines: | ||
| if m := re.match(r"(\s*)ERROR_IF\(([^,]+), (\w+)\);\s*$", line): |
See comment in generated_cases.c.h about introducing code into the if (cond) goto... code.
| } | ||
| SET_TOP(res); | ||
| Py_DECREF(container); | ||
| if (res == NULL) { STACK_SHRINK(3); goto error; } |
Anything more than if (cond) goto ... introduces extra jumps around the conditional block and may slow things down.
E.g.
if (res == NULL) { STACK_SHRINK(3); goto error; }
will be lowered to something like:
if (res != NULL) goto next;
STACK_SHRINK(3);
goto error;
next:
The C compiler might move the STACK_SHRINK(3); goto error; out of line, but I think it's better to do this in the code generator. Something like:
if (res == NULL) goto pop3_error;
...
pop3_error:
    STACK_SHRINK(1);
pop2_error:
    STACK_SHRINK(1);
pop_error:
    STACK_SHRINK(1);
error:
    ...
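On the generator side, this suggestion could be sketched roughly as follows. The function names are hypothetical and the label naming simply mirrors the `pop3_error` / `pop2_error` / `pop_error` / `error` chain above. This is an illustration of the idea under those assumptions, not the implementation that was eventually merged.

```python
def error_label(npop: int) -> str:
    """Name of the shared error label that pops `npop` items before `error`."""
    if npop == 0:
        return "error"
    if npop == 1:
        return "pop_error"
    return f"pop{npop}_error"


def emit_error_if(cond: str, npop: int) -> str:
    """Emit a bare conditional goto, keeping the hot path free of extra code."""
    return f"if ({cond}) goto {error_label(npop)};"


def emit_error_labels(max_pop: int) -> list[str]:
    """Emit the fall-through chain of labels, each shrinking the stack by one."""
    lines = []
    for n in range(max_pop, 0, -1):
        lines.append(f"{error_label(n)}:")
        lines.append("    STACK_SHRINK(1);")
    lines.append("error:")
    return lines
```

Because each label falls through to the next, the chain only needs to be emitted once, sized for the largest number of inputs any instruction pops on error.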