gh-105481: add flags to each instr in the opcode metadata table, to replace opcode.hasarg/hasname/hasconst by iritkatriel · Pull Request #105482 · python/cpython
Conversation
This is not yet working for pseudo instructions. Should we add them to bytecodes.c in some form?
Also, I needed to add a silly assert to YIELD_VALUE (which is irregular) to make it look to the code generator like it uses oparg. The bytecode doesn't use the oparg but oparg is set to the exception stack depth, so the opcode needs to be considered as HAS_ARG because there are assertions that oparg is 0 for instructions without args.
This is not yet working for pseudo instructions. Should we add them to bytecodes.c in some form?
I think so (unless all the flags are always false for all pseudo-ops, which I doubt is the case). In fact I think we should probably propose a solution for moving the canonical definition of the numeric opcode values before we move on this.
(Other than that, this approach seems fine.)
Also I don’t know how to identify the jump opcodes in the generator. JUMPBY is used for caches so it’s no help. Maybe we could have a SKIP_CACHE macro and the a jumpby is really a jump?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I don’t know how to identify the jump opcodes in the generator. JUMPBY is used for caches so it’s no help. Maybe we could have a SKIP_CACHE macro and the a jumpby is really a jump?
Yeah, disambiguating these through their macro names sounds fine.
This was referenced
LG. Next steps?
next is to rebase and enable the assertions for pseudo ops. If that worksI'll check the diff, maybe this PR is done then.
Maybe we should make the parser or generator complain if you use frame->f_code->co_names or frame->f_code->co_consts in a bytecode definition?
Maybe we should make the parser or generator complain if you use
frame->f_code->co_namesorframe->f_code->co_constsin a bytecode definition?
How would it know? Maybe the original idea of looking for that sequence of tokens isn't so bad. Just don't fold it into variable_used -- make it a separate helper function.
Maybe we should make the parser or generator complain if you use
frame->f_code->co_namesorframe->f_code->co_constsin a bytecode definition?How would it know? Maybe the original idea of looking for that sequence of tokens isn't so bad. Just don't fold it into
variable_used-- make it a separate helper function.
I didn't like that frame->f_code->co_names means something different then code = frame->f_code; code->co_names.
We could make it illegal to access frame->f_code from the code, and add #defines for everything you might need from there.
I grepped and I see about a dozen places where frame->f_code is used to access something other than co_consts or co_names. So you'd need to review those.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Undraft, add skip news label, go!
| [SEND] = { true, INSTR_FMT_IBC, HAS_ARG_FLAG }, | ||
| [SEND_GEN] = { true, INSTR_FMT_IBC, HAS_ARG_FLAG }, | ||
| [INSTRUMENTED_YIELD_VALUE] = { true, INSTR_FMT_IB, HAS_ARG_FLAG }, | ||
| [YIELD_VALUE] = { true, INSTR_FMT_IB, HAS_ARG_FLAG }, |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed that this PR changed the format of YIELD_VALUE/INSTRUMENTED_YIELD_VALUE from INSTR_FMT_IX to INSTR_FMT_IB. This is because we added the assertion that makes the generator identify it has HAS_ARG. I guess the new value is correct?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In opcode.py it is classified as having an arg (by being >= HAVE_ARGUMENT) so I think it's correct. The oparg is used elsewhere to indicate the stack depth.