◐ Shell
clean mode source ↗

gh-105481: add flags to each instr in the opcode metadata table, to replace opcode.hasarg/hasname/hasconst by iritkatriel · Pull Request #105482 · python/cpython

Conversation

@iritkatriel

This is not yet working for pseudo instructions. Should we add them to bytecodes.c in some form?

Also, I needed to add a silly assert to YIELD_VALUE (which is irregular) to make it look to the code generator like it uses oparg. The bytecode doesn't use the oparg but oparg is set to the exception stack depth, so the opcode needs to be considered as HAS_ARG because there are assertions that oparg is 0 for instructions without args.

…, to replace opcode.hasarg/hasname/hasconst

@gvanrossum

This is not yet working for pseudo instructions. Should we add them to bytecodes.c in some form?

I think so (unless all the flags are always false for all pseudo-ops, which I doubt is the case). In fact I think we should probably propose a solution for moving the canonical definition of the numeric opcode values before we move on this.

(Other than that, this approach seems fine.)

@iritkatriel

Also I don’t know how to identify the jump opcodes in the generator. JUMPBY is used for caches so it’s no help. Maybe we could have a SKIP_CACHE macro and the a jumpby is really a jump?

gvanrossum

gvanrossum

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I don’t know how to identify the jump opcodes in the generator. JUMPBY is used for caches so it’s no help. Maybe we could have a SKIP_CACHE macro and the a jumpby is really a jump?

Yeah, disambiguating these through their macro names sounds fine.

This was referenced

Jun 11, 2023

@gvanrossum

@iritkatriel

LG. Next steps?

next is to rebase and enable the assertions for pseudo ops. If that worksI'll check the diff, maybe this PR is done then.

@iritkatriel

Maybe we should make the parser or generator complain if you use frame->f_code->co_names or frame->f_code->co_consts in a bytecode definition?

@gvanrossum

Maybe we should make the parser or generator complain if you use frame->f_code->co_names or frame->f_code->co_consts in a bytecode definition?

How would it know? Maybe the original idea of looking for that sequence of tokens isn't so bad. Just don't fold it into variable_used -- make it a separate helper function.

@iritkatriel

Maybe we should make the parser or generator complain if you use frame->f_code->co_names or frame->f_code->co_consts in a bytecode definition?

How would it know? Maybe the original idea of looking for that sequence of tokens isn't so bad. Just don't fold it into variable_used -- make it a separate helper function.

I didn't like that frame->f_code->co_names means something different then code = frame->f_code; code->co_names.
We could make it illegal to access frame->f_code from the code, and add #defines for everything you might need from there.

@gvanrossum

I grepped and I see about a dozen places where frame->f_code is used to access something other than co_consts or co_names. So you'd need to review those.

gvanrossum

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undraft, add skip news label, go!

iritkatriel

[SEND] = { true, INSTR_FMT_IBC, HAS_ARG_FLAG },
[SEND_GEN] = { true, INSTR_FMT_IBC, HAS_ARG_FLAG },
[INSTRUMENTED_YIELD_VALUE] = { true, INSTR_FMT_IB, HAS_ARG_FLAG },
[YIELD_VALUE] = { true, INSTR_FMT_IB, HAS_ARG_FLAG },

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that this PR changed the format of YIELD_VALUE/INSTRUMENTED_YIELD_VALUE from INSTR_FMT_IX to INSTR_FMT_IB. This is because we added the assertion that makes the generator identify it has HAS_ARG. I guess the new value is correct?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In opcode.py it is classified as having an arg (by being >= HAVE_ARGUMENT) so I think it's correct. The oparg is used elsewhere to indicate the stack depth.

Labels