◐ Shell
reader mode source ↗
Skip to content

gh-105481: Generate the opcode lists in dis from data extracted from bytecodes.c#106758

Merged
iritkatriel merged 30 commits into
python:mainfrom
iritkatriel:opcode_py
Jul 18, 2023
Merged

gh-105481: Generate the opcode lists in dis from data extracted from bytecodes.c#106758
iritkatriel merged 30 commits into
python:mainfrom
iritkatriel:opcode_py

Conversation

@iritkatriel

@iritkatriel iritkatriel commented Jul 14, 2023

Copy link
Copy Markdown
Member

The generated lists are more correct than the (old) hard-coded ones, which were not maintained properly.

For example:

Old:

>>> [opcode.opname[op] for op in opcode.haslocal]
['LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', 'LOAD_FAST_CHECK', 'LOAD_FAST_AND_CLEAR', 'STORE_FAST_MAYBE_NULL', 'LOAD_CLOSURE']

New:

>>> [opcode.opname[op] for op in opcode.haslocal]
['LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', 'LOAD_FAST_CHECK', 'MAKE_CELL', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF', 'LOAD_FAST_AND_CLEAR', 'LOAD_FAST_LOAD_FAST', 'STORE_FAST_LOAD_FAST', 'STORE_FAST_STORE_FAST', 'LOAD_FROM_DICT_OR_DEREF', 'STORE_FAST_MAYBE_NULL', 'LOAD_CLOSURE']

@iritkatriel

Copy link
Copy Markdown
Member Author

The generated lists are more correct than the (old) hard-coded ones, which were not maintained properly.

For example:

Old:

>>> [opcode.opname[op] for op in opcode.haslocal]
['LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', 'LOAD_FAST_CHECK', 'LOAD_FAST_AND_CLEAR', 'STORE_FAST_MAYBE_NULL', 'LOAD_CLOSURE']

New:

>>> [opcode.opname[op] for op in opcode.haslocal]
['LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', 'LOAD_FAST_CHECK', 'MAKE_CELL', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF', 'LOAD_FAST_AND_CLEAR', 'LOAD_FAST_LOAD_FAST', 'STORE_FAST_LOAD_FAST', 'STORE_FAST_STORE_FAST', 'LOAD_FROM_DICT_OR_DEREF', 'STORE_FAST_MAYBE_NULL', 'LOAD_CLOSURE']

Maybe in this case actually the new list is incorrect - all the opcodes with DEREF seem to belong to hasfree rather than haslocal:

[opcode.opname[op] for op in opcode.hasfree]
['MAKE_CELL', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF', '<148>', 'LOAD_FROM_DICT_OR_DEREF']

@gvanrossum

Copy link
Copy Markdown
Member

The generated lists are more correct than the (old) hard-coded ones, which were not maintained properly.
For example:
Old:

>>> [opcode.opname[op] for op in opcode.haslocal]
['LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', 'LOAD_FAST_CHECK', 'LOAD_FAST_AND_CLEAR', 'STORE_FAST_MAYBE_NULL', 'LOAD_CLOSURE']

New:

>>> [opcode.opname[op] for op in opcode.haslocal]
['LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', 'LOAD_FAST_CHECK', 'MAKE_CELL', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF', 'LOAD_FAST_AND_CLEAR', 'LOAD_FAST_LOAD_FAST', 'STORE_FAST_LOAD_FAST', 'STORE_FAST_STORE_FAST', 'LOAD_FROM_DICT_OR_DEREF', 'STORE_FAST_MAYBE_NULL', 'LOAD_CLOSURE']

Maybe in this case actually the new list is incorrect - all the opcodes with DEREF seem to belong to hasfree rather than haslocal:

[opcode.opname[op] for op in opcode.hasfree]
['MAKE_CELL', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF', '<148>', 'LOAD_FROM_DICT_OR_DEREF']

Ugh. Worse, now LOAD_CLOSURE is gone, it’s ambiguous. Who uses this? How? (Only dis.py?)

@iritkatriel

Copy link
Copy Markdown
Member Author

Ugh. Worse, now LOAD_CLOSURE is gone, it’s ambiguous. Who uses this? How? (Only dis.py?)

LOAD_CLOSURE is just an alias (pseudo op) for LOAD_FAST. It's in the haslocal list in the old and new version, no change there.

@iritkatriel iritkatriel marked this pull request as draft July 14, 2023 21:34
@iritkatriel iritkatriel changed the title gh-105481: add haslocal to _opcode.py. Generate most oplists in opcode.py with the functions in _opcode.py Jul 14, 2023
@iritkatriel

Copy link
Copy Markdown
Member Author

Now it's like this:

Old:

>>> sorted([opcode.opname[op] for op in opcode.hasfree])
['<148>', 'DELETE_DEREF', 'LOAD_DEREF', 'LOAD_FROM_DICT_OR_DEREF', 'MAKE_CELL', 'STORE_DEREF']
>>> sorted([opcode.opname[op] for op in opcode.haslocal])
['DELETE_FAST', 'LOAD_CLOSURE', 'LOAD_FAST', 'LOAD_FAST_AND_CLEAR', 'LOAD_FAST_CHECK', 'STORE_FAST', 'STORE_FAST_MAYBE_NULL']

New:

>>> sorted([opcode.opname[op] for op in opcode.hasfree])
['DELETE_DEREF', 'LOAD_DEREF', 'LOAD_FROM_DICT_OR_DEREF', 'MAKE_CELL', 'STORE_DEREF']
>>> sorted([opcode.opname[op] for op in opcode.haslocal])
['DELETE_FAST', 'LOAD_CLOSURE', 'LOAD_FAST', 'LOAD_FAST_AND_CLEAR', 'LOAD_FAST_CHECK', 'LOAD_FAST_LOAD_FAST', 'STORE_FAST', 'STORE_FAST_LOAD_FAST', 'STORE_FAST_MAYBE_NULL', 'STORE_FAST_STORE_FAST']

The "<148>" is because in opcode.py in main we have hasfree.append(148) even though there is no such opcode.

In the haslocal list, we see that the old list is missing the super instructions.

@iritkatriel

Copy link
Copy Markdown
Member Author

For the other lists:

Old:

>>> sorted([opcode.opname[op] for op in opcode.hasconst])
['KW_NAMES', 'LOAD_CONST', 'RETURN_CONST']

New:

>>> sorted([opcode.opname[op] for op in opcode.hasconst])
['INSTRUMENTED_RETURN_CONST', 'KW_NAMES', 'LOAD_CONST', 'RETURN_CONST']

old is missing the INSTRUMENTED_*.

Old:

>>> sorted([opcode.opname[op] for op in opcode.hasname])
['DELETE_ATTR', 'DELETE_GLOBAL', 'DELETE_NAME', 'IMPORT_FROM', 'IMPORT_NAME', 'LOAD_ATTR', 'LOAD_FROM_DICT_OR_GLOBALS', 'LOAD_GLOBAL', 'LOAD_METHOD', 'LOAD_NAME', 'LOAD_SUPER_ATTR', 'LOAD_SUPER_METHOD', 'LOAD_ZERO_SUPER_ATTR', 'LOAD_ZERO_SUPER_METHOD', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_NAME']

New:

>>> sorted([opcode.opname[op] for op in opcode.hasname])
['DELETE_ATTR', 'DELETE_GLOBAL', 'DELETE_NAME', 'IMPORT_FROM', 'IMPORT_NAME', 'LOAD_ATTR', 'LOAD_FROM_DICT_OR_GLOBALS', 'LOAD_GLOBAL', 'LOAD_METHOD', 'LOAD_NAME', 'LOAD_SUPER_ATTR', 'LOAD_SUPER_METHOD', 'LOAD_ZERO_SUPER_ATTR', 'LOAD_ZERO_SUPER_METHOD', 'STORE_ATTR', 'STORE_GLOBAL', 'STORE_NAME']

Old is missing LOAD_ZERO_SUPER_ATTR and LOAD_ZERO_SUPER_METHOD.

Old:

>>> sorted([opcode.opname[op] for op in opcode.hasjrel])
['FOR_ITER', 'JUMP', 'JUMP_BACKWARD', 'JUMP_BACKWARD_NO_INTERRUPT', 'JUMP_FORWARD', 'JUMP_NO_INTERRUPT', 'POP_JUMP_IF_FALSE', 'POP_JUMP_IF_NONE', 'POP_JUMP_IF_NOT_NONE', 'POP_JUMP_IF_TRUE', 'SEND']

New:

>>> sorted([opcode.opname[op] for op in opcode.hasjrel])
['ENTER_EXECUTOR', 'FOR_ITER', 'JUMP', 'JUMP_BACKWARD', 'JUMP_BACKWARD_NO_INTERRUPT', 'JUMP_FORWARD', 'JUMP_NO_INTERRUPT', 'POP_JUMP_IF_FALSE', 'POP_JUMP_IF_NONE', 'POP_JUMP_IF_NOT_NONE', 'POP_JUMP_IF_TRUE', 'SEND']

New has ENTER_EXECUTOR as a jump.

For hasarg, the old list contained a few that the new list does not:

{'INSTRUMENTED_CALL_FUNCTION_EX', 'INSTRUMENTED_RETURN_VALUE', 'INSTRUMENTED_LINE', 'INSTRUMENTED_END_SEND', 'INSTRUMENTED_INSTRUCTION', 'INSTRUMENTED_END_FOR'}

@iritkatriel

Copy link
Copy Markdown
Member Author

It mostly looks right, except for INSTRUMENTED_CALL_FUNCTION_EX, which delegates the work to CALL_FUNCTION_EX which does have an arg.

@gvanrossum

Copy link
Copy Markdown
Member

Ugh. Worse, now LOAD_CLOSURE is gone, it’s ambiguous. Who uses this? How? (Only dis.py?)

LOAD_CLOSURE is just an alias (pseudo op) for LOAD_FAST. It's in the haslocal list in the old and new version, no change there.

My point is that LOAD_CLOSURE used to be separate because it was in the hasfree list. Now some LOAD_FAST instances refer to a local and others to a free variable. Example:

>>> def foo(a):
...   print(a)
...   def bar(): return a
... 
>>> dis.dis(foo)
              0 MAKE_CELL                0 (a)

  1           2 RESUME                   0

  2           4 LOAD_GLOBAL              1 (NULL + print)
             14 LOAD_DEREF               0 (a)
             16 CALL                     1
             24 POP_TOP

  3          26 LOAD_FAST                0 (a)
             28 BUILD_TUPLE              1
             30 LOAD_CONST               1 (<code object bar at 0x101d18400, file "<stdin>", line 3>)
             32 MAKE_FUNCTION
             34 SET_FUNCTION_ATTRIBUTE   8 (closure)
             36 STORE_FAST               1 (bar)
             38 RETURN_CONST             0 (None)

Disassembly of <code object bar at 0x101d18400, file "<stdin>", line 3>:
              0 COPY_FREE_VARS           1

  3           2 RESUME                   0
              4 LOAD_DEREF               0 (a)
              6 RETURN_VALUE
>>> 

The LOAD_FAST (a) at offset 26 used to be a LOAD_CLOSURE in 3.12 and before (and in 3.13 until LOAD_CLOSURE was made a pseudo-op). It's harmless, because we have co_varnames indexed by both locals and cells. So maybe hasfree is no longer useful?

I don't think

@gvanrossum

Copy link
Copy Markdown
Member

The "<148>" is because in opcode.py in main we have hasfree.append(148) even though there is no such opcode.

Looks like a relic. In 3.11, 148 was LOAD_CLASSDEREF. That opcode no longer exists in 3.12 (it was removed in gh-103764 for PEP 695). So let's get rid of it.

@iritkatriel

Copy link
Copy Markdown
Member Author

The LOAD_FAST (a) at offset 26 used to be a LOAD_CLOSURE in 3.12 and before (and in 3.13 until LOAD_CLOSURE was made a pseudo-op). It's harmless, because we have co_varnames indexed by both locals and cells. So maybe hasfree is no longer useful?

So we merge hasfree into haslocal and remove hasfree? Or leave it empty and soft deprecate it?

@iritkatriel iritkatriel changed the title gh-105481: add haslocal/hasfree to _opcode.py. Generate most oplists in opcode.py with the functions in _opcode.py Jul 15, 2023
@gvanrossum

Copy link
Copy Markdown
Member

So we merge hasfree into haslocal and remove hasfree? Or leave it empty and soft deprecate it?

Hm, I was just writing up a rationale for this when I realized that we should just keep hasfree and populate it correctly, and explain the special situation where LOAD_FAST can be used to load a cell (previously LOAD_CLOSURE) separately.

@iritkatriel

iritkatriel commented Jul 15, 2023

Copy link
Copy Markdown
Member Author

Ok, then this PR is probably fine w.r.t. hasfree and haslocal.

The only list in oplists now is hascompare, which contains only COMPARE_OP:

>>> [opcode.opname[op] for op in opcode.hascompare]
['COMPARE_OP']

It is used by dis to identify that its op (>=, ==, etc) needs to be displayed.

@gvanrossum

Copy link
Copy Markdown
Member

It is used by dis to identify that its op (>=, ==, etc) needs to be displayed.

Since there's only one opcode like this, maybe dis can just check whether it's that opcode? Or do you anticipate that in the future there will be multiple ones? (Were there ever?)

@iritkatriel

Copy link
Copy Markdown
Member Author

It is used by dis to identify that its op (>=, ==, etc) needs to be displayed.

Since there's only one opcode like this, maybe dis can just check whether it's that opcode? Or do you anticipate that in the future there will be multiple ones? (Were there ever?)

It's just this one all the way back to 3.8.

dis can stop using the list, but it's documented in the dis module so can we remove it?

@gvanrossum

Copy link
Copy Markdown
Member

dis can stop using the list, but it's documented in the dis module so can we remove it?

I guess deprecate it like hasjabs/hasjrel.

25 hidden items Load more…
@iritkatriel iritkatriel changed the title gh-105481: add has_local/has_free/etc to _opcode.py. Generate oplists in opcode.py with the functions in _opcode.py Jul 17, 2023
@iritkatriel

Copy link
Copy Markdown
Member Author

I think I need to add in whatsnew that the dis module's opcode lists are no longer also defined in the opcode module. The opcode module is not documented, but if you look on GitHub you can see some places where people import it.

@iritkatriel

Copy link
Copy Markdown
Member Author

I think I need to add in whatsnew that the dis module's opcode lists are no longer also defined in the opcode module. The opcode module is not documented, but if you look on GitHub you can see some places where people import it.

Or maybe this is a breaking change that I can't make without deprecation? I'm not sure what the status of the opcode module is.

I can put the lists back in opcode, and instead work to remove the dependency of the build on opcode.

@gvanrossum

Copy link
Copy Markdown
Member

I have a feeling that opcode is actually a better place than dis for the official public API. And that's probably why people have been importing it despite its undocumented status.

…ta instead of opcode for _specialized_instructions"

This reverts commit 5e1c4a4.
…rom _opcode_metadata rather than via opcode"

This reverts commit d525c53.
@iritkatriel

Copy link
Copy Markdown
Member Author

OK, I reverted the last few commits so we leave everything in opcode. We will rid the build of opcode in a future PR once the opcode.h file is generated from bytecodes.c.

@iritkatriel iritkatriel merged commit 40f3f11 into python:main Jul 18, 2023
@iritkatriel iritkatriel deleted the opcode_py branch July 25, 2023 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants