gh-109052: Use the base opcode when comparing code objects #109107

gaogaotiantian

After instrumentation, an opcode in the code object may become INSTRUMENTED_LINE or INSTRUMENTED_INSTRUCTION, so we should make sure to use the actual (base) opcode when we compare code objects.
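As a rough illustration of the problem (a toy model, not CPython's actual data structures), the sketch below compares two instruction arrays twice: once on the opcode currently stored in each slot, and once on the base opcode. All names prefixed with `toy_`/`TOY_`, the opcode values, and the `original_opcode` field are assumptions made up for this example; how CPython records the original opcode is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy opcodes; the numeric values are NOT CPython's. */
enum { TOY_LOAD = 1, TOY_ADD = 2, TOY_INSTRUMENTED_LINE = 254 };

typedef struct {
    unsigned char opcode;           /* opcode currently stored in the slot */
    unsigned char oparg;
    unsigned char original_opcode;  /* toy side field, set when instrumented */
} ToyInstr;

/* Return the opcode that was in this slot before instrumentation. */
static unsigned char toy_base_opcode(const ToyInstr *ins)
{
    if (ins->opcode == TOY_INSTRUMENTED_LINE) {
        return ins->original_opcode;
    }
    return ins->opcode;
}

/* Naive comparison: looks at whatever opcode is currently stored. */
static bool toy_equal_raw(const ToyInstr *a, const ToyInstr *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i].opcode != b[i].opcode || a[i].oparg != b[i].oparg) {
            return false;
        }
    }
    return true;
}

/* Comparison on base opcodes: instrumentation no longer affects equality. */
static bool toy_equal_base(const ToyInstr *a, const ToyInstr *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (toy_base_opcode(&a[i]) != toy_base_opcode(&b[i])
            || a[i].oparg != b[i].oparg) {
            return false;
        }
    }
    return true;
}
```

With one plain array and one where the first slot has been replaced by `TOY_INSTRUMENTED_LINE`, `toy_equal_raw` reports a mismatch while `toy_equal_base` treats the two as equal.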

This also belongs to gh-107265.

Issue: test_sys_settrace -R 3:3 does crash #109052

corona10

Looks better?

diff --git a/Objects/codeobject.c b/Objects/codeobject.c
index 70a0c2ebd6..4180d216a3 100644
--- a/Objects/codeobject.c
+++ b/Objects/codeobject.c
@@ -1788,20 +1788,24 @@ code_richcompare(PyObject *self, PyObject *other, int op)
         if (co_code == ENTER_EXECUTOR) {
             const int exec_index = co_arg;
             _PyExecutorObject *exec = co->co_executors->executors[exec_index];
-            co_code = exec->vm_data.opcode;
+            co_code = _PyOpcode_Deopt[exec->vm_data.opcode];
             co_arg = exec->vm_data.oparg;
         }
+        else {
+            co_code = _Py_GetBaseOpcode(co, i);
+        }
         assert(co_code != ENTER_EXECUTOR);
-        co_code = _PyOpcode_Deopt[co_code];

         if (cp_code == ENTER_EXECUTOR) {
             const int exec_index = cp_arg;
             _PyExecutorObject *exec = cp->co_executors->executors[exec_index];
-            cp_code = exec->vm_data.opcode;
+            cp_code = _PyOpcode_Deopt[exec->vm_data.opcode];
             cp_arg = exec->vm_data.oparg;
         }
+        else {
+            cp_code = _Py_GetBaseOpcode(cp, i);
+        }
         assert(cp_code != ENTER_EXECUTOR);
-        cp_code = _PyOpcode_Deopt[cp_code];

         if (co_code != cp_code || co_arg != cp_arg) {
             goto unequal;

bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be put in the comfy chair!

gaogaotiantian

I don't think the current code is "wrong" functionality-wise, as deopt is safe for base instructions: you can deopt multiple times and you'll just get the same base opcode. However, it did deopt the opcode twice, so I moved the deopt into the ENTER_EXECUTOR check. Note that if the optimized opcode is ENTER_EXECUTOR, it will still be ENTER_EXECUTOR after being deopted.

I'm not sure whether ENTER_EXECUTOR will be instrumented by instruction instrumentation (I would guess so). If it is, then we have to deopt the instruction before the ENTER_EXECUTOR check, or the check won't recognize it.
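The "deopt is safe to apply more than once" argument can be sketched with a small lookup table. This is a toy stand-in for a deopt table, with made-up opcode names and values (everything prefixed `TOY_`/`toy_` is hypothetical); it only shows the two properties mentioned above: base opcodes map to themselves, and the executor opcode is unchanged by the table.

```c
#include <assert.h>

/* Hypothetical toy opcodes; names and values are NOT CPython's. */
enum {
    TOY_BINARY_OP = 0,
    TOY_BINARY_OP_ADD_INT = 1,  /* specialized form of TOY_BINARY_OP */
    TOY_ENTER_EXECUTOR = 2,
    TOY_NUM_OPCODES = 3
};

/* Maps every opcode to its base form; base opcodes map to themselves. */
static const unsigned char toy_deopt[TOY_NUM_OPCODES] = {
    [TOY_BINARY_OP] = TOY_BINARY_OP,
    [TOY_BINARY_OP_ADD_INT] = TOY_BINARY_OP,
    [TOY_ENTER_EXECUTOR] = TOY_ENTER_EXECUTOR,
};

/* One table lookup. */
static unsigned char toy_deopt_once(unsigned char op)
{
    assert(op < TOY_NUM_OPCODES);
    return toy_deopt[op];
}

/* Two lookups in a row; the second is redundant but harmless. */
static unsigned char toy_deopt_twice(unsigned char op)
{
    return toy_deopt_once(toy_deopt_once(op));
}
```

Because the table is idempotent, `toy_deopt_twice` always agrees with `toy_deopt_once`; the extra lookup is wasted work rather than a correctness problem.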

vstinner

I dislike this approach: iterating over the bytecode at a specific index, deoptimizing it, and then looking at the next bytecode. The problem is that there are cases where i++ doesn't give you a bytecode but a cache :-( You also have to take care of ENTER_EXECUTOR. The exact implementation of bytecode changes often these days, and it's hard to keep all the functions that consume bytecode correct.

Would it be possible instead to write a function which creates a copy of the deoptimized code in one shot? Something like PyCode_GetOriginalBytecode() which would create a bytes object.

vstinner

> The problem is that there are cases where i++ doesn't give you a bytecode but a cache :-(

I'm thinking of bug gh-107082, which was fixed by commit 233b878. cc @gvanrossum

gaogaotiantian

> I dislike this approach: iterating over the bytecode at a specific index, deoptimizing it, and then looking at the next bytecode. The problem is that there are cases where i++ doesn't give you a bytecode but a cache :-( You also have to take care of ENTER_EXECUTOR. The exact implementation of bytecode changes often these days, and it's hard to keep all the functions that consume bytecode correct.
>
> Would it be possible instead to write a function which creates a copy of the deoptimized code in one shot? Something like PyCode_GetOriginalBytecode() which would create a bytes object.

I'm not saying this is the best way to do it (it was there already to compare the code objects), but the logic here is correct (at least for now). i is specifically taken care of to avoid getting cache instructions in the loop.

A function like PyCode_GetOriginalBytecode() would just put this logic into another abstracted function, right? We would still need to keep track of everything when instructions change. It just moves the problem to another person/piece of code.

vstinner

> A function like PyCode_GetOriginalBytecode() would just put this logic into another abstracted function, right? We would still need to keep track of everything when instructions change. It just moves the problem to another person/piece of code.

Correct, but it's easier to implement when you iterate on a single code object, and this function can be reused in other places, and it can be moved closer to functions which modify bytecode.
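A minimal sketch of the "copy once, then compare" idea, under heavy assumptions: `toy_original_bytecode` is a hypothetical helper (it is not the proposed PyCode_GetOriginalBytecode(), whose design was never specified in this thread), the opcodes are invented, and the caller supplies the output buffer instead of a bytes object being allocated. It walks a single code object, resolves executor slots, and writes the base form of each instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy opcodes; NOT CPython's values. */
enum { TOY_LOAD = 1, TOY_ADD = 2, TOY_ADD_INT = 3, TOY_ENTER_EXECUTOR = 4 };

typedef struct {
    unsigned char opcode;
    unsigned char oparg;
} ToyInstr;

typedef struct {
    const ToyInstr *code;       /* possibly specialized or executor-patched */
    size_t n;                   /* number of instructions */
    const ToyInstr *executors;  /* instruction that each executor replaced */
} ToyCode;

/* Toy deopt: the only specialized opcode here is TOY_ADD_INT. */
static unsigned char toy_deopt(unsigned char op)
{
    return op == TOY_ADD_INT ? TOY_ADD : op;
}

/* Write the base form of every instruction into out, two bytes each.
 * out must have room for 2 * code->n bytes. */
static void toy_original_bytecode(const ToyCode *code, unsigned char *out)
{
    for (size_t i = 0; i < code->n; i++) {
        ToyInstr ins = code->code[i];
        if (ins.opcode == TOY_ENTER_EXECUTOR) {
            /* The oparg indexes the executor table in this toy model. */
            ins = code->executors[ins.oparg];
        }
        ins.opcode = toy_deopt(ins.opcode);
        assert(ins.opcode != TOY_ENTER_EXECUTOR);
        out[2 * i] = ins.opcode;
        out[2 * i + 1] = ins.oparg;
    }
}
```

Two code objects could then be compared with a single `memcmp` over the two output buffers, which keeps the per-instruction decoding logic in one function at the cost of the extra copy discussed in this thread.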

gaogaotiantian

Also I believe this piece of code was originally written to avoid allocating a new piece of memory. A new function returning bytes would contradict that - not saying that's the wrong way to go, just to mention.

vstinner

> Also I believe this piece of code was originally written to avoid allocating a new piece of memory. A new function returning bytes would contradict that - not saying that's the wrong way to go, just to mention.

Honestly, I don't think that code1==code2 operation matters for Python performance. It's usually used in the compiler/parser, not "at runtime". It shouldn't be part of "hot code".

vstinner

By the way, for me, it's surprising that _Py_GetBaseOpcode() can still return ENTER_EXECUTOR and require the caller to fetch the executor's (opcode, oparg). Each caller has to take care of that. Would it be possible to have a function which returns the original base opcode, so that it never returns ENTER_EXECUTOR?

Maybe an "iterator-like" API which also gives an offset to the next "original" instruction (the next instruction which, once decoded, will give the original bytecode)?
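One possible shape for such an "iterator-like" API, sketched with invented names: `ToyIter`, `toy_iter_next`, and `toy_cache_entries` are hypothetical, and the cache sizes are toy values rather than CPython's. The iterator hands back one real opcode per call and advances past the inline cache entries that follow it, so the caller never does `i++` by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy opcodes; NOT CPython's values. */
enum { TOY_CACHE = 0, TOY_LOAD = 1, TOY_LOAD_ATTR = 2 };

/* Number of inline cache entries that follow each opcode (toy values). */
static size_t toy_cache_entries(unsigned char op)
{
    return op == TOY_LOAD_ATTR ? 2 : 0;
}

typedef struct {
    const unsigned char *ops;  /* opcode stream, caches included */
    size_t n;                  /* total number of slots */
    size_t pos;                /* index of the next real instruction */
} ToyIter;

/* Store the next real opcode in *op and skip its cache entries.
 * Returns false once the stream is exhausted. */
static bool toy_iter_next(ToyIter *it, unsigned char *op)
{
    if (it->pos >= it->n) {
        return false;
    }
    *op = it->ops[it->pos];
    it->pos += 1 + toy_cache_entries(*op);
    return true;
}
```

A fuller version would presumably also resolve executor and instrumented opcodes inside `toy_iter_next`, so that callers only ever see base instructions; that part is left out of this sketch.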

gaogaotiantian

> By the way, for me, it's surprising that _Py_GetBaseOpcode() can still return ENTER_EXECUTOR and require the caller to fetch the executor's (opcode, oparg). Each caller has to take care of that. Would it be possible to have a function which returns the original base opcode, so that it never returns ENTER_EXECUTOR?
>
> Maybe an "iterator-like" API which also gives an offset to the next "original" instruction (the next instruction which, once decoded, will give the original bytecode)?

I'm just saying, this code was explicitly converted to its current state, from what you are asking for, in bd2e47c. _PyCode_GetCode is the utility function you described, and it was used before that commit :)

Do you think we should revert that change? I would guess there might be reasons behind it.

corona10

> Honestly, I don't think that code1==code2 operation matters for Python performance. It's usually used in the compiler/parser, not "at runtime". It shouldn't be part of "hot code".

I don't think that copying the original object is beneficial for the scenario you described.
Every time we have to optimize a code object, do we have to copy the original object? I think that comparing code objects is a rarer case than optimization itself.

corona10

The fix itself looks good to me.
The overall policy would be a different issue.
I will leave the rest of the review to @gvanrossum

vstinner

I prefer to abstain from reviewing this PR :-) I didn't follow recent developments about optimization, so I don't have a strong opinion. I just shared my opinion and feedback on the recent code changes and issues that I saw ;-)

vstinner

I confirm that this change fixes the crash in issue #109052.

Without this change, Python does crash with the following command:

$ ./python -m test test_sys_settrace -R 3:3
(...)
...Fatal Python error: Segmentation fault

Current thread 0x00007f9dce557740 (most recent call first):
  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 1912 in trace
  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 2162 in test_jump_backwards_into_while_block
  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 1975 in run_test
  File "/home/vstinner/python/main/Lib/test/test_sys_settrace.py", line 2008 in test
(...)

With this change, the test does pass as expected, and Python doesn't crash.

$ ./python -m test test_sys_settrace -R 3:3
(...)
Result: SUCCESS

brandtbucher

I think this is a good fix for the problem at hand. Zooming out, I agree that we should put all of this "deopting" logic in one place (we're pretty close with _Py_GetBaseOpcode) and actually use it everywhere.

I'll merge in a couple of hours if nobody has any other concerns.

corona10

> I'll merge in a couple of hours if nobody has any other concerns.

Let's just merge!

bedevere-app

GH-112329 is a backport of this pull request to the 3.12 branch.

Use the base opcode when comparing code objects

b18459f

gaogaotiantian requested a review from markshannon as a code owner September 7, 2023 18:46

bedevere-bot added the awaiting review label Sep 7, 2023

bedevere-bot mentioned this pull request Sep 7, 2023

test_sys_settrace -R 3:3 does crash #109052

Closed

📜🤖 Added by blurb_it.

805abf3

gaogaotiantian requested a review from brandtbucher September 7, 2023 18:49

corona10 requested a review from gvanrossum September 8, 2023 05:56

corona10 reviewed Sep 8, 2023

corona10 requested changes Sep 8, 2023

bedevere-bot removed the awaiting review label Sep 8, 2023

bedevere-bot added the awaiting changes label Sep 8, 2023

Do Deopt only once

70472db

gaogaotiantian requested a review from corona10 September 8, 2023 06:56

vstinner reviewed Sep 8, 2023

corona10 approved these changes Sep 8, 2023

bedevere-bot added awaiting merge and removed awaiting changes labels Sep 8, 2023

corona10 mentioned this pull request Sep 8, 2023

test_sys_settrace segfaults on main #109143

Closed

brandtbucher approved these changes Sep 8, 2023

corona10 merged commit 057bc72 into python:main Sep 9, 2023

bedevere-bot removed the awaiting merge label Sep 9, 2023

gaogaotiantian deleted the fix-systrace-replace branch September 9, 2023 17:46

gaogaotiantian mentioned this pull request Nov 11, 2023

Code objects have incorrect hash/equality when code is instrumented for sys.monitoring #111984

Closed

gaogaotiantian added a commit to gaogaotiantian/cpython that referenced this pull request Nov 23, 2023

[3.12] pythongh-109052: Use the base opcode when comparing code objects (pythongh-109107)

0c3bfbd
