◐ Shell
reader mode source ↗
Skip to content

GH-133231: Changes to executor management to support proposed sys._jit module#133287

Merged
markshannon merged 8 commits into
python:mainfrom
faster-cpython:current-executor
May 4, 2025
Merged

GH-133231: Changes to executor management to support proposed sys._jit module#133287
markshannon merged 8 commits into
python:mainfrom
faster-cpython:current-executor

Conversation

@markshannon

@markshannon markshannon commented May 2, 2025

Copy link
Copy Markdown
Member

This PR:

  • Tracks the current executor on the thread-state, not the previous one
  • Sets the current executor to NULL when entering PyEval_EvalDefault and resetting it when leaving.
  • Batch executors for deallocation to avoid having to constantly incref executors; this is an ad-hoc form of deferred reference counting.

Should simplify the implementation of sys._jit.is_enabled() as it is simply tstate->current_executor != NULL and doesn't require additional state to be tracked.

… Batch executors for deallocation to avoid having to constantly incref executors; this is an ad-hoc form of deferred reference counting.
@markshannon

Copy link
Copy Markdown
Member Author

The msvc jit failure might be fixed by #132746

@Fidget-Spinner

Copy link
Copy Markdown
Member

The msvc jit failure might be fixed by #132746

Wow I did not expect that to fix it.

@brandtbucher brandtbucher left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

I think the overall approach is neat, but I think there are a couple of potential issues (and possible cleanups) here.

Long story short, anything that uses the entry_frame for this purpose is going to run into issues with the tail-calling interpreter (whether it's your approach, or mine). It's a hard problem.

@bedevere-app

bedevere-app Bot commented May 3, 2025

Copy link
Copy Markdown

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@brandtbucher

Copy link
Copy Markdown
Member

Idead: if we have a struct struct { PyExecutorObject *executor, struct node *prev} node;, we can stack allocate one of these in every GOTO_TIER_TWO call and restore it afterwards. Then keep the top on the tstate.

Something like this:

diff --git a/Python/ceval_macros.h b/Python/ceval_macros.h
index e1d2673848c..c77c2cafee7 100644
--- a/Python/ceval_macros.h
+++ b/Python/ceval_macros.h
@@ -360,10 +360,13 @@ do {                                                   \
     OPT_STAT_INC(traces_executed);                     \
     _PyExecutorObject *_executor = (EXECUTOR);         \
     jit_func jitted = _executor->jit_code;             \
+    node prev = tstate->current_executor;              \
+    tstate->current_executor = (node){_executor, &prev}; \
     /* Keep the shim frame alive via the executor: */  \
     Py_INCREF(_executor);                              \
     next_instr = jitted(frame, stack_pointer, tstate); \
     Py_DECREF(_executor);                              \
+    tstate->current_executor = prev;                   \
     Py_CLEAR(tstate->previous_executor);               \
     frame = tstate->current_frame;                     \
     stack_pointer = _PyFrame_GetStackPointer(frame);   \
@@ -379,6 +382,9 @@ do { \
     OPT_STAT_INC(traces_executed); \
     next_uop = (EXECUTOR)->trace; \
     assert(next_uop->opcode == _START_EXECUTOR); \
+    /* prev is declared local to the tier two interpreter: */ \
+    prev = tstate->current_executor; \
+    tstate->current_executor = (node){_executor, &prev}; \
     goto enter_tier_two; \
 } while (0)
 #endif
@@ -386,6 +392,7 @@ do { \
 #define GOTO_TIER_ONE(TARGET)                                         \
     do                                                                \
     {                                                                 \
+        tstate->current_executor = prev;                              \
         next_instr = (TARGET);                                        \
         OPT_HIST(trace_uop_execution_counter, trace_run_length_hist); \
         _PyFrame_SetStackPointer(frame, stack_pointer);               \

No entry frame needed.

@markshannon

Copy link
Copy Markdown
Member Author

I don't see how that would work, as macros don't have scope. But its moot anyway as we can use the entry frame.

The entry frame exists in the tail-calling interpreter, it is the entry variable that we can't use. But we don't need to. Anywhere we return, either in INTERPRETER_EXIT or exit_unwind, the frame is the entry frame: frame == &entry.frame.

@markshannon

Copy link
Copy Markdown
Member Author

I have made the requested changes; please review again

@bedevere-app

bedevere-app Bot commented May 3, 2025

Copy link
Copy Markdown

Thanks for making the requested changes!

@brandtbucher: please review the changes made to this pull request.

@bedevere-app bedevere-app Bot requested a review from brandtbucher May 3, 2025 10:53
@brandtbucher

Copy link
Copy Markdown
Member

I don't see how that would work, as macros don't have scope.

I mean, it would just be a local declared in the _PyEval_EvalFrameDefault itself for the tail-calling interpreter, and would become local to whatever bytecode is using the macro on other builds.

But as you say...

But its moot anyway as we can use the entry frame.

The entry frame exists in the tail-calling interpreter, it is the entry variable that we can't use. But we don't need to. Anywhere we return, either in INTERPRETER_EXIT or exit_unwind, the frame is the entry frame: frame == &entry.frame.

Ah, this is what I missed. Makes sense: we're always able to set it when entering because we have entry handy. And we're always able to reset it when exiting because it's just the current frame. Thanks for explaining.

@brandtbucher brandtbucher left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Looks good, thanks. I'm just going to kick off some JIT benchmarks to be safe.

@brandtbucher

Copy link
Copy Markdown
Member

Hide details View details @markshannon markshannon merged commit ac7d5ba into python:main May 4, 2025
71 checks passed
diegorusso added a commit to diegorusso/cpython that referenced this pull request May 4, 2025
* origin/main: (111 commits)
  pythongh-91048: Add filename and line number to external inspection routines (pythonGH-133385)
  pythongh-131178: Add tests for `ast` command-line interface (python#133329)
  Regenerate pcbuild.sln in Visual Studio 2022 (python#133394)
  pythongh-133042: disable HACL* HMAC on Emscripten (python#133064)
  pythongh-133351: Fix remote PDB's multi-line block tab completion (python#133387)
  pythongh-109700: Improve stress tests for interpreter creation (pythonGH-109946)
  pythongh-81793: Skip tests for os.link() to symlink on Android (pythonGH-133388)
  pythongh-126835: Rename `ast_opt.c` to `ast_preprocess.c` and related stuff after moving const folding to the peephole optimizier (python#131830)
  pythongh-91048: Relax test_async_global_awaited_by to fix flakyness (python#133368)
  pythongh-132457: make staticmethod and classmethod generic (python#132460)
  pythongh-132805: annotationlib: Fix handling of non-constant values in FORWARDREF (python#132812)
  pythongh-132426: Add get_annotate_from_class_namespace replacing get_annotate_function (python#132490)
  pythongh-81793: Always call linkat() from os.link(), if available (pythonGH-132517)
  pythongh-122559: Synchronize C and Python implementation of the io module about pickling (pythonGH-122628)
  pythongh-69605: Add PyREPL import autocomplete feature to 'What's New' (python#133358)
  bpo-44172: Keep reference to original window in curses subwindow objects (pythonGH-26226)
  pythonGH-133231: Changes to executor management to support proposed `sys._jit` module (pythonGH-133287)
  pythongh-133363: Fix Cmd completion for lines beginning with `! ` (python#133364)
  pythongh-132983: Introduce `_zstd` bindings module (pythonGH-133027)
  pythonGH-91048: Add utils for printing the call stack for asyncio tasks (python#133284)
  ...
@markshannon markshannon deleted the current-executor branch May 5, 2025 11:53
Pranjal095 pushed a commit to Pranjal095/cpython that referenced this pull request Jul 12, 2025
…sys._jit` module (pythonGH-133287)

* Track the current executor, not the previous one, on the thread-state. 

* Batch executors for deallocation to avoid having to constantly incref executors; this is an ad-hoc form of deferred reference counting.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants