◐ Shell
reader mode source ↗
Skip to content

gh-127065: Make methodcaller thread-safe and re-entrant#127245

Closed
eendebakpt wants to merge 21 commits into
python:mainfrom
eendebakpt:methodcaller_ft
Closed

gh-127065: Make methodcaller thread-safe and re-entrant#127245
eendebakpt wants to merge 21 commits into
python:mainfrom
eendebakpt:methodcaller_ft

Conversation

@eendebakpt

@eendebakpt eendebakpt commented Nov 25, 2024

Copy link
Copy Markdown
Contributor

The function operator.methodcaller is not thread-safe since the additional of the vectorcall method in ##107201. In the free-threading build the issue is easy to trigger, for the normal build harder.

We could either remove the vectorcall implementation, or adapt the vectorcall implementation to be thread safe (for both the normal and free-threading build). In this PR we make the methodcaller safe by

  • Replacing the lazy initialiation with initialization in the constructor
  • Using a stack allocated space for the vectorcall arguments, and falling back to tp_call for more than 8 arguments

Benchmark results:

call: Mean +- std dev: [no_vectorcall] 353 ns +- 8 ns -> [pr] 149 ns +- 5 ns: 2.37x faster
creation: Mean +- std dev: [no_vectorcall] 317 ns +- 8 ns -> [pr] 439 ns +- 31 ns: 1.39x slower
creation+call: Mean +- std dev: [no_vectorcall] 652 ns +- 17 ns -> [pr] 613 ns +- 14 ns: 1.06x faster
call kwarg: Mean +- std dev: [no_vectorcall] 599 ns +- 88 ns -> [pr] 199 ns +- 6 ns: 3.01x faster
creation kwarg: Mean +- std dev: [no_vectorcall] 451 ns +- 4 ns -> [pr] 653 ns +- 59 ns: 1.45x slower
creation+call kwarg: Mean +- std dev: [no_vectorcall] 1.05 us +- 0.02 us -> [pr] 858 ns +- 9 ns: 1.22x faster

Geometric mean: 1.29x faster

(old machine, no PGO, comparing this PR with the same PR but the vectorcall disabled)

Benchmark script
import pyperf

setup = """
from operator import methodcaller as mc
arr = []
call = mc('sort')
call_kwarg = mc('sort', reverse=True)
"""

runner = pyperf.Runner()
runner.timeit(name="call", stmt="call(arr)", setup=setup)
runner.timeit(name="creation", stmt="call = mc('sort')", setup=setup)
runner.timeit(name="creation+call", stmt="call = mc('sort'); call(arr)", setup=setup)
runner.timeit(name="call kwarg", stmt="call_kwarg(arr)", setup=setup)
runner.timeit(name="creation kwarg", stmt="call = mc('sort', reverse=True)", setup=setup)
runner.timeit(name="creation+call kwarg", stmt="call = mc('sort', reverse=True); call(arr)", setup=setup)

@colesbury colesbury left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

A few comments:

  • I don't think _methodcaller_initialize_vectorcall is thread safe without the GIL. Lazily initializing arguments complicates the code and thread-safety -- I don't think it's worth it.
  • Generally, I think it's better to support some fixed max number of arguments (like 8), and stack allocate the temporary array. (i.e., PyObject **tmp_args[MAX_ARGS];). If the methodcaller needs more arguments than that, just don't set the vectorcall field and let it fall back to the tp_call.

@eendebakpt eendebakpt changed the title Draft: gh-127065: Make methodcaller thread-safe and re-entrant Dec 1, 2024

@colesbury colesbury left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

Overall this looks good to me. A few minor formatting suggestions below.

eendebakpt and others added 4 commits December 6, 2024 21:53
Co-authored-by: Sam Gross <colesbury@gmail.com>
Co-authored-by: Sam Gross <colesbury@gmail.com>
@eendebakpt

Copy link
Copy Markdown
Contributor Author

Closing in favor of #127746

@eendebakpt eendebakpt closed this Dec 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants