
gh-103092: Add a mutex to make the random state of rotatingtree concurrent-safe by aisk · Pull Request #115301 · python/cpython

Code

import sys
import threading
import _xxsubinterpreters
import cProfile



code = """
def f():
    import re
    import json
    import pickle
    d = {str(x): {x: x} for x in range(1000)}
    for _ in range(100):
        re.compile("foo|bar")
        json.loads(json.dumps(d))
        pickle.loads(pickle.dumps(d))

"""


def run_single():
    ctx = {}
    exec(code, ctx)
    cProfile.runctx("f()", ctx, {})


def run_multi():
    ts = []
    interps = []

    for _ in range(4):
        interp = _xxsubinterpreters.create(isolated=1)
        interps.append(interp)
        c = code + "import cProfile; cProfile.run('f()')"
        t = threading.Thread(target=_xxsubinterpreters.run_string, args=[int(interp), c])
        t.start()
        ts.append(t)

    for t in ts:
        t.join()

    for interp in interps:
        _xxsubinterpreters.destroy(int(interp))


if len(sys.argv) > 1 and sys.argv[1] == 'multi':
    run_multi()
else:
    run_single()

Single interpreter:

base (b104360):

❯ ./python.exe foo.py
         12726 function calls (12464 primitive calls) in 96.526 seconds

current:

❯ ./python.exe foo.py
         12726 function calls (12464 primitive calls) in 97.027 seconds

Multiple interpreters:

I'm using an Intel MacBook with 4 physical cores. As the code above shows, this benchmark runs 4 isolated interpreters.

base (b104360):

With the module's slot definition

{Py_mod_multiple_interpreters, Py_MOD_MULTIPLE_INTERPRETERS_NOT_SUPPORTED},
//{Py_mod_multiple_interpreters, Py_MOD_PER_INTERPRETER_GIL_SUPPORTED},

modified to enable Py_MOD_PER_INTERPRETER_GIL_SUPPORTED instead.

Although this is not safe, I think it doesn't matter for this microbenchmark.

❯ ./python.exe foo.py multi
         13086 function calls (12807 primitive calls) in 118.085 seconds

All 4 interpreters finished within a fraction of a second of each other (±0.x seconds).

current:

❯ ./python.exe foo.py multi
         13086 function calls (12807 primitive calls) in 115.202 seconds

Summary

On my machine, the execution time before and after the change varies from run to run, sometimes better and sometimes worse. I believe the performance difference introduced by the change falls within measurement noise.
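
To put the numbers above in perspective, here is a minimal sketch that computes the relative change between the base and current timings reported in this comment. It also includes a rough noise check for repeated runs. The helper names (relative_change, within_noise) and the "sum of standard deviations" threshold are my own assumptions for illustration, not part of the PR or of any CPython tooling.

```python
import statistics


def relative_change(base: float, current: float) -> float:
    """Return (current - base) / base; positive means current is slower."""
    return (current - base) / base


def within_noise(base_runs: list[float], current_runs: list[float]) -> bool:
    """Crude heuristic (an assumption, not a rigorous test): treat the
    difference of means as noise if it is no larger than the sum of the
    two sample standard deviations. Each list needs at least 2 timings."""
    diff = abs(statistics.mean(current_runs) - statistics.mean(base_runs))
    spread = statistics.stdev(base_runs) + statistics.stdev(current_runs)
    return diff <= spread


# Timings in seconds, taken from the single runs reported above.
single = relative_change(96.526, 97.027)
multi = relative_change(118.085, 115.202)

print(f"single interpreter: {single:+.2%}")     # +0.52%
print(f"multiple interpreters: {multi:+.2%}")   # -2.44%
```

The two changes have opposite signs and are both small, which is consistent with the difference being noise. Confirming that would require several runs per configuration, which could then be passed to within_noise.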