`Monitor._thread_lock` leaked into forked children causing deadlocks
How do you use Sentry?
Sentry SaaS (sentry.io)
Version
2.57.0
Steps to Reproduce
When `os.fork()` is called while a thread in the parent is inside `with self._thread_lock:`, the child inherits the lock in its locked state. The thread that originally acquired it does not exist in the child, so the lock can never be released, and every subsequent call to `Monitor._ensure_running` in the child blocks forever.
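The mechanism can be shown without Sentry or Celery. The sketch below is a minimal illustration, assuming a POSIX platform where `os.fork()` is available; `child_can_acquire_after_fork` and its helpers are hypothetical names, not part of sentry-sdk. A helper thread holds a plain `threading.Lock` while the main thread forks, and the child then tries to take the lock with a timeout.

```python
import os
import threading


def child_can_acquire_after_fork(timeout: float = 1.0) -> bool:
    """Fork while another thread holds a lock; report whether the child can take it."""
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def holder() -> None:
        with lock:
            holding.set()    # signal that the lock is now held
            release.wait()   # keep holding until the parent is done forking

    t = threading.Thread(target=holder)
    t.start()
    holding.wait()

    pid = os.fork()
    if pid == 0:
        # Child: the holder thread was not copied, but the lock state was.
        acquired = lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)

    # Parent: collect the child's verdict, then let the holder thread finish.
    _, status = os.waitpid(pid, 0)
    release.set()
    t.join()
    return os.waitstatus_to_exitcode(status) == 0
```

Calling `child_can_acquire_after_fork()` should return `False`: the child times out on a lock that no thread in its process can ever release. Without the timeout, the child would block forever, which is what happens in `_ensure_running`.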
We hit this in production with Celery's prefork pool: when a worker dies (WorkerLostError), the master's pool supervisor calls os.fork() to spawn a replacement. If a Sentry code path is mid-_ensure_running on another master thread at that exact moment, the replacement worker is poisoned. Tasks dispatched to it hang and only die at the soft time limit.
The repro below makes this deterministic by using os.register_at_fork to acquire _thread_lock in the forking thread itself.
Save as sentry_monitor_lock.py and run with uv run sentry_monitor_lock.py. Requires Redis at redis://localhost:6379/0.
# /// script
# requires-python = ">=3.13"
# dependencies = [
#     "sentry-sdk==2.57.0",
#     "celery[redis]==5.6.2",
# ]
# ///
"""Repro: sentry_sdk Monitor._thread_lock leaked locked across fork()."""
import logging, os, pathlib, sys, tempfile, threading
from datetime import datetime, timedelta, timezone

# Wrap threading.Lock BEFORE importing sentry_sdk so we can capture
# the Monitor._thread_lock instance at construction time.
real_Lock = threading.Lock
sentry_monitor_locks = []


def tracking_sentry_monitor_lock(*a, **kw):
    lock = real_Lock(*a, **kw)
    if sys._getframe(1).f_code.co_filename.endswith("/sentry_sdk/monitor.py"):
        sentry_monitor_locks.append(lock)
    return lock


threading.Lock = tracking_sentry_monitor_lock

import sentry_sdk
from celery import Celery
from sentry_sdk.integrations.celery import CeleryIntegration

TEMP_DIR = pathlib.Path(tempfile.gettempdir()) / "sentry_monitor_lock"
TEMP_DIR.mkdir(exist_ok=True, parents=True)
KILL_FLAG = TEMP_DIR / "flag"
KILL_FLAG.unlink(missing_ok=True)

sentry_sdk.init(
    dsn="http://public@localhost/1",
    integrations=[CeleryIntegration()],
    traces_sample_rate=1.0,
)
assert sentry_monitor_locks, "Monitor lock not captured during init()"


# Acquire the lock in the forking thread itself; child inherits it
# locked because no after_in_child release is registered.
def before_fork():
    if KILL_FLAG.exists():
        sentry_monitor_locks[0].acquire()


def after_in_parent():
    if KILL_FLAG.exists():
        KILL_FLAG.unlink(missing_ok=True)
        sentry_monitor_locks[0].release()


os.register_at_fork(before=before_fork, after_in_parent=after_in_parent)

app = Celery("sentry_monitor_lock", broker="redis://localhost:6379/0")
app.conf.task_soft_time_limit = 5
app.conf.beat_schedule = {"noop": {"task": "sentry_monitor_lock.noop", "schedule": 5}}


@app.task
def kill_me():
    KILL_FLAG.touch()
    os._exit(9)


@app.task
def noop():
    return "ok"


kill_me.apply_async(eta=datetime.now(timezone.utc) + timedelta(seconds=30))

if __name__ == "__main__":
    app.start([
        "worker",
        "--beat",
        f"--schedule={TEMP_DIR}/schedule.db",
        "--pool=prefork",
        "--concurrency=1",
        "--prefetch-multiplier=1",
        "--loglevel=INFO",
    ])
Timeline when run:
- 0 s: celery starts, beat schedules noop every 5 s, master spawns initial worker. noop tasks succeed.
- 30 s: kill_me runs in the worker, touches KILL_FLAG, then os._exit(9)s.
- ~30 s: master's billiard supervisor sees WorkerLostError and calls os.fork() to spawn a replacement. The before_fork hook fires (gated on KILL_FLAG) and acquires Monitor._thread_lock. Child inherits the lock locked. Parent releases in after_in_parent and recovers.
- 30 s onward: beat keeps dispatching noop to the replacement worker. Every task hangs.
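The at-fork hook trick from the timeline can also be isolated from Celery and Redis. The following is a sketch under stated assumptions: `lock` stands in for `Monitor._thread_lock`, and the `armed` event is a hypothetical in-process replacement for the `KILL_FLAG` file. Only `os.register_at_fork` and `threading` from the standard library are used.

```python
import os
import threading

lock = threading.Lock()      # stand-in for Monitor._thread_lock
armed = threading.Event()    # hypothetical switch playing the role of KILL_FLAG


def before_fork() -> None:
    # Runs in the forking thread just before fork(); the child copies the locked state.
    if armed.is_set():
        lock.acquire()


def after_in_parent() -> None:
    # Runs in the parent only, so the parent recovers while the child stays locked.
    if armed.is_set():
        lock.release()


os.register_at_fork(before=before_fork, after_in_parent=after_in_parent)


def fork_and_probe(timeout: float = 0.5) -> bool:
    """Fork once; return True if the child managed to acquire the lock."""
    pid = os.fork()
    if pid == 0:
        acquired = lock.acquire(timeout=timeout)
        os._exit(0 if acquired else 1)
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status) == 0
```

With `armed` clear, `fork_and_probe()` should return `True`; after `armed.set()`, it should return `False` while the parent's copy of the lock ends up released again. Hooks registered with `os.register_at_fork` cannot be unregistered, which is why the sketch gates them on a flag, as the repro does.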
Expected Result
The replacement worker should process noop tasks normally after the post-kill_me fork. Monitor._thread_lock state in the child must not depend on what threads in the parent were doing at the instant of fork.
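One way to get this property is to replace the lock in every forked child. The sketch below illustrates the idea and is not the SDK's actual implementation or a proposed patch: `ForkSafeMonitor`, `ensure_running`, and `child_survives_locked_fork` are hypothetical names, and the real thread-startup work is elided.

```python
import os
import threading


class ForkSafeMonitor:
    """Hypothetical stand-in for a monitor that guards thread startup with a lock."""

    def __init__(self) -> None:
        self._thread_lock = threading.Lock()
        # Swap in a fresh lock in every forked child so inherited state is discarded.
        os.register_at_fork(after_in_child=self._reset_after_fork)

    def _reset_after_fork(self) -> None:
        self._thread_lock = threading.Lock()

    def ensure_running(self, timeout: float = 1.0) -> bool:
        """Return True if the lock could be taken; the guarded work is elided."""
        if not self._thread_lock.acquire(timeout=timeout):
            return False
        try:
            return True
        finally:
            self._thread_lock.release()


def child_survives_locked_fork(monitor: ForkSafeMonitor) -> bool:
    """Fork while the parent holds the lock; report whether the child still works."""
    with monitor._thread_lock:
        pid = os.fork()
        if pid == 0:
            os._exit(0 if monitor.ensure_running() else 1)
        _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status) == 0
```

Here `child_survives_locked_fork(ForkSafeMonitor())` should return `True` even though the fork happens while the lock is held. One caveat of this design: `os.register_at_fork` keeps a strong reference to the bound method, so each monitor instance stays alive for the life of the process. A weak reference or a single module-level hook would avoid that.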
Actual Result
The replacement worker is permanently wedged. Every noop it picks up hangs and dies on the soft time limit:
[2026-04-27 12:59:05,848: INFO/Beat] Scheduler: Sending due task noop (sentry_monitor_lock.noop)
[2026-04-27 12:59:05,857: INFO/MainProcess] Task sentry_monitor_lock.noop[bf207be7-1240-4bc2-b73e-ded5e6d37534] received
[2026-04-27 12:59:07,096: WARNING/MainProcess] Soft time limit (5s) exceeded for sentry_monitor_lock.noop[fac600de-e6c7-4533-90da-6e28a22b4fd4]
[2026-04-27 12:59:07,102: ERROR/MainProcess] Task handler raised error: SoftTimeLimitExceeded()
billiard.einfo.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/billiard/pool.py", line 362, in workloop
    result = (True, prepare_result(fun(*args, **kwargs)))
                                   ~~~^^^^^^^^^^^^^^^^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/celery/app/trace.py", line 694, in fast_trace_task
    R, I, T, Rstr = tasks[task].__trace__(
                    ~~~~~~~~~~~~~~~~~~~~~^
        uuid, args, kwargs, request,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/utils.py", line 1900, in runner
    return sentry_patched_function(*args, **kwargs)
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/integrations/celery/__init__.py", line 332, in _inner
    with sentry_sdk.start_transaction(
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        transaction,
        ^^^^^^^^^^^^
    ...<8 lines>...
        },
        ^^
    ):
    ^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/api.py", line 408, in start_transaction
    return get_current_scope().start_transaction(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        transaction, instrumenter, custom_sampling_context, **kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/scope.py", line 1134, in start_transaction
    transaction._set_initial_sampling_decision(sampling_context=sampling_context)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/tracing.py", line 1216, in _set_initial_sampling_decision
    self.sample_rate /= 2**client.monitor.downsample_factor
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/monitor.py", line 109, in downsample_factor
    self._ensure_running()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/monitor.py", line 51, in _ensure_running
    with self._thread_lock:
         ^^^^^^^^^^^^^^^^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/billiard/pool.py", line 228, in soft_timeout_sighandler
    raise SoftTimeLimitExceeded()
billiard.exceptions.SoftTimeLimitExceeded: SoftTimeLimitExceeded()
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/billiard/pool.py", line 362, in workloop
    result = (True, prepare_result(fun(*args, **kwargs)))
                                   ~~~^^^^^^^^^^^^^^^^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/celery/app/trace.py", line 694, in fast_trace_task
    R, I, T, Rstr = tasks[task].__trace__(
                    ~~~~~~~~~~~~~~~~~~~~~^
        uuid, args, kwargs, request,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/utils.py", line 1900, in runner
    return sentry_patched_function(*args, **kwargs)
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/integrations/celery/__init__.py", line 332, in _inner
    with sentry_sdk.start_transaction(
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        transaction,
        ^^^^^^^^^^^^
    ...<8 lines>...
        },
        ^^
    ):
    ^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/api.py", line 408, in start_transaction
    return get_current_scope().start_transaction(
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        transaction, instrumenter, custom_sampling_context, **kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/scope.py", line 1134, in start_transaction
    transaction._set_initial_sampling_decision(sampling_context=sampling_context)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/tracing.py", line 1216, in _set_initial_sampling_decision
    self.sample_rate /= 2**client.monitor.downsample_factor
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/monitor.py", line 109, in downsample_factor
    self._ensure_running()
    ~~~~~~~~~~~~~~~~~~~~^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/sentry_sdk/monitor.py", line 51, in _ensure_running
    with self._thread_lock:
         ^^^^^^^^^^^^^^^^^
  File "/Users/lv/.cache/uv/environments-v2/sentry-monitor-lock-647cb2fd2bb3ef20/lib/python3.14/site-packages/billiard/pool.py", line 228, in soft_timeout_sighandler
    raise SoftTimeLimitExceeded()
billiard.exceptions.SoftTimeLimitExceeded: SoftTimeLimitExceeded()