Thanks for the detailed analysis, Phil. I think the results are pretty conclusive: daemon threads are the worst. :) But seriously, thanks.
As you demonstrated, it isn't just Python "daemon" threads that cause the problem. It is essentially any external access of the C-API once runtime finalization has started. The docs [1] aren't super clear about it, but there are some fundamental assumptions we make about runtime finalization:
* no use of the C-API while Py_FinalizeEx() is executing (except for a few helpers like Py_IsInitialized())
* only a small portion of the C-API is available afterward (at least until Py_Initialize() is run)
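To make the first assumption concrete from the Python side, here is a minimal sketch of one way to guarantee a background thread is done before finalization can start: signal it to stop, then join it explicitly. It assumes the thread is one you control. The names stop_requested, worker, and shutdown_worker are made up for illustration and do not come from Phil's files.

```python
import threading

# Hypothetical flag used to ask the worker to stop.
stop_requested = threading.Event()
ticks = []

def worker():
    # Event.wait() returns False on timeout and True once the flag is set,
    # so this loops until shutdown_worker() calls stop_requested.set().
    while not stop_requested.wait(timeout=0.01):
        ticks.append(1)

# Even a daemon thread is safe if it is joined before the interpreter exits.
thread = threading.Thread(target=worker, daemon=True)
thread.start()

def shutdown_worker():
    # Ask the worker to stop, then block until it has actually exited.
    stop_requested.set()
    thread.join()

shutdown_worker()
```

Once shutdown_worker() returns, the thread can no longer touch the runtime. That only helps for threads created and managed from Python, though, not for arbitrary external threads calling into the C-API.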
I guess the real question is what to do about this?
Given that this is essentially a separate problem, let's move further discussion and effort on sorting out problematic threads over to #36476, "Runtime finalization assumes all other threads have exited." @Phil, would you mind attaching those same two files to that issue?
[1] https://docs.python.org/3/c-api/init.html#c.Py_FinalizeEx