I realized that falling back to ASCII instead of UTF-8 is not possible yet because of #8611: if it falls back to ASCII, it is no longer possible to run Python in a non-ASCII directory. I have a patch set fixing #8611, but it's huge and complex. It will not be fixed quickly (if it is ever possible to fix it at all).
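To illustrate the restriction (a simplified sketch only, not a reproduction of #8611 itself): a path containing a non-ASCII character cannot be encoded with a strict ASCII codec, while UTF-8 handles it. The path and the can_encode() helper below are hypothetical and exist only for this example.

```python
# Hypothetical non-ASCII installation directory, used only for illustration.
path = "/home/user/caf\u00e9/python"


def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec (strict mode)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(path, "ascii"))  # False: U+00E9 is outside the ASCII range
print(can_encode(path, "utf-8"))  # True: UTF-8 can encode any code point here
```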
My new patch falls back to UTF-8 instead of ASCII, even though I agree that it would be better to fall back to ASCII. Improving unicode, surrogates & friends is complex, and I prefer to fix bugs step by step. I propose to first ensure that Py_FileSystemDefaultEncoding is always set, and later write a new patch to fall back to ASCII instead of UTF-8.
Patch version 5:
- fall back to UTF-8 instead of ASCII
- set Py_FileSystemDefaultEncoding to NULL in Py_Finalize(): thanks to that, it should be possible to call Py_InitializeEx() (and thus initfsencoding()) twice or more
- initfsencoding() doesn't call _PyCodec_Lookup() when get_codeset() succeeds, because get_codeset() already calls it
- explain that the fatal error is very unlikely