gh-136702: Deprecate passing non-ascii encoding (str) to `encodings.normalize_encoding` by StanFromIreland · Pull Request #140030 · python/cpython

StanFromIreland

OK. email doesn't call lookup directly, and no tests fail without the changes.

I presume you did this to preserve backward compatibility. Unless I'm missing something, I don't think we should bother to do that. Given a non-ascii charset name, there are two possible outcomes from the current code: the name after sanitizing is not a valid codec name, or it is. If it is valid after sanitizing, there are two cases: the sanitized name results in successful decoding, or it does not. Only the first of these latter two cases (valid name, successful decoding) would be affected by the post-deprecation change.
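To make the three outcomes concrete, here is a rough sketch. `classify_charset` is a hypothetical helper, not anything in the email package, and it assumes that "sanitizing" simply drops the non-ascii characters from the name; the real code path may differ.

```python
import codecs


def classify_charset(name: str, payload: bytes) -> str:
    """Hypothetical helper: classify the outcome for a charset name."""
    # Assumption: sanitizing means dropping every non-ASCII character.
    sanitized = "".join(ch for ch in name if ch.isascii())
    try:
        codec = codecs.lookup(sanitized)
    except LookupError:
        # Outcome 1: the sanitized name is not a valid codec name.
        return "invalid after sanitizing"
    try:
        codec.decode(payload)
    except UnicodeDecodeError:
        # Outcome 2b: valid codec name, but the payload does not decode.
        return "valid, decoding fails"
    # Outcome 2a: the only case the post-deprecation change would affect.
    return "valid, decoding succeeds"
```

For example, `classify_charset("utf-8é", b"caf\xc3\xa9")` lands in the one affected case, because the sanitized name is `utf-8` and the payload is valid UTF-8.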

How often would that case occur in reality? I would guess it would be a vanishingly small number of cases, if it ever occurs at all.

I think it will be better to remove the changes to the email package from this PR. If anyone sees the deprecation warning, maybe they'll open an issue, but I'm betting nobody ever sees it from the email package. The behavior after the deprecation is over is the behavior we want: if the codec name contains non-ascii, it is not a valid codec name, so any non-ascii in the text being decoded using that charset name will ultimately get turned into the 'unknown character' glyph when decoded by the email package.
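As an illustration of that end state only: `decode_with_fallback` below is a hypothetical function, not the email package's code, and the ASCII-with-replacement fallback is an assumption chosen to show what the 'unknown character' glyph (U+FFFD) looks like in the result.

```python
import codecs


def decode_with_fallback(payload: bytes, charset: str) -> str:
    """Hypothetical sketch: non-ASCII charset names are never valid codecs."""
    # Assumed post-deprecation rule: only look up names that are pure ASCII.
    if charset.isascii():
        try:
            return codecs.lookup(charset).decode(payload)[0]
        except (LookupError, UnicodeDecodeError):
            pass
    # Assumed fallback: each non-ASCII byte becomes U+FFFD.
    return payload.decode("ascii", errors="replace")
```

With this sketch, the payload `b"caf\xc3\xa9"` decodes to `café` under `utf-8`, while under `utf-8é` the two non-ascii bytes each become the replacement character.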
