bpo-37751: Fix normalizestring() with hyphens and spaces converted to underscores#15092
vstinner merged 9 commits into
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Our records indicate we have not received your CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day. You can check yourself to see if the CLA has been received. Thanks again for your contribution, we look forward to reviewing it!
shihai1991
left a comment
All the checks have passed, so LGTM.
Would it be possible to reuse _Py_normalize_encoding() in codecs.c normalizestring()?
I went through the two functions and found that they do much the same thing. I also tried modifying the code, and the test cases passed. The code can be changed in the same way that check_force_ascii() calls _Py_normalize_encoding(). I think it would be better to expose _Py_normalize_encoding() as an external function that other modules can call. There is also a process question: I am not sure whether I should open a separate issue to discuss this modification, or whether I can make the change directly under this issue.
I will do more testing on reusing _Py_normalize_encoding() in codecs.c normalizestring(). Thank you for your helpful suggestions.
You can create a new PR; this one can be closed when the new one is merged.
Let me try it.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase
I think using a regular exception is better. Thanks.
Co-Authored-By: Victor Stinner <vstinner@redhat.com>
Can you please try to add a NEWS entry using the blurb tool? (Install it using: python3 -m pip install --user blurb)
…into fix-issue-37751
Thank you for your careful guidance. I need to take a moment to familiarize myself with the blurb tool.
I have made the requested changes; please review again.
Thanks for making the requested changes! @vstinner: please review the changes made to this pull request.
vstinner
left a comment
LGTM. Thanks for the update.
Fix codecs.lookup() to normalize the encoding name the same way as encodings.normalize_encoding(), except that codecs.lookup() also converts the name to lower case.
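For readers who want to see what this normalization amounts to, here is a rough pure-Python sketch of the behavior the commit message describes. It is not the C code from this PR. The function name normalize_sketch is hypothetical, and the treatment of non-ASCII characters as separators is an assumption made for the example.

```python
def normalize_sketch(name: str) -> str:
    """Approximate the normalization described above (illustrative only).

    Runs of characters that are not ASCII alphanumerics or '.' collapse
    into a single underscore, leading and trailing separators are dropped,
    and the result is lower-cased.
    """
    chars = []
    pending_sep = False  # True once a separator run has been seen
    for ch in name:
        if ch.isascii() and (ch.isalnum() or ch == "."):
            # Emit one underscore for the whole separator run, but never
            # at the very start of the name.
            if pending_sep and chars:
                chars.append("_")
            chars.append(ch.lower())
            pending_sep = False
        else:
            # Assumption: anything else, including non-ASCII, is a separator.
            pending_sep = True
    return "".join(chars)


print(normalize_sketch("UTF 8"))
print(normalize_sketch("  Latin-1 "))
```

Under this sketch, hyphens and spaces both end up as underscores, which is the behavior named in the PR title.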
* Fix running with Python 3.9
  Since Python 3.9 [1], codecs names are normalized in a different way.
  [1] python/cpython#15092
* Add: github action, bump dependencies
Co-authored-by: eight04 <eight04@gmail.com>
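A downstream project affected by this kind of change can check which name its search function actually receives on a given interpreter. The sketch below registers a search function that only records its argument and then declines. The codec name "Hypothetical Codec-X" and the function recording_search are made up for illustration, and the recorded string is expected to differ between Python versions.

```python
import codecs

seen = []  # names exactly as the codec machinery passes them in


def recording_search(name):
    """Record the normalized name and decline to provide a codec."""
    seen.append(name)
    return None


codecs.register(recording_search)
try:
    try:
        codecs.lookup("Hypothetical Codec-X")
    except LookupError:
        pass  # expected: no search function knows this codec
finally:
    # codecs.unregister() only exists on Python 3.10 and later.
    if hasattr(codecs, "unregister"):
        codecs.unregister(recording_search)

print(seen)
```

A provider that has to work across versions could compare names after mapping hyphens to underscores itself, rather than relying on one particular form.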
The codecs lookup function now performs only minimal normalization of the encoding name before passing it to the search functions: all ASCII letters are converted to lower case, spaces are replaced with hyphens. Excessive normalization broke third-party codecs providers, like python-iconv. Revert "bpo-37751: Fix codecs.lookup() normalization (GH-15092)" This reverts commit 20f59fe.
…H-137167) The codecs lookup function now performs only minimal normalization of the encoding name before passing it to the search functions: all ASCII letters are converted to lower case, spaces are replaced with hyphens. Excessive normalization broke third-party codecs providers, like python-iconv. Revert "bpo-37751: Fix codecs.lookup() normalization (pythonGH-15092)" This reverts commit 20f59fe.
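The minimal normalization described in the revert message can be sketched in a few lines of Python. This is an illustration of the stated rule only (lower-case ASCII letters, spaces to hyphens), not the C implementation; the name minimal_normalize is hypothetical.

```python
def minimal_normalize(name: str) -> str:
    """Apply only the two rules named in the revert message."""
    out = []
    for ch in name:
        if ch == " ":
            out.append("-")          # spaces become hyphens
        elif "A" <= ch <= "Z":
            out.append(ch.lower())   # only ASCII letters are lower-cased
        else:
            out.append(ch)           # everything else passes through
    return "".join(out)


print(minimal_normalize("UTF 8"))
print(minimal_normalize("ISO-8859-1"))
```

Unlike the stricter normalization, hyphens and underscores in the input are left as they are, so a search function that expects hyphenated names still sees them.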
https://bugs.python.org/issue37751