Message 283271 - Python tracker

It seems like encodings.normalize_encoding() currently has no unit test! Before modifying it, I would prefer to see a few unit tests:

* " utf 8 "
* "UtF 8"
* "utf8\xE9"
* etc.
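
A minimal sketch of what such tests might look like, using the inputs listed above. The class and method names are hypothetical. The expected values are assumptions about the pure-Python implementation: the mixed-case input is compared case-insensitively, and the non-ASCII input is only checked loosely because its handling may differ between Python versions.

```python
import unittest
from encodings import normalize_encoding


class NormalizeEncodingTest(unittest.TestCase):
    # Hypothetical test case; names are illustrative only.

    def test_surrounding_and_inner_spaces(self):
        # Leading/trailing spaces are dropped, inner runs become one underscore.
        self.assertEqual(normalize_encoding(" utf 8 "), "utf_8")

    def test_mixed_case(self):
        # Compared case-insensitively: this sketch does not assert
        # whether the function preserves or folds case.
        self.assertEqual(normalize_encoding("UtF 8").lower(), "utf_8")

    def test_non_ascii(self):
        # Loose check only: whether U+00E9 is kept or dropped is
        # version-dependent, so just check the ASCII prefix survives.
        result = normalize_encoding("utf8\xe9")
        self.assertIsInstance(result, str)
        self.assertTrue(result.startswith("utf8"))


if __name__ == "__main__":
    unittest.main()
```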

Since we are talking about an optimization, I would like to see benchmark results before and after the change. I would also like to test Marc-Andre's idea of exposing the C function _Py_normalize_encoding().
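
A rough sketch of how a before/after comparison could be measured with the standard timeit module: run it once on the unpatched build and once on the patched one. The helper name bench() and the sample encoding names are assumptions chosen for illustration, and a dedicated benchmarking tool would give more stable numbers.

```python
import timeit
from encodings import normalize_encoding


def bench(name, number=20_000, repeat=3):
    """Return the best observed time per call, in seconds (hypothetical helper)."""
    timer = timeit.Timer(lambda: normalize_encoding(name))
    # Take the minimum over several runs to reduce noise from the system.
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Sample inputs are illustrative; real measurements should use the
# encoding names that matter for the optimization.
for name in ("utf-8", " utf 8 ", "ISO_8859-1:1987"):
    print(f"{name!r}: {bench(name) * 1e9:.0f} ns per call")
```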

_Py_normalize_encoding() works on a byte string encoded to Latin1. To implement encodings.normalize_encoding() with it, we might rewrite the function to work on Py_UCS4 characters, or have a fast version working on char* and a more generic version for UCS2 and UCS4?