◐ Shell
clean mode source ↗

Message 205670 - Python tracker

I didn't understand Serhiy's "ls" example. I tried:

$ mkdir unicode
$ cd unicode
$ python3 -c 'open("ab\xe9.txt", "w").close()'
$ python3 -c 'open("euro\u20ac.txt", "w").close()'
$ ls
abé.txt  euro€.txt
$ LANG=C ls
ab??.txt  euro???.txt


Ah yes, I didn't remember that "ls" is aware of the locale encoding.

printf() and wprintf() behave differently on unencodable/undecoable characters:
http://unicodebook.readthedocs.org/en/latest/programming_languages.html#printf-functions-family

Again, the issue is not specific to Python. So it's time to learn how to configure correctly your locales.

About the "interoperability" point I mentionned in my first message ("This encoding is the best choice for interopability with other (python2 or non python) programs."): if you work around the annoying ASCII encoding by forcing UTF-8 encoding, Python may produce data which would be incompatible with other applications following POSIX and so using the ASCII encoding.