Issue 8715: Create PyUnicode_EncodeFSDefault() function
Created on 2010-05-14 16:53 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.
Files
| File name | Uploaded | Description |
|---|---|---|
| pyunicode_encodefsdefault-3.patch | vstinner, 2010-05-14 16:56 | |
| Messages (6) | |||
|---|---|---|---|
| msg105721 | Author: STINNER Victor (vstinner) |
Date: 2010-05-14 16:53 | |
PyUnicode_EncodeFSDefault() is the opposite of PyUnicode_DecodeFSDefault(AndSize)() and is similar to the new function os.fsencode(). As you can see in the patch, it simplifies many functions.
/* Encodes a Unicode object to Py_FileSystemDefaultEncoding with the
"surrogateescape" error handler and returns a bytes object.
If Py_FileSystemDefaultEncoding is not set, fall back to UTF-8.
*/
PyAPI_FUNC(PyObject*) PyUnicode_EncodeFSDefault(
PyObject *unicode
);
The function unifies the behaviour when Py_FileSystemDefaultEncoding is NULL: it uses UTF-8, whereas import uses ASCII. Other functions already fall back to UTF-8: PyUnicode_AsEncodedString() uses PyUnicode_GetDefaultEncoding() (hardcoded to utf8 in Python 3) if encoding is NULL.
The patch also fixes the tkinter module initializer (it now uses the surrogateescape error handler instead of strict).
The patch was first attached to issue #8611.
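For illustration only, here is a rough Python-level sketch of the behaviour described above: encode with the file system encoding and the "surrogateescape" error handler, falling back to UTF-8 when no encoding is set. The helper names `encode_fs_default` and `decode_fs_default` and the optional `encoding` parameter are hypothetical (they are not part of the patch); `sys.getfilesystemencoding()` stands in for Py_FileSystemDefaultEncoding.

```python
import sys


def encode_fs_default(text, encoding=None):
    """Hypothetical Python analogue of PyUnicode_EncodeFSDefault()."""
    if encoding is None:
        # Stand-in for Py_FileSystemDefaultEncoding; fall back to UTF-8 if unset.
        encoding = sys.getfilesystemencoding() or "utf-8"
    return text.encode(encoding, "surrogateescape")


def decode_fs_default(data, encoding=None):
    """Hypothetical analogue of PyUnicode_DecodeFSDefault()."""
    if encoding is None:
        encoding = sys.getfilesystemencoding() or "utf-8"
    return data.decode(encoding, "surrogateescape")


# A file name containing a byte that is not valid UTF-8.
raw = b"caf\xe9"

# Decoding maps the undecodable byte 0xE9 to the lone surrogate U+DCE9.
name = decode_fs_default(raw, "utf-8")

# Encoding with surrogateescape restores the original byte.
roundtrip = encode_fs_default(name, "utf-8")

# The strict handler (the old tkinter behaviour mentioned above) rejects it.
try:
    name.encode("utf-8", "strict")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The explicit `encoding` argument is only there to make the example independent of the platform's actual file system encoding.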
|
|||
| msg105722 | Author: STINNER Victor (vstinner) |
Date: 2010-05-14 16:56 | |
Oops, I attached the wrong version of the patch. Version 3 changes the documentation (Encodes => Encode). |
|||
| msg105779 | Author: STINNER Victor (vstinner) |
Date: 2010-05-15 00:10 | |
Notes for myself:
- "Encodes" and "fallback" in .h documentation => "Encode", "fall back"
- bootstrap failure on Windows: import did use default error handler, it uses surrogateescape error handler, but PyUnicode_EncodeString() doesn't have codec "fast-path" for MBCS+surrogateescape. |
|||
| msg105809 | Author: STINNER Victor (vstinner) |
Date: 2010-05-15 13:17 | |
> bootstrap failure on Windows: import did use default error handler,
> it uses surrogateescape error handler, but PyUnicode_EncodeString()
> doesn't have codec "fast-path" for MBCS+surrogateescape.
I enabled "shortcuts" in PyUnicode_EncodeString() for any error handler (not only the default error handler, strict) in r81192. |
|||
| msg105810 | Author: STINNER Victor (vstinner) |
Date: 2010-05-15 13:23 | |
PyUnicode_AsEncodedString() contains a special path for the file system encoding. I don't think that it is still needed, but I don't know how to check that.
/* During bootstrap, we may need to find the encodings
package, to load the file system encoding, and require the
file system encoding in order to load the encodings
package.
Break out of this dependency by assuming that the path to
the encodings module is ASCII-only. XXX could try wcstombs
instead, if the file system encoding is the locale's
encoding. */
else if (Py_FileSystemDefaultEncoding &&
strcmp(encoding, Py_FileSystemDefaultEncoding) == 0 &&
!PyThreadState_GET()->interp->codecs_initialized)
return PyUnicode_EncodeASCII(PyUnicode_AS_UNICODE(unicode),
PyUnicode_GET_SIZE(unicode),
errors);
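As a reading aid, the special path quoted above can be modelled in Python roughly as follows. This is only a sketch under stated assumptions: `as_encoded_string`, `fs_encoding` and `codecs_initialized` are hypothetical names standing in for PyUnicode_AsEncodedString(), Py_FileSystemDefaultEncoding and the interpreter's codecs_initialized flag, and the real C function has other branches not shown here.

```python
def as_encoded_string(text, encoding, errors="strict", *,
                      fs_encoding="utf-8", codecs_initialized=True):
    """Hypothetical model of the bootstrap special path."""
    # While the codec machinery is not yet initialized, a request for the
    # file system encoding is served by plain ASCII, on the assumption that
    # the path to the encodings package is ASCII-only.
    if (fs_encoding is not None
            and encoding == fs_encoding
            and not codecs_initialized):
        return text.encode("ascii", errors)
    # Otherwise use the requested codec as usual.
    return text.encode(encoding, errors)


# During bootstrap an ASCII-only path is encoded without the codec registry.
early = as_encoded_string("Lib/encodings", "utf-8", codecs_initialized=False)

# After initialization the real codec is used.
late = as_encoded_string("d\xe9j\xe0", "utf-8")

# During bootstrap a non-ASCII path fails under the strict handler.
try:
    as_encoded_string("d\xe9j\xe0", "utf-8", codecs_initialized=False)
    bootstrap_non_ascii_ok = True
except UnicodeEncodeError:
    bootstrap_non_ascii_ok = False
```

The sketch makes the trade-off visible: the ASCII assumption breaks the circular dependency, at the cost of failing for non-ASCII installation paths until the codecs are available.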
|
|||
| msg105819 | Author: STINNER Victor (vstinner) |
Date: 2010-05-15 16:28 | |
Committed as r81194 (py3k), blocked in 3.1 (r81195). |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:01 | admin | set | github: 52961 |
| 2010-05-15 16:28:41 | vstinner | set | status: open -> closed resolution: fixed messages: + msg105819 |
| 2010-05-15 13:23:17 | vstinner | set | messages: + msg105810 |
| 2010-05-15 13:17:10 | vstinner | set | messages: + msg105809 |
| 2010-05-15 12:40:20 | vstinner | link | issue8725 dependencies |
| 2010-05-15 00:10:03 | vstinner | set | messages: + msg105779 |
| 2010-05-14 16:56:39 | vstinner | set | files: - pyunicode_encodefsdefault-2.patch |
| 2010-05-14 16:56:33 | vstinner | set | files: + pyunicode_encodefsdefault-3.patch messages: + msg105722 |
| 2010-05-14 16:53:52 | vstinner | create | |
