bpo-28749: Fixed the documentation of the mapping codec APIs. (#487) by serhiy-storchaka · Pull Request #714 · python/cpython
Decoding mappings must map single string characters to single Unicode characters, integers (which are then interpreted as Unicode ordinals) or ``None`` (meaning "undefined mapping" and causing an error).
Encoding mappings must map single Unicode characters to single string characters, integers (which are then interpreted as Latin-1 ordinals) or ``None`` (meaning "undefined mapping" and causing an error).
The mapping objects provided must only support the __getitem__ mapping interface.
If a character lookup fails with a LookupError, the character is copied as-is meaning that its ordinal value will be interpreted as Unicode or Latin-1 ordinal resp. Because of this, mappings only need to contain those mappings which map characters to different code points. decode characters. The mapping objects provided must support the :meth:`__getitem__` mapping interface; dictionaries and sequences work well.
These are the mapping codec APIs:
.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, \ .. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *data, Py_ssize_t size, \ PyObject *mapping, const char *errors)
Create a Unicode object by decoding *size* bytes of the encoded string *s* using the given *mapping* object. Return *NULL* if an exception was raised by the codec. If *mapping* is *NULL* latin-1 decoding will be done. Else it can be a dictionary mapping byte or a unicode string, which is treated as a lookup table. Byte values greater that the length of the string and U+FFFE "characters" are treated as "undefined mapping". Create a Unicode object by decoding *size* bytes of the encoded string *s* using the given *mapping* object. Return *NULL* if an exception was raised by the codec.
If *mapping* is *NULL*, Latin-1 decoding will be applied. Else *mapping* must map bytes ordinals (integers in the range from 0 to 255) to Unicode strings, integers (which are then interpreted as Unicode ordinals) or ``None``. Unmapped data bytes -- ones which cause a :exc:`LookupError`, as well as ones which get mapped to ``None``, ``0xFFFE`` or ``'\ufffe'``, are treated as undefined mappings and cause an error.
.. c:function:: PyObject* PyUnicode_AsCharmapString(PyObject *unicode, PyObject *mapping)
Encode a Unicode object using the given *mapping* object and return the result as Python string object. Error handling is "strict". Return *NULL* if an Encode a Unicode object using the given *mapping* object and return the result as a bytes object. Error handling is "strict". Return *NULL* if an exception was raised by the codec.
The following codec API is special in that maps Unicode to Unicode.
The *mapping* object must map Unicode ordinal integers to bytes objects, integers in the range from 0 to 255 or ``None``. Unmapped character ordinals (ones which cause a :exc:`LookupError`) as well as mapped to ``None`` are treated as "undefined mapping" and cause an error.
.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, \ PyObject *table, const char *errors)
Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a character mapping *table* to it and return the resulting Unicode object. Return *NULL* when an exception was raised by the codec.
The *mapping* table must map Unicode ordinal integers to Unicode ordinal integers or ``None`` (causing deletion of the character). .. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, \ PyObject *mapping, const char *errors)
Mapping tables need only provide the :meth:`__getitem__` interface; dictionaries and sequences work well. Unmapped character ordinals (ones which cause a :exc:`LookupError`) are left untouched and are copied as-is. Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given *mapping* object and return the result as a bytes object. Return *NULL* if an exception was raised by the codec.
.. deprecated-removed:: 3.3 4.0 Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using :c:func:`PyUnicode_Translate`. or :ref:`generic codec based API <codec-registry>` :c:func:`PyUnicode_AsCharmapString` or :c:func:`PyUnicode_AsEncodedString`.
.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, \ The following codec API is special in that maps Unicode to Unicode.
.. c:function:: PyObject* PyUnicode_Translate(PyObject *unicode, \ PyObject *mapping, const char *errors)
Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given *mapping* object and return a Python string object. Return *NULL* if an exception was raised by the codec. Translate a Unicode object using the given *mapping* object and return the resulting Unicode object. Return *NULL* if an exception was raised by the codec.
The *mapping* object must map Unicode ordinal integers to Unicode strings, integers (which are then interpreted as Unicode ordinals) or ``None`` (causing deletion of the character). Unmapped character ordinals (ones which cause a :exc:`LookupError`) are left untouched and are copied as-is.
.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, \ PyObject *mapping, const char *errors)
Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a character *mapping* table to it and return the resulting Unicode object. Return *NULL* when an exception was raised by the codec.
.. deprecated-removed:: 3.3 4.0 Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using :c:func:`PyUnicode_AsCharmapString` or :c:func:`PyUnicode_AsEncodedString`. :c:func:`PyUnicode_Translate`. or :ref:`generic codec based API <codec-registry>`
MBCS codecs for Windows