Issue 28604: localeconv() doesn't support LC_MONETARY encoding different than LC_CTYPE encoding
Created on 2016-11-03 21:26 by Guillaume Pasquet (Etenil), last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (11)
msg280023 - (view)
Author: Guillaume Pasquet (Etenil) (Guillaume Pasquet (Etenil))
Date: 2016-11-03 21:26
This issue was originally reported on Fedora's Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1391280

Description of problem:
After switching the monetary locale to en_GB, python then raises an exception when calling locale.localeconv()

Version-Release number of selected component (if applicable):
3.5.2-4.fc25

How reproducible:
Every time

Steps to Reproduce:
1. Write a python3 script or open the interactive interpreter with "python3"
2. Enter the following
import locale
locale.setlocale(locale.LC_MONETARY, 'en_GB')
locale.localeconv()
3. Observe that python raises an encoding exception

Actual results:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.5/locale.py", line 110, in localeconv
    d = _localeconv()
UnicodeDecodeError: 'locale' codec can't decode byte 0xa3 in position 0: Invalid or incomplete multibyte or wide character

Expected results:
A dictionary of locale data similar to (for en_US):
{'mon_thousands_sep': ',', 'currency_symbol': '$', 'negative_sign': '-', 'p_sep_by_space': 0, 'frac_digits': 2, 'int_frac_digits': 2, 'decimal_point': '.', 'mon_decimal_point': '.', 'positive_sign': '', 'p_cs_precedes': 1, 'p_sign_posn': 1, 'mon_grouping': [3, 3, 0], 'n_cs_precedes': 1, 'n_sign_posn': 1, 'grouping': [3, 3, 0], 'thousands_sep': ',', 'int_curr_symbol': 'USD ', 'n_sep_by_space': 0}

Note: This was reproduced on Linux Mint 18 (python 3.5.2), and also on Fedora with python 3.4 and python 3.6 (compiled).
msg280028 - (view)
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2016-11-03 22:21
I suspect this issue is similar to issue25812. en_GB has a non-UTF-8 encoding (likely ISO 8859-1). The currency symbol £ is encoded as b'\xa3' in this encoding, but Python tries to decode b'\xa3' with the encoding determined by another locale setting (LC_CTYPE).
msg303419 - (view)
Author: Andreas Schwab (schwab) *
Date: 2017-09-30 19:24
This causes test_float.py to fail with glibc > 2.26.
ERROR: test_float_with_comma (__main__.GeneralFloatCases)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/abuild/rpmbuild/BUILD/Python-3.6.2/Lib/test/support/__init__.py", line 1590, in inner
    return func(*args, **kwds)
  File "Lib/test/test_float.py", line 150, in test_float_with_comma
    if not locale.localeconv()['decimal_point'] == ',':
  File "/home/abuild/rpmbuild/BUILD/Python-3.6.2/Lib/locale.py", line 110, in localeconv
    d = _localeconv()
UnicodeDecodeError: 'locale' codec can't decode byte 0xa0 in position 0: Invalid or incomplete multibyte or wide character
----------------------------------------------------------------------
msg330128 - (view)
Author: STINNER Victor (vstinner) *
Date: 2018-11-20 12:27
Example of the bug:
import locale
# LC_CTYPE: latin1 encoding
locale.setlocale(locale.LC_ALL, "en_GB")
# LC_MONETARY: utf8 encoding
locale.setlocale(locale.LC_MONETARY, "ar_SA.UTF-8")
lc = locale.localeconv()
for attr in (
    "mon_grouping",
    "int_curr_symbol",
    "currency_symbol",
    "mon_decimal_point",
    "mon_thousands_sep",
):
    print(f"{attr}: {lc[attr]!a}")
Python 3.7 output:
mon_grouping: []
int_curr_symbol: 'SAR '
currency_symbol: '\xd8\xb1.\xd8\xb3'
mon_decimal_point: '.'
mon_thousands_sep: ''
Expected output:
mon_grouping: []
int_curr_symbol: 'SAR '
currency_symbol: '\u0631.\u0633'
mon_decimal_point: '.'
mon_thousands_sep: ''
Tested on Fedora 29.
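The difference between the two outputs above can be reproduced without installing any locale, because it comes down to which codec is applied to the raw bytes. The following is only a minimal sketch of that effect, not CPython's implementation: the variable names are illustrative, and the assumption is that the C library hands back the UTF-8 bytes of the ar_SA currency symbol (and the single byte b'\xa3' for en_GB in latin1, as in the original report).

```python
# Bytes assumed to be returned by the C localeconv() for ar_SA.UTF-8.
raw = "\u0631.\u0633".encode("utf-8")

# Decoding with the LC_CTYPE encoding (latin1) gives the Python 3.7 output.
wrong = raw.decode("latin-1")
# Decoding with the LC_MONETARY encoding (utf8) gives the expected output.
right = raw.decode("utf-8")

print(ascii(wrong))
print(ascii(right))

# The reverse mismatch from the original report: a latin1 byte decoded as
# UTF-8 does not produce mojibake, it raises UnicodeDecodeError.
pound = "\u00a3".encode("latin-1")
try:
    pound.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)
```

When LC_CTYPE uses a single-byte encoding such as latin1, every byte decodes to some character, so the bug shows up as a wrong string. When LC_CTYPE is UTF-8, invalid byte sequences are rejected, so the same bug shows up as an exception.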
msg330129 - (view)
Author: STINNER Victor (vstinner) *
Date: 2018-11-20 12:47
See also bpo-33954: float.__format__('n') fails with _PyUnicode_CheckConsistency assertion error for locales with non-ascii thousands separator. It may be nice to fix these two bugs at the same time, since they are related :-)
msg330131 - (view)
Author: STINNER Victor (vstinner) *
Date: 2018-11-20 14:10
I manually tested PR 10606:
LC_ALL= LC_CTYPE=xxx LC_MONETARY=xxx ./python -c 'import locale; locale.setlocale(locale.LC_ALL, ""); print(ascii(locale.localeconv()["currency_symbol"]))'
'\xa3'
Result (bug = result/error without the fix):
* LC_CTYPE=en_GB, LC_MONETARY=ar_SA.UTF-8: currency_symbol='\u0631.\u0633' (bug: '\xd8\xb1.\xd8\xb3')
* LC_CTYPE=en_GB, LC_MONETARY=fr_FR.UTF-8: currency_symbol='\u20ac' (bug: '\xe2\x82\xac')
* LC_CTYPE=en_GB, LC_MONETARY=uk_UA.koi8u: currency_symbol='\u0433\u0440\u043d.' (bug: '\xc7\xd2\xce.')
* LC_CTYPE=fr_FR.UTF-8, LC_MONETARY=uk_UA.koi8u: currency_symbol='\u0433\u0440\u043d.' (bug: UnicodeDecodeError)
Locale encodings:
* en_GB: latin1
* ar_SA.UTF-8: utf8
* fr_FR.UTF-8: utf8
* uk_UA.koi8u: KOI8-U
msg330132 - (view)
Author: STINNER Victor (vstinner) *
Date: 2018-11-20 15:20
New changeset 02e6bf7f2025cddcbde6432f6b6396198ab313f4 by Victor Stinner in branch 'master':
bpo-28604: Fix localeconv() for different LC_MONETARY (GH-10606)
https://github.com/python/cpython/commit/02e6bf7f2025cddcbde6432f6b6396198ab313f4
msg330153 - (view)
Author: STINNER Victor (vstinner) *
Date: 2018-11-20 21:06
New changeset 6eff6b8eecd7a8eccad16419269fa18ec820922e by Victor Stinner in branch '3.7':
bpo-28604: Fix localeconv() for different LC_MONETARY (GH-10606) (GH-10619)
https://github.com/python/cpython/commit/6eff6b8eecd7a8eccad16419269fa18ec820922e
msg330155 - (view)
Author: STINNER Victor (vstinner) *
Date: 2018-11-20 21:36
New changeset df3051b53fd7f2862a4087f5449e811d8421347a by Victor Stinner in branch '3.6':
bpo-28604: Fix localeconv() for different LC_MONETARY (GH-10606) (GH-10619) (GH-10621)
https://github.com/python/cpython/commit/df3051b53fd7f2862a4087f5449e811d8421347a
msg330191 - (view)
Author: STINNER Victor (vstinner) *
Date: 2018-11-21 11:26
It seems like my change introduced a regression: bpo-35290.
msg330609 - (view)
Author: STINNER Victor (vstinner) *
Date: 2018-11-28 16:52
See also bpo-31900: localeconv() should decode numeric fields from LC_NUMERIC encoding, not from LC_CTYPE encoding.
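The idea common to this issue and bpo-31900, that each localeconv() field should be decoded with the encoding of its own locale category rather than with the LC_CTYPE encoding, can be sketched in pure Python. This is only an illustration under stated assumptions: `decode_fields`, `RAW_FIELDS`, and `ENCODINGS` are hypothetical names, not part of the locale module or of the actual fix. The koi8u bytes are the ones from the manual test earlier in the thread; the thousands separator byte is an illustrative value.

```python
# Hypothetical raw fields, as bytes, grouped by the locale category that
# owns them. The currency symbol bytes come from the uk_UA.koi8u test case.
RAW_FIELDS = {
    "LC_MONETARY": {"currency_symbol": b"\xc7\xd2\xce."},
    "LC_NUMERIC": {"thousands_sep": b"\xa0"},  # illustrative value
}

# Hypothetical encoding of each category's locale.
ENCODINGS = {"LC_MONETARY": "koi8_u", "LC_NUMERIC": "latin-1"}


def decode_fields(raw_fields, encodings):
    """Decode every field with the encoding of its own locale category."""
    result = {}
    for category, fields in raw_fields.items():
        encoding = encodings[category]
        for name, value in fields.items():
            result[name] = value.decode(encoding)
    return result


decoded = decode_fields(RAW_FIELDS, ENCODINGS)
for name, value in decoded.items():
    print(f"{name}: {value!a}")
```

Keeping the encoding lookup per category means a mismatch between LC_CTYPE and any other category can no longer produce mojibake or a UnicodeDecodeError for that category's fields.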
History
Date  User  Action  Args
2022-04-11 14:58:39  admin  set  github: 72790
2018-11-28 16:52:24  vstinner  set  messages: + msg330609
2018-11-28 16:51:47  vstinner  set  title: Exception raised by python3.5 when using en_GB locale -> localeconv() doesn't support LC_MONETARY encoding different than LC_CTYPE encoding
2018-11-21 11:26:12  vstinner  set  messages: + msg330191
2018-11-20 21:37:25  vstinner  set  status: open -> closed
    resolution: fixed
    stage: patch review -> resolved
2018-11-20 21:36:19  vstinner  set  messages: + msg330155
2018-11-20 21:08:55  vstinner  set  pull_requests: + pull_request9869
2018-11-20 21:06:25  vstinner  set  messages: + msg330153
2018-11-20 20:14:32  vstinner  set  pull_requests: + pull_request9867
2018-11-20 15:20:28  vstinner  set  messages: + msg330132
2018-11-20 14:10:13  vstinner  set  versions: + Python 3.8, - Python 3.5
2018-11-20 14:10:05  vstinner  set  messages: + msg330131
2018-11-20 12:47:44  vstinner  set  messages: + msg330129
2018-11-20 12:36:21  vstinner  set  keywords: + patch
    stage: patch review
    pull_requests: + pull_request9849
2018-11-20 12:27:56  vstinner  set  messages: + msg330128
2018-10-01 13:53:35  xtreak  set  nosy: + xtreak
2018-09-24 12:30:14  petr.viktorin  set  nosy: + vstinner
2017-09-30 19:24:02  schwab  set  nosy: + schwab
    messages: + msg303419
2016-11-04 10:50:40  cstratak  set  nosy: + cstratak
2016-11-03 22:21:49  serhiy.storchaka  set  nosy: + loewis, serhiy.storchaka, lemburg
    messages: + msg280028
    versions: + Python 3.7, - Python 3.4