Issue 6745: (curses) addstr() takes str in Python 3

Created on 2009-08-20 21:54 by Trundle, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name	Uploaded	Description
umlaut2x.py	Trundle, 2009-08-20 21:54	Umlauts working in Python 2.x
umlaut3x.py	Trundle, 2009-08-20 21:55	Umlauts not working in Python 3.x
curses_charset.patch	vstinner, 2009-08-27 23:59
getkey_sample.py	Trundle, 2010-11-17 11:27

Messages (11)
msg91786	Author: Andreas Stührk (Trundle) *	Date: 2009-08-20 21:54
In Python 3, curses requires a str for addstr(), where I think it should take bytes instead. Otherwise it is impossible to output anything other than ASCII (which is more or less stated at the top of the curses documentation).
See the attached script "umlaut2x.py" for Python 2.6: outputting umlauts works fine, in both single-byte and multi-byte environments. The attached script "umlaut3x.py" is the same script translated to Python 3. Note that the output here always seems to be UTF-8, which is plain wrong.
A quick test in which I changed addstr() to take bytes instead of str confirmed that outputting non-ASCII characters would then work in Python 3, too. There are perhaps more places where the types are wrong. If someone confirms this issue and a fix is desired, I could provide a patch.
msg92019	Author: STINNER Victor (vstinner) *	Date: 2009-08-27 22:31
First, make sure that your Python3 build uses libncursesw and not libncurses, because libncursesw supports Unicode, whereas libncurses doesn't... On UNIX, use the following command to check this:
ldd $(./python -c "import _curses; print(_curses.__file__)") | grep curses
> Note that the output here always seems to be utf-8,
> which is plain wrong.
Yes, addstr() always uses utf8 to convert unicode to bytes. It's wrong if the terminal uses a different charset. But I'm not sure that using bytes is a better idea: since you would like to print characters, unicode is the right type.
An idea would be to use a configurable charset. E.g. add a 'charset' attribute to a window (or to the module).
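A minimal sketch of the mismatch described above, assuming a terminal that expects a single-byte charset such as ISO-8859-15. The sample string and the choice of charset are illustrative assumptions, not taken from the attached scripts:

```python
# Illustrative only: the sample text and the ISO-8859-15 terminal are assumptions.
text = "Grüße"

utf8_bytes = text.encode("utf-8")         # what a hardcoded UTF-8 conversion sends
latin9_bytes = text.encode("iso8859-15")  # what an ISO-8859-15 terminal expects

# Interpret the UTF-8 bytes the way an ISO-8859-15 terminal would.
misread = utf8_bytes.decode("iso8859-15")

print(utf8_bytes)
print(latin9_bytes)
print(misread == text)  # the terminal does not see the intended text
```

Each umlaut becomes two bytes in UTF-8 but only one in ISO-8859-15, so such a terminal renders two unrelated characters per umlaut.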
msg92020	Author: STINNER Victor (vstinner) *	Date: 2009-08-27 22:34
See also issue #4787
msg92021	Author: Andreas Stührk (Trundle) *	Date: 2009-08-27 22:59
Yes, it uses a version of ncurses which supports wide characters, I checked that. I agree that using bytes instead may not be the preferred solution in Python 3. The point is that it is currently broken if the user does not use a UTF-8 environment.
msg92023	Author: STINNER Victor (vstinner) *	Date: 2009-08-27 23:15
I don't really understand, because your example, umlaut3x.py, works correctly on my computer (py3k, Ubuntu Jaunty).
> The point is, currently, it is broken if the user
> does not use an utf-8 environment.
So the problem is that the charset is hardcoded to utf8. You would like to be able to change that. Or better, you would like Python to guess your terminal charset. Right?
msg92024	Author: Andreas Stührk (Trundle) *	Date: 2009-08-27 23:46
Of course it works for you. As you stated in issue #4787, your locale is 'fr_FR.UTF-8'. And I don't want Python to guess my terminal's encoding. I want Python to respect my locale. Which is 'de_DE@euro', and not utf-8.
msg92025	Author: STINNER Victor (vstinner) *	Date: 2009-08-27 23:59
Here is a first patch to add a method setcharset() to the window class. Using my patch, you can fix your example by adding the line:
screen.setcharset(<your charset>)
before addstr().
It's an initial hack to fix the issue. Next steps are:
- use something better than utf8 as the default charset, maybe locale.getpreferredencoding()
- copy the charset on new window creation?
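The idea of defaulting to the locale's encoding, with an optional override, can be sketched in pure Python. The helper `encode_for_terminal` is a hypothetical name used only for illustration. It is not the code in curses_charset.patch, and it does not show how the curses module works internally:

```python
import locale


def encode_for_terminal(text, charset=None):
    """Encode text for output, defaulting to the locale's preferred encoding.

    Hypothetical helper: `charset` plays the role of the configurable
    charset proposed above; None means "ask the locale".
    """
    if charset is None:
        # False: query the current setting without calling setlocale().
        charset = locale.getpreferredencoding(False)
    # Characters the charset cannot represent are replaced with "?".
    return text.encode(charset, errors="replace")


print(encode_for_terminal("ä", "iso8859-15"))
print(encode_for_terminal("ä", "utf-8"))
```

With an explicit charset the result is deterministic. With `charset=None` it depends on the environment the interpreter runs in.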
msg121318	Author: Łukasz Langa (lukasz.langa) *	Date: 2010-11-16 21:05
We'll try to solve this for 3.2.
msg121346	Author: Andreas Stührk (Trundle) *	Date: 2010-11-17 11:27
Note that getkey() is broken, too. I attached a simple script to demonstrate that. If you run it and enter some non-ASCII input, you can see that getkey() returns a UTF-8 encoded str (in my UTF-8 environment at least; I haven't checked whether it's always UTF-8 or whether it depends on the locale).
msg140380	Author: STINNER Victor (vstinner) *	Date: 2011-07-14 23:21
I created issue #12567 to fix the Unicode support of the curses module in Python 3.
msg162307	Author: STINNER Victor (vstinner) *	Date: 2012-06-04 23:37
The issue #12567 fixed this one:
- umlaut3x.py now works in Python 3.3 with an encoding other than UTF-8: Python automatically detects (and uses) the locale encoding
- getkey_sample.py can be patched to handle Unicode correctly using get_wch() instead of getkey()
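A rough sketch of reading a key with get_wch(), assuming Python 3.3 or later. It does not reproduce getkey_sample.py; `describe_key`, `main`, and `run` are illustrative names. get_wch() returns a str for ordinary characters and an int for function keys and other special keys:

```python
import curses


def describe_key(window):
    """Read one key with get_wch() and tag it as a character or a special key."""
    key = window.get_wch()
    if isinstance(key, str):
        return "char", key      # a full Unicode character, e.g. "ü"
    return "special", key       # an int key code, e.g. curses.KEY_UP


def main(stdscr):
    stdscr.addstr("Press a key: ")
    kind, key = describe_key(stdscr)
    stdscr.addstr("\n{}: {!r}".format(kind, key))
    stdscr.getch()              # wait so the result stays visible


def run():
    # Needs a real terminal; wrapper() restores the terminal state on exit.
    curses.wrapper(main)
```

Calling `run()` from a terminal starts the demo. `describe_key` only relies on the window's get_wch() method, so it can be exercised with a stand-in object as well.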

History
Date	User	Action	Args
2022-04-11 14:56:52	admin	set	github: 50994
2012-06-04 23:37:38	vstinner	set	status: open -> closed resolution: fixed messages: + msg162307
2011-10-28 08:17:35	petri.lehtinen	set	nosy: + petri.lehtinen
2011-07-14 23:21:51	vstinner	set	messages: + msg140380
2010-11-17 11:27:32	Trundle	set	files: + getkey_sample.py messages: + msg121346
2010-11-16 21:05:46	lukasz.langa	set	priority: normal -> high nosy: + lukasz.langa versions: + Python 3.2, - Python 3.1 messages: + msg121318 assignee: lukasz.langa
2009-08-27 23:59:58	vstinner	set	files: + curses_charset.patch keywords: + patch messages: + msg92025
2009-08-27 23:46:22	Trundle	set	messages: + msg92024
2009-08-27 23:15:37	vstinner	set	messages: + msg92023
2009-08-27 22:59:26	Trundle	set	messages: + msg92021
2009-08-27 22:34:04	vstinner	set	messages: + msg92020
2009-08-27 22:31:55	vstinner	set	nosy: + vstinner messages: + msg92019
2009-08-20 21:55:45	Trundle	set	files: + umlaut3x.py
2009-08-20 21:54:52	Trundle	create