Issue 6745: (curses) addstr() takes str in Python 3
Created on 2009-08-20 21:54 by Trundle, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| umlaut2x.py | Trundle, 2009-08-20 21:54 | Umlauts working in Python 2.x |
| umlaut3x.py | Trundle, 2009-08-20 21:55 | Umlauts not working in Python 3.x |
| curses_charset.patch | vstinner, 2009-08-27 23:59 | |
| getkey_sample.py | Trundle, 2010-11-17 11:27 | |
| Messages (11) | |||
|---|---|---|---|
| msg91786 | Author: Andreas Stührk (Trundle) | Date: 2009-08-20 21:54 |
In Python 3, curses requires a str for addstr() where I think it should take bytes instead. Otherwise it is impossible to output anything other than ASCII (which is even more or less stated on top of curses' documentation). See the attached script "umlaut2x.py" for Python 2.6: Outputting umlauts works fine, both in single-byte and multi-byte environments. The attached script "umlaut3x.py" is the same script translated to Python 3. Note that the output here always seems to be utf-8, which is plain wrong. A quick test where I changed addstr() to take bytes instead of str confirmed that outputting other characters than ASCII would work then in Python 3, too. There are perhaps more places where the types are wrong. If someone confirms this issue and it is desired, I could provide a patch. |
|||
| msg92019 | Author: STINNER Victor (vstinner) | Date: 2009-08-27 22:31 |
First, make sure that your Python 3 build uses libncursesw and not libncurses, because libncursesw supports Unicode, whereas libncurses doesn't. On UNIX, use the following command to check this:

ldd $(./python -c "import _curses; print(_curses.__file__)")|grep curses

> Note that the output here always seems to be utf-8,
> which is plain wrong.

Yes, addstr() always uses utf8 to convert unicode to bytes. It's wrong if the terminal uses a different charset. But I'm not sure that using bytes is a better idea: since you would like to print characters, unicode is the right type. An idea would be to use a configurable charset, e.g. add a 'charset' attribute to a window (or to the module). |
|||
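To make the mismatch described in msg92019 concrete, here is a minimal sketch of how the same str turns into different bytes depending on the charset. It is an illustration only, not code from the attached scripts: the sample text, the helper `shown_by_terminal`, and the choice of ISO-8859-15 as the non-UTF-8 terminal charset are all assumptions.

```python
# Illustration only: one str, two byte sequences, depending on the charset.
text = "Grüße"

utf8_bytes = text.encode("utf-8")          # what a hardcoded UTF-8 encoder emits
latin9_bytes = text.encode("iso-8859-15")  # assumed charset of a non-UTF-8 terminal


def shown_by_terminal(data: bytes, terminal_encoding: str) -> str:
    """Decode bytes the way a terminal using `terminal_encoding` would read them."""
    return data.decode(terminal_encoding, errors="replace")


print(utf8_bytes)    # b'Gr\xc3\xbc\xc3\x9fe'
print(latin9_bytes)  # b'Gr\xfc\xdfe'

# UTF-8 bytes sent to an ISO-8859-15 terminal no longer read as the original text.
print(shown_by_terminal(utf8_bytes, "iso-8859-15") == text)    # False
print(shown_by_terminal(latin9_bytes, "iso-8859-15") == text)  # True
```

The multi-byte UTF-8 sequences are read as two separate characters each, which is why the umlauts appear garbled when the terminal's charset is not UTF-8.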
| msg92020 | Author: STINNER Victor (vstinner) | Date: 2009-08-27 22:34 |
See also issue #4787 |
|||
| msg92021 | Author: Andreas Stührk (Trundle) | Date: 2009-08-27 22:59 |
Yes, it uses a version of ncurses which supports wide characters; I checked that. I agree that using bytes instead may not be the preferred solution in Python 3. The point is that, currently, it is broken if the user does not use a UTF-8 environment. |
|||
| msg92023 | Author: STINNER Victor (vstinner) | Date: 2009-08-27 23:15 |
I don't really understand, because your example, umlaut3x.py, works correctly on my computer (py3k, Ubuntu Jaunty).

> The point is, currently, it is broken if the user
> does not use an utf-8 environment.

So the problem is that the charset is hardcoded to utf8. You would like to be able to change that. Or better, that Python guesses your terminal charset. Right? |
|||
| msg92024 | Author: Andreas Stührk (Trundle) | Date: 2009-08-27 23:46 |
Of course it works for you. As you stated in issue #4787, your locale is 'fr_FR.UTF-8'. And I don't want Python to guess my terminal's encoding. I want Python to respect my locale. Which is 'de_DE@euro', and not utf-8. |
|||
| msg92025 | Author: STINNER Victor (vstinner) | Date: 2009-08-27 23:59 |
Here is a first patch to add a method setcharset() to the window class. Using my patch, you can fix your example by adding the line:

screen.setcharset(<your charset>)

before addstr(). It's an initial hack to fix the issue. Next steps are:
- use something better than utf8 as the default charset, maybe locale.getpreferredencoding()
- copy the charset on new window creation? |
|||
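The proposed next step of defaulting to the locale's encoding can be sketched in plain Python. This is a hedged illustration, not the contents of curses_charset.patch: the helper names `locale_encoding` and `encode_for_terminal` are hypothetical, and only `locale.setlocale()` and `locale.getpreferredencoding()` are real standard-library calls.

```python
import locale
from typing import Optional


def locale_encoding() -> str:
    """Return the encoding implied by the user's locale settings."""
    try:
        # Adopt the environment's locale (LANG, LC_ALL, ...) instead of "C".
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # Unsupported locale in the environment: keep the current one.
        pass
    return locale.getpreferredencoding(False)


def encode_for_terminal(text: str, encoding: Optional[str] = None) -> bytes:
    """Encode text for output, defaulting to the locale's encoding.

    Characters the charset cannot represent are replaced with "?".
    """
    if encoding is None:
        encoding = locale_encoding()
    return text.encode(encoding, errors="replace")


print(locale_encoding())
print(encode_for_terminal("ü", "iso-8859-15"))  # b'\xfc'
print(encode_for_terminal("ü", "utf-8"))        # b'\xc3\xbc'
```

Passing an explicit encoding plays the role of the per-window charset setting discussed above, while omitting it falls back to whatever the locale reports.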
| msg121318 | Author: Łukasz Langa (lukasz.langa) | Date: 2010-11-16 21:05 |
We'll try to solve this for 3.2. |
|||
| msg121346 | Author: Andreas Stührk (Trundle) | Date: 2010-11-17 11:27 |
Note that getkey() is broken, too. I attached a simple script to demonstrate that. If you run it and enter some non-ASCII input, you can see that getkey() returns a UTF-8 encoded str (in my UTF-8 environment at least; I haven't checked whether it's always UTF-8 or whether it depends on the locale). |
|||
| msg140380 | Author: STINNER Victor (vstinner) | Date: 2011-07-14 23:21 |
I created issue #12567 to fix the Unicode support of the curses module in Python 3. |
|||
| msg162307 | Author: STINNER Victor (vstinner) | Date: 2012-06-04 23:37 |
The issue #12567 fixed this one:
- umlaut3x.py now works in Python 3.3 with an encoding different than UTF-8: Python automatically detects (and uses) the locale encoding
- getkey_sample.py can be patched to handle Unicode correctly using get_wch() instead of getkey() |
|||
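A minimal sketch of the get_wch() approach mentioned in msg162307, assuming Python 3.3 or later with a curses module built against libncursesw. It is not the patched getkey_sample.py, which is not reproduced in this thread; `describe_key`, `_main`, and `run` are hypothetical names chosen for illustration.

```python
def describe_key(key):
    """Classify a get_wch() result.

    window.get_wch() returns a str for ordinary characters and an int
    for function keys, keypad keys, and other special keys.
    """
    if isinstance(key, str):
        return ("character", key)
    return ("special", key)


def _main(stdscr):
    stdscr.addstr(0, 0, "Press a key: ")
    key = stdscr.get_wch()  # one full character, even if it is multi-byte
    kind, value = describe_key(key)
    stdscr.addstr(1, 0, "{}: {!r}".format(kind, value))
    stdscr.addstr(2, 0, "Press any key to exit.")
    stdscr.getch()


def run():
    """Start the demo; call this from an interactive terminal only."""
    import curses
    import locale

    # Adopt the user's locale so curses encodes and decodes with it.
    locale.setlocale(locale.LC_ALL, "")
    curses.wrapper(_main)
```

Calling `run()` in a terminal and typing a non-ASCII character should report it as a single character rather than as a sequence of encoded bytes. The curses import is kept inside `run()` so that `describe_key` can be used and checked without a terminal.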
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:52 | admin | set | github: 50994 |
| 2012-06-04 23:37:38 | vstinner | set | status: open -> closed resolution: fixed messages: + msg162307 |
| 2011-10-28 08:17:35 | petri.lehtinen | set | nosy: + petri.lehtinen |
| 2011-07-14 23:21:51 | vstinner | set | messages: + msg140380 |
| 2010-11-17 11:27:32 | Trundle | set | files: + getkey_sample.py messages: + msg121346 |
| 2010-11-16 21:05:46 | lukasz.langa | set | priority: normal -> high nosy: + lukasz.langa assignee: lukasz.langa |
| 2009-08-27 23:59:58 | vstinner | set | files: + curses_charset.patch keywords: + patch messages: + msg92025 |
| 2009-08-27 23:46:22 | Trundle | set | messages: + msg92024 |
| 2009-08-27 23:15:37 | vstinner | set | messages: + msg92023 |
| 2009-08-27 22:59:26 | Trundle | set | messages: + msg92021 |
| 2009-08-27 22:34:04 | vstinner | set | messages: + msg92020 |
| 2009-08-27 22:31:55 | vstinner | set | nosy: + vstinner messages: + msg92019 |
| 2009-08-20 21:55:45 | Trundle | set | files: + umlaut3x.py |
| 2009-08-20 21:54:52 | Trundle | create | |
