Issue 4747: SyntaxError executing a script containing non-ASCII characters in its name or path
Created on 2008-12-26 00:41 by ggenellina, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | |||
|---|---|---|
| File name | Uploaded | Description |
| unicode_scriptname.patch | amaury.forgeotdarc, 2008-12-31 15:32 | |
| Messages (6) | |||
|---|---|---|---|
| msg78286 | Author: Gabriel Genellina (ggenellina) | Date: 2008-12-26 00:41 | |
Attempting to directly execute a script containing non-ASCII
characters in its name or path raises SyntaxError.
The script contents are mostly irrelevant, except that the script must
contain an encoding declaration (with *any* encoding, real or nonexistent).
Running "python foo.py" works, but invoking it directly as "foo.py"
raises `SyntaxError: None`, or sometimes `SyntaxError: encoding
problem: with BOM` (no BOM is present in the source file, which is a
plain ASCII text file).
C:\TEMP>cd áéíóú
C:\TEMP\áéíóú>type test.py
# -*- coding: ascii -*-
C:\TEMP\áéíóú>C:\Apps\Python30\python.exe test.py
C:\TEMP\áéíóú>test.py
SyntaxError: None
To avoid any doubt, the file has no strange characters:
C:\TEMP\áéíóú>python -c "print(repr(open('test.py','rb').read()))"
'# -*- coding: ascii -*-\r\n'
and .py files are associated with the same interpreter:
C:\TEMP\áéíóú>assoc .py
.py=Python.File
C:\TEMP\áéíóú>ftype Python.File
Python.File="C:\Apps\Python30\python.exe" "%1" %*
The same thing happens if the file name contains any non-ASCII
character (the path may be pure ASCII).
|
|||
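The report above can be approximated with a small script. This is only a sketch: the helper name `run_script_in_non_ascii_dir` is hypothetical, the `print('ok')` line is added so there is something to observe, and the script is launched through `sys.executable` rather than through the Windows file association, so it does not exercise exactly the same launch path as typing "test.py" at the prompt.

```python
import os
import subprocess
import sys
import tempfile


def run_script_in_non_ascii_dir(dirname="áéíóú"):
    """Create <tmp>/<dirname>/test.py and run it with the current interpreter.

    Hypothetical helper for illustration; not part of the original report.
    """
    with tempfile.TemporaryDirectory() as base:
        folder = os.path.join(base, dirname)
        os.mkdir(folder)
        script = os.path.join(folder, "test.py")
        # Plain ASCII file with CRLF line endings and an encoding declaration,
        # as in the report; the print() call is an addition for this sketch.
        with open(script, "w", encoding="ascii", newline="\r\n") as f:
            f.write("# -*- coding: ascii -*-\n")
            f.write("print('ok')\n")
        return subprocess.run(
            [sys.executable, script], capture_output=True, text=True
        )


if __name__ == "__main__":
    result = run_script_in_non_ascii_dir()
    print(result.returncode, result.stdout.strip(), result.stderr.strip())
```

On an interpreter that includes the fix, the return code should be 0 and no `SyntaxError` should appear on stderr.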
| msg78614 | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) | Date: 2008-12-31 15:32 | |
This also happens if there is any kind of syntax error in the file: "SyntaxError: None" is printed without any other hint. The (char*) filename passed to PyRun_AnyFile should be UTF-8 encoded; otherwise the file cannot be re-opened. The attached patch fixes both issues; please review. It removes one occurrence of wcstombs in favor of the PyUnicode machinery. |
|||
| msg78617 | Author: STINNER Victor (vstinner) | Date: 2008-12-31 16:03 | |
I'm unable to reproduce the problem on Linux. I wrote a
script /home/haypo/ééé/ééé.py:
---------------
#!/home/haypo/prog/SVN/py3k/python
# -*- coding: ascii -*-
print("a")
---------------
The script runs fine:
$ ./ééé.py
a
$ /home/haypo/prog/SVN/py3k/python ééé.py
a
Is the problem specific to Windows?
|
|||
| msg78621 | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) | Date: 2008-12-31 16:21 | |
Yes. As usual, the problem occurs when the platform encoding (used by wcstombs) is not utf-8. |
|||
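The encoding mismatch described in the messages above can be illustrated at the byte level. This is a minimal sketch, not the code from the attached patch: it assumes a cp1252 platform encoding as a stand-in for what a wcstombs-style conversion might produce on a Western European Windows system, and `decodes_as_utf8` is a hypothetical helper.

```python
# Path taken from the original report.
path = "C:\\TEMP\\áéíóú\\test.py"

# Assumption: the platform encoding is cp1252, so a locale-based conversion
# yields one byte per accented character.
locale_bytes = path.encode("cp1252")

# What a consumer expecting a UTF-8 encoded char* filename needs instead.
utf8_bytes = path.encode("utf-8")


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if raw is valid UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print("same bytes:", locale_bytes == utf8_bytes)
    print("locale bytes valid UTF-8:", decodes_as_utf8(locale_bytes))
    print("utf-8 bytes valid UTF-8:", decodes_as_utf8(utf8_bytes))
```

For a pure-ASCII path the two encodings produce identical bytes, which is consistent with the problem appearing only when the name or path contains non-ASCII characters.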
| msg78650 | Author: Benjamin Peterson (benjamin.peterson) | Date: 2008-12-31 20:11 | |
Looks good. |
|||
| msg78737 | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) | Date: 2009-01-01 23:07 | |
Fixed in r68143 (py3k) and r68144 (3.0). Thanks for the report! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:43 | admin | set | github: 48997 |
| 2009-01-01 23:07:48 | amaury.forgeotdarc | set | status: open -> closed resolution: fixed messages: + msg78737 |
| 2008-12-31 20:11:49 | benjamin.peterson | set | keywords: - needs review nosy: + benjamin.peterson messages: + msg78650 |
| 2008-12-31 16:21:45 | amaury.forgeotdarc | set | messages: + msg78621 |
| 2008-12-31 16:03:39 | vstinner | set | nosy: + vstinner messages: + msg78617 |
| 2008-12-31 15:32:21 | amaury.forgeotdarc | set | files: + unicode_scriptname.patch keywords: + needs review, patch messages: + msg78614 nosy: + amaury.forgeotdarc stage: patch review |
| 2008-12-26 00:41:46 | ggenellina | create | |
