`\r`s introduced after codec decoding cause `SystemError`

Bug report

Bug description:

Found by OSS-Fuzz.

The testcase is:

#coding=U7+AA0''

resulting in:

$ ./python -c "import py_compile; py_compile.compile('testcase')"
Sorry: SystemError: Parser/string_parser.c:286: bad argument to internal function

I gather this is because `+AA0` decodes to `\r` in UTF-7. However, `_PyTokenizer_translate_newlines` is called before the codec decoding, so the `\r` slips by and blows up in the string parser (at least it apologizes 😆). I think it could be fixed by calling `_PyTokenizer_translate_newlines` again after decoding if a `\r` is introduced. WDYT?
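
To illustrate the ordering problem, here is a rough pure-Python sketch. It is not the tokenizer's actual code: `translate_newlines` is a hypothetical stand-in that assumes newline translation means mapping `\r\n` and a lone `\r` to `\n`, and the byte-level `replace` calls only approximate what happens before decoding.

```python
# Testcase bytes from the report. "U7" is an alias for the UTF-7 codec.
source = b"#coding=U7+AA0''"


def translate_newlines(text: str) -> str:
    # Hypothetical stand-in for newline translation (an assumption, not the
    # C implementation): CRLF and lone CR both become LF.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Current order: newlines are translated first, on the raw bytes.
# The bytes contain no CR yet, so this step changes nothing.
normalized_bytes = source.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

# Decoding happens second, and "+AA0" turns into a CR at this point.
decoded = normalized_bytes.decode("U7")

# Suggested order: translate newlines again once decoding is done.
fixed = translate_newlines(decoded)

print(repr(decoded))
print(repr(fixed))
```

In this sketch the `\r` only exists after the decode step, which is why translating before decoding cannot catch it, while a second pass after decoding does.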

cc @lysnikolaou @pablogsal

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Linked PRs