gh-96268: Fix loading invalid UTF-8 #96270
This makes tokenizer.c:valid_utf8 match stringlib/codecs.h:decode_utf8. This also fixes the related test so it will always detect the expected failure and error message.
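For readers who want to see what "matching the decoder" means in practice, here is a minimal Python sketch of a strict UTF-8 well-formedness check. It is not the C code from tokenizer.c or stringlib/codecs.h; the function name `is_strict_utf8` is hypothetical, and the byte ranges follow the well-formed byte sequence table in the Unicode Standard (no overlong forms, no surrogates, nothing above U+10FFFF).

```python
def is_strict_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8.

    Illustrative only: a hypothetical helper, not CPython's implementation.
    """
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        if b0 < 0x80:          # ASCII: single byte
            i += 1
            continue
        # Choose the sequence length and the allowed range of the
        # *second* byte, which is where overlongs, surrogates and
        # out-of-range code points are excluded.
        if 0xC2 <= b0 <= 0xDF:
            length, lo, hi = 2, 0x80, 0xBF
        elif b0 == 0xE0:
            length, lo, hi = 3, 0xA0, 0xBF   # excludes overlong 3-byte forms
        elif b0 == 0xED:
            length, lo, hi = 3, 0x80, 0x9F   # excludes surrogates U+D800..U+DFFF
        elif 0xE1 <= b0 <= 0xEF:
            length, lo, hi = 3, 0x80, 0xBF
        elif b0 == 0xF0:
            length, lo, hi = 4, 0x90, 0xBF   # excludes overlong 4-byte forms
        elif 0xF1 <= b0 <= 0xF3:
            length, lo, hi = 4, 0x80, 0xBF
        elif b0 == 0xF4:
            length, lo, hi = 4, 0x80, 0x8F   # excludes code points above U+10FFFF
        else:
            # 0x80..0xC1 and 0xF5..0xFF can never start a sequence.
            return False
        if i + length > n:                   # truncated sequence
            return False
        if not lo <= data[i + 1] <= hi:
            return False
        for k in range(i + 2, i + length):   # remaining continuation bytes
            if not 0x80 <= data[k] <= 0xBF:
                return False
        i += length
    return True


def decodes_as_utf8(data: bytes) -> bool:
    """Reference check using Python's own strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Comparing `is_strict_utf8` against `decodes_as_utf8` on a handful of byte strings is a quick way to confirm that a validator and a decoder agree, which is the property this PR aims to restore between the tokenizer and the codec.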
gvanrossum
left a comment
Got me nerd-sniped. :-)
@pablogsal: I leave it to you to decide whether this is backported to 3.11. If we don't backport, I'll file a separate PR for 3.11 to make the tests pass on buildbots with
gvanrossum
left a comment
I'll let @pablogsal decide about the 3.11 and 3.10 backports. (Perhaps it would be less risky to backport just the lineno fix?)
🤖 New build scheduled with the buildbot fleet by @gvanrossum for commit f8e9e6e 🤖 If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.
gvanrossum
left a comment
Thanks. I think it's time to merge this.
Thanks @mdboom for the PR, and @gvanrossum for merging it 🌮🎉. I'm working now to backport this PR to: 3.11.
This makes tokenizer.c:valid_utf8 match stringlib/codecs.h:decode_utf8. It also fixes an off-by-one error introduced in 3.10 for the line number when the tokenizer reports bad UTF8. (cherry picked from commit 8bc356a) Co-authored-by: Michael Droettboom <mdboom@gmail.com>