I started poking at the patch a little and have a few comments.
My understanding of the issue comments is that the read error actually happens when reading in the *source* file and *not* the bytecode file. This happens because 'ferror' is not checked after the read returns EOF, so a failed read is treated as if the source file were simply empty. I can understand how creating a reproducible test case for this error path would be very difficult.
So, checking for errors with 'ferror' definitely seems reasonable, but why do it in the tokenizer code? I already see several places in 'fileobject.c' that do similar checks. For example, in 'get_line' I see:
    while ( buf != end && (c = GETC(fp)) != EOF ) {
        ...
    }
    if (c == EOF) {
        if (ferror(fp) && errno == EINTR) {
            ...
        }
    }
As such, wouldn't handling this error case directly in 'Py_UniversalNewlineFgets' similar to the above code be more appropriate?
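
To make the suggestion concrete, here is a rough standalone sketch of the kind of check I have in mind. It is not the actual 'Py_UniversalNewlineFgets' code: the function name 'read_line_checked' and the 'read_status' enum are made up for illustration, and newline translation is left out. The only point is that 'ferror' is consulted when 'getc' returns EOF, so the caller can tell a read error apart from a clean end of file.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical status codes for this sketch; these are not CPython names. */
enum read_status { READ_OK = 0, READ_EOF = 1, READ_ERROR = 2 };

/* fgets-like reader that distinguishes a clean EOF from a read error.
   Reads at most n-1 characters into buf, stopping after a newline. */
static enum read_status
read_line_checked(char *buf, int n, FILE *fp)
{
    char *p = buf;
    char *end = buf + n - 1;    /* leave room for the terminating NUL */
    int c = 0;

    assert(buf != NULL && n > 0 && fp != NULL);

    while (p != end && (c = getc(fp)) != EOF) {
        *p++ = (char)c;
        if (c == '\n')
            break;
    }
    *p = '\0';

    if (c == EOF) {
        /* EOF alone is ambiguous: check the stream's error indicator. */
        if (ferror(fp))
            return READ_ERROR;
        /* Clean end of file with nothing read on this call. */
        if (p == buf)
            return READ_EOF;
    }
    return READ_OK;
}
```

With something like this in the line reader itself, the tokenizer would only need to propagate the error status rather than call 'ferror' on its own.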