Note directly related to this issue, but untabify.py fails on files that contain non-ascii characters. For example:
$ ./python.exe Tools/scripts/untabify.py Modules/_heapqmodule.c
Traceback (most recent call last):
...
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe7 in position 173: invalid continuation byte
I am not sure what relevant C standard has to say about using non-ascii characters in comments, but the checking tool should not fail with a traceback in such situation.