gh-99508: fix `TypeError` in `Lib/importlib/_bootstrap_external.py` by sobolevn · Pull Request #99635 · python/cpython
I obviously can't answer 1. but with regard to 2.
As far as I can tell in order to hit this line the test would need to have a subclass of SourceLoader that has an implementation of get_data that will raise an EOFError or ImportError the first time it is called on a .py file but not on subsequent calls (and always succeed on .pyc files). It has to fail only the first time and only on .py files otherwise the method will fail without reaching the line or source_hash will already exist.
I'm not sure such a test would make sense, but I guess that's a separate discussion?
(I was writing my own SourceFileLoader and spent a lot of time staring at this code trying to figure out why this line wasn't failing).
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct to me. I don't think we need a test, even though it'd be nice.
Requesting a review from @benjaminp as the author of #4575.
A test would be good it have. (It seems you can get into this state by having a hash-based pyc where the source exists but not the bytecode cache?)
A test would be good it have. (It seems you can get into this state by having a hash-based pyc where the source exists but not the bytecode cache?)
I don't think so, the only line in the method that changes hash_based from False is L1110 which requires self.get_data(bytecode_path) returns successfully first.
| data = self.get_data(bytecode_path) | |
| except OSError: | |
| pass | |
| else: | |
| exc_details = { | |
| 'name': fullname, | |
| 'path': bytecode_path, | |
| } | |
| try: | |
| flags = _classify_pyc(data, fullname, exc_details) | |
| bytes_data = memoryview(data)[16:] | |
| hash_based = flags & 0b1 != 0 |
I think the test case is way too complex for such an obvious change:
- First we need to reach
hash_based = flags & 0b1 != 0, it depends ondata = self.get_data(bytecode_path)andflags = _classify_pyc(data, fullname, exc_details)been successful. So, we need either to mock it or to write (and then delete) some real bytecode. sys.dont_write_bytecodeshould beFalseandbytecode_pathshould be notNone. It depends on global interpreter state andbytecode_path = cache_from_source(source_path)been successfulsource_mtimeshould be notNone, it is set assource_mtime = int(st['mtime'])and depends onst = self.path_stats(source_path)- Finally this should be
True:if source_hash is None. It can only be achieved by skipping this condition:
if (_imp.check_hash_based_pycs != 'never' and (check_source or _imp.check_hash_based_pycs == 'always')):
I don't really think that such brittle unit tests will be worth it 😞
I think the test case is way too complex for such an obvious change:
...
4. Finally this should beTrue:if source_hash is None. It can only be achieved by skipping this condition:if (_imp.check_hash_based_pycs != 'never' and (check_source or _imp.check_hash_based_pycs == 'always')):I don't really think that such brittle unit tests will be worth it 😞
I agree the test case is too complicated but it's actually worse. If this condition is skipped the function returns early in the else block of the try/except so you still can't reach this line.
| else: | |
| _bootstrap._verbose_message('{} matches {}', bytecode_path, | |
| source_path) | |
| return _compile_bytecode(bytes_data, name=fullname, | |
| bytecode_path=bytecode_path, | |
| source_path=source_path) |
The only way for hash_based to be True and source_hash to be None and to reach the line later is if an ImportError or EOFError is raised (and so caught in the except block) after hash_based is set but before source_hash is set. I'm not sure if the 'real' methods have any chance of doing this.