Message 187608 - Python tracker
Thanks for the patch, Thomas! Starting from your work, I made an updated patch that fixes the bug. The tests also revealed another possible issue: for invalid character references, HTMLParser still calls handle_entityref instead of reporting them as 'data'. I'm not sure which behavior is preferable, but in any case this is a separate issue.
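For anyone who wants to see the handle_entityref behavior, here is a rough sketch rather than the actual test from the patch. It assumes a current Python 3, where the convert_charrefs argument exists (it was added in 3.4, after this message) and has to be switched off for handle_entityref to be called at all. The EventCollector class, the collect_events helper and the "&bogus;" input are made up for illustration, and an unknown named reference is only my guess at one kind of "invalid" reference.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records which handler is called for each piece of input."""

    def __init__(self):
        # convert_charrefs=False keeps the handle_entityref/handle_charref
        # callbacks active instead of folding references into handle_data.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        self.events.append(("charref", name))


def collect_events(text):
    """Feed text to the parser and return the recorded (handler, value) pairs."""
    parser = EventCollector()
    parser.feed(text)
    parser.close()
    return parser.events


if __name__ == "__main__":
    # "&bogus;" is not a defined HTML entity, yet it is reported through
    # handle_entityref rather than as plain data.
    for event in collect_events("a &bogus; b"):
        print(event)
```

With this setup the unknown reference shows up as an entityref event, so a subclass has to decide for itself whether to treat the name as text.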