Here's a slightly improved patch spurred by a parenthetical comment of Antoine's on the mailing list :-).
The only change is an added check in subclass_dealloc that corrects the reference counting in the weird case where someone reassigns __class__ on an object while it is being deallocated (e.g. from __del__), switching it from a HEAPTYPE type to a non-HEAPTYPE type.
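For concreteness, the weird case can be sketched in Python roughly as follows. This is only an illustration, not code from the patch: the Demo class, the events list, and run() are made-up names, and it assumes an interpreter where __class__ assignment between module types is permitted. Where it isn't, the finalizer simply records the TypeError instead.

```python
import gc
import types

events = []  # hypothetical log of what happened inside the finalizer


class Demo(types.ModuleType):  # defined in Python, so a HEAPTYPE
    def __del__(self):
        try:
            # Switch to the statically allocated (non-HEAPTYPE) base
            # type while the object is in the middle of being torn down.
            self.__class__ = types.ModuleType
            events.append("reassigned")
        except TypeError:
            # Assumption: interpreters without this feature reject the
            # assignment; the exception is caught so the demo still runs.
            events.append("rejected")


def run():
    m = Demo("demo")
    del m          # drops the last reference, so __del__ runs here on CPython
    gc.collect()   # harmless extra nudge for other implementations
    return list(events)


result = run()
```

If the assignment is accepted, deallocation finishes with the object's type being types.ModuleType rather than Demo, which is the situation where the type's reference count must not be decremented a second time.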
I should probably also mention that I ran the full test suite against the patched version and everything passed. With -R 3:2, the only difference is a new reference leak reported in test_zipfile. I'm guessing this is spurious, since AFAIK that test shouldn't be touching __class__ assignment at all, but I don't know for sure.