bpo-36904: new function _PyStack_DictAsVector by jdemeyer · Pull Request #13308 · python/cpython
write the code or just see the code (written by me for example)?
For a relatively complex optimization? Benchmark the code, before and after.
This two-step process is going to take more work: it requires discussing two API designs and implementations instead of one (with the first one going to be thrown away anyway once we do the second).
But without the first, it'd be hard to argue that the tuple subclass actually brings any benefit. Tuple subclasses behave quite differently from exact tuples (they don't use freelists, for one).
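As a rough illustration of the kind of before/after measurement being asked for, here is a minimal Python-level sketch that times the creation of exact tuples against instances of a tuple subclass. The `KwTuple` class and the `bench`/`compare` helpers are hypothetical names made up for this example; they are not part of the PR. A Python-level timing only approximates what a C-level change would cost, and the tuple freelist is a CPython implementation detail, so treat the numbers as indicative at best.

```python
import timeit


class KwTuple(tuple):
    """Hypothetical tuple subclass, used only for this comparison."""
    __slots__ = ()


def bench(factory, number=200_000, repeat=5):
    """Return the best per-call time in seconds for factory(items)."""
    # Use a list as input: tuple(some_tuple) would just return the same
    # object for an exact tuple and skip the allocation we want to measure.
    items = ["a", "b", "c"]
    timer = timeit.Timer(lambda: factory(items))
    return min(timer.repeat(repeat=repeat, number=number)) / number


def compare(number=200_000, repeat=5):
    """Time exact-tuple creation against tuple-subclass creation."""
    return {
        "tuple": bench(tuple, number, repeat),
        "KwTuple": bench(KwTuple, number, repeat),
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name}: {seconds * 1e9:.1f} ns per call")
```

Running the same script on builds with and without a proposed change gives the baseline comparison; a dedicated benchmarking tool would give more stable numbers than plain `timeit`.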
On the other hand, adding the "refcount poking" part is pretty simple. So if you agree with doing just that manually (without changing _PyStack_UnpackDict at all), then I'm OK with that.
Definitely! All I ask for before the optimization is a correct baseline.
It might then very well turn out that the optimization is unnecessary.