> Technically the patch LGTM. But we should find the cause of the regression in some benchmarks.
I ran the benchmark on Sandy Bridge (Core i5 2400), and I didn't use a PGO build.
perf_event reported an increased branch-miss rate in cpickle's save function.
I'll rerun the benchmark with a PGO build. I hope PGO is as friendly to CPU branch
prediction as it is to the L1/L2 cache.
Anyway, recent amd64 CPUs have a larger branch history.
> And would be nice to extend the optimization to C functions.
> In any case this optimization is worth mentioning in What's New.
I'll do both.