Message 259670 - Python tracker

My analysis of benchmarks.

Even using CPU isolation to run benchmarks, the results look unreliable for very short benchmarks like 3 ** 2.0: I don't think that fastint_alt can make the operation 16% slower since it doesn't touch this code, no?

Well... as expected, the speedup is quite *small*: the largest difference is on "3 * 2" run 100 times: 18% faster with fastint_alt. We are talking about 1.82 us => 1.49 us: a delta of 330 ns. I expect a much larger difference if you compile a function to machine code using Cython or a JIT like Numba and PyPy. Remember that we are running *micro*-benchmarks, so we should not push overkill optimizations unless the speedup is really impressive.
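To make the arithmetic concrete, here is a rough sketch, not the script used to produce the tables. It recomputes the delta and the relative speedup from the two timings quoted above, then times 100 multiplications with timeit. The helper name relative_speedup, the variables a and b, and the repeat/number values are my own assumptions for illustration. Variables are used instead of the literal "3 * 2" because the compiler would otherwise fold the constant expression.

```python
import timeit


def relative_speedup(before_us, after_us):
    """Fraction of time saved when going from before_us to after_us."""
    return (before_us - after_us) / before_us


# Timings quoted in the message: 1.82 us => 1.49 us
delta_ns = round((1.82 - 1.49) * 1000)
print(delta_ns, "ns")                           # 330 ns
print(f"{relative_speedup(1.82, 1.49):.0%}")    # 18%

# Hypothetical micro-benchmark: "a * b" repeated 100 times in one statement.
# a and b are variables so that the multiplication is really executed.
stmt = "; ".join(["a * b"] * 100)
timer = timeit.Timer(stmt, setup="a = 3; b = 2")

# Keep the minimum of several runs: it is the least disturbed by system noise.
loops = 10_000
best = min(timer.repeat(repeat=5, number=loops)) / loops
print(f"{best * 1e6:.2f} us per 100 multiplications")
```

The absolute number printed by the last line depends on the machine and on the Python build, so only compare it between two builds run on the same, ideally isolated, CPU.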

It's quite obvious from the tables that fastint_alt.patch only optimizes int (float is not optimized). If we choose to optimize float too, fastintfloat_alt.patch and fastint5.patch seem to have the *same* speed.

I don't see any overhead on Decimal + Decimal with any patch: good.

--

Between fastintfloat_alt.patch and fastint5.patch, I prefer fastintfloat_alt.patch which is much easier to read, so probably much easier to debug. I hate huge macros when I have to debug code in gdb :-( I also very much like the idea of *reusing* existing functions, rather than duplicating code.

Even if Antoine doesn't seem interested in optimizations on float, I think that it's ok to add a few lines for this type: fastintfloat_alt.patch is not so complex. What do *you* think?

Why not optimize a**b? It's a common operation, especially 2**k, no?