bpo-43452: Microoptimizations to PyType_Lookup#24804
Conversation
|
I like this in general. I'm not convinced by (1) as it may lead to more collisions. I'd be happy to be proved wrong, if you have data. |
Sorry, something went wrong.
|
Can you try to write a micro-benchmark to show the speedup? |
Sorry, something went wrong.
Sorry, something went wrong.
I'll make an attempt at it and report back, but I think it'll be hard to get enough signal even w/ a microbenchmark. When I originally measured this against re-played Instagram traffic we saw a reduction in _PyType_Lookup from 1.7% of CPU usage to 1.6% of CPU usage. So it is a pretty small win overall, and may be dwarfed by all the other interpreter overhead in a benchmark. |
Sorry, something went wrong.
|
This is great work. I agree with @markshannon, I think that I am sold with (2) and (3) but I am quite not convinced by (1), but I think it should be fine because I expect quite a small amount of cache misses. |
Sorry, something went wrong.
One additional tweak here could be to do >> 2 on 32-bit platforms, although I'm not sure how much that'll really matter either. |
Sorry, something went wrong.
Objects are aligned by 16byte on 64bit and 8byte on 32bit platforms. So |
Sorry, something went wrong.
Since unicode objects is larger than minimum alignment, more than 4bits are not random. But lower bits of |
Sorry, something went wrong.
I forgot about we use only 12bits for hash. All bits are important. I posted benchmark result of |
Sorry, something went wrong.
|
About |
Sorry, something went wrong.
Should we update _Py_HashPointer()? |
Sorry, something went wrong.
Performance difference between |
Sorry, something went wrong.
|
I think this change is significant enough to have a news entry. |
Sorry, something went wrong.
gvanrossum
left a comment
There was a problem hiding this comment.
LGTM. But a news entry would be nice.
Sorry, something went wrong.
@vstinner I've added a small C micro-benchmark in https://bugs.python.org/issue43452 |
Sorry, something went wrong.
Thanks for proving that the optimization actually makes the code faster :-) |
Sorry, something went wrong.
gvanrossum
left a comment
There was a problem hiding this comment.
Okay, let’s land this. Thanks Dino for going the extra mile!
Sorry, something went wrong.
* master: (129 commits) bpo-43452: Micro-optimizations to PyType_Lookup (pythonGH-24804) bpo-43517: Fix false positive in detection of circular imports (python#24895) bpo-43494: Make some minor changes to lnotab notes (pythonGH-24861) Mention that code.co_lnotab is deprecated in what's new for 3.10. (python#24902) bpo-43244: Remove symtable.h header file (pythonGH-24910) bpo-43466: Add --with-openssl-rpath configure option (pythonGH-24820) Fix a typo in c-analyzer (pythonGH-24468) bpo-41561: Add workaround for Ubuntu's custom security level (pythonGH-24915) bpo-43521: Allow ast.unparse with empty sets and NaN (pythonGH-24897) bpo-43244: Remove the PyAST_Validate() function (pythonGH-24911) bpo-43541: Fix PyEval_EvalCodeEx() regression (pythonGH-24918) bpo-43244: Fix test_peg_generators on Windows (pythonGH-24913) bpo-39342: Expose X509_V_FLAG_ALLOW_PROXY_CERTS in ssl module (pythonGH-18011) bpo-43244: Fix test_peg_generator for PyAST_Validate() (pythonGH-24912) bpo-42128: Add 'missing :' syntax error message to match statements (pythonGH-24733) bpo-43244: Add pycore_ast.h header file (pythonGH-24908) bpo-43244: Rename pycore_ast.h to pycore_ast_state.h (pythonGH-24907) Remove unnecessary imports in the grammar parser (pythonGH-24904) bpo-35883: Py_DecodeLocale() escapes invalid Unicode characters (pythonGH-24843) Add PEP 626 to what's new in 3.10. (python#24892) ...
The common case going through _PyType_Lookup is to have a cache hit. There's some small tweaks which can make this a little cheaper:
the name field identity is used for a cache hit, and is kept alive by the cache. So there's no need to read the hash code of the name - instead the address can be used as the hash.
There's no need to check if the name is cachable on the lookup either, it probably is, and if it is, it'll be in the cache.
If we clear the version tag when invalidating a type then we don't actually need to check for a valid version tag bit.
https://bugs.python.org/issue43452