bpo-29410: Change the default hash algorithm to SipHash13. by methane · Pull Request #28752 · python/cpython
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The removal of SipHash-2-4 should go through regular deprecation cycle.
I recommend that you keep Py_HASH_SIPHASH24, add a new constant for Py_HASH_SIPHASH13, and add deprecation for Py_HASH_SIPHASH24.
When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.
add deprecation for
Py_HASH_SIPHASH24.
How to add deprecation to undocumented macro? Comment in header is enough?
You could add #warning Py_HASH_SIPHASH24 is deprecated to Python/pyhash.c. Or we could just leave SipHash-2-4 in there. The algorithm is also used for _Py_KeyedHash in import system.
| # seed 42, 'abc' | ||
| [-678966196, 573763426263223372, -820489388, -4282905804826039665], | ||
| ], | ||
| 'siphash13': [ |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
methane
marked this pull request as ready for review
methane
changed the title
bpo-29410: Use SipHash13 instead of SipHash24.
bpo-29410: Change the default hash algorithm to SipHash13.
Thanks for making the requested changes!
@tiran: please review the changes made to this pull request.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, almost there! I have two minor and one major suggestion for the PR.
| support :class:`typing.SupportsComplex` and :class:`typing.SupportsBytes` protocols. | ||
| (Contributed by Mark Dickinson and Dong-hee Na in :issue:`24234`.) | ||
|
|
||
| * ``siphash13`` is added as a new internal hash algorithm for strings, and the default |
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small nit pick: Your change affects str, bytes, and a couple of other places like memoryview, regular expressions, and datetime objects.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would mention why we made it the default and that str and bytes are affected. Perhaps something like
siphash13is added as a new internal hashing algorithms. It's has similar security properties assiphash24but it is slightly faster for long inputs.str,bytes, and some other types now use it as default algorithm forhash()
When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.