bpo-29882: Add an efficient popcount method for integers#771
Conversation
|
@niklasf, thanks for your PR! By analyzing the history of the files in this pull request, we identified @tim-one, @birkenfeld and @ncoghlan to be potential reviewers. |
Sorry, something went wrong.
|
Some compilers (gcc) provide hardware optimized |
Sorry, something went wrong.
|
@ilevkivskyi: Yes, using hardware popcnt makes this ever so slightly faster. Previously I have submitted #594, but so far there is no consensus on wether or how to use these intrinsics. So I think it's best to consider this seperately. Edit: I should also mention that there is a faster software implementation, using complete lookup tables for 16bit integers. This one is commonly used in chess engines if hardware popcnt is not available, but seems like a rather large overhead for a general purpose interpreter. |
Sorry, something went wrong.
methane
left a comment
There was a problem hiding this comment.
Code looks OK.
But it seems no consensus about adding this method or not.
Sorry, something went wrong.
1c6e9fa to
b6e46da
Compare
June 2, 2019 09:21
b6e46da to
74e3ea6
Compare
June 2, 2019 09:33
|
Thanks for reviewing. All done, in case we do decide to add this. |
Sorry, something went wrong.
|
@niklasf I'm interested in getting this updated and merged, for Python 3.10. May I merge master into this branch and make the appropriate fixes (changing the "whats new" entry from 3.8 to 3.10)? Or would you prefer to do that yourself? |
Sorry, something went wrong.
|
@mdickinson Of course, feel free to modify as needed. |
Sorry, something went wrong.
mdickinson
left a comment
There was a problem hiding this comment.
LGTM. (Admittedly that's not entirely an independent judgement at this point, but I haven't changed any of the core code or tests.)
Sorry, something went wrong.
vstinner
left a comment
There was a problem hiding this comment.
LGTM.
LGTM: the implementation is correct. I'm not strongly excited by having this method since I never needed it, but it seems like some other people need it. So well, it looks reasonable to add it. Also, in general, I completely trust @mdickinson :-)
Sorry, something went wrong.
mdickinson
left a comment
There was a problem hiding this comment.
The current implementation of popcount_digit isn't quite as portable as it should be.
The fix should be trivial. I'll look into it.
Sorry, something went wrong.
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Sorry, something went wrong.
mdickinson
left a comment
There was a problem hiding this comment.
Re-approving after fixing my own portability objection.
Sorry, something went wrong.
|
@mdickinson: Please replace |
Sorry, something went wrong.
|
OMG, I was merging the PR (I was editing the commit message) while @mdickinson merged it :-D Thanks @mdickinson ;-) |
Sorry, something went wrong.
|
Thanks for getting this over the finish line! |
Sorry, something went wrong.
|
PR created in 2017! It has an identifier lower than 1000 rather than the latest PR got the identifier 20517! Thanks @niklasf! |
Sorry, something went wrong.
|
I wrote PR #20518 which adds a _Py_popcount32() static inline function based on popcount_digit(). |
Sorry, something went wrong.
* 'master' of github.com:python/cpython: (497 commits) bpo-40061: Fix a possible refleak in _asynciomodule.c (pythonGH-19748) bpo-40798: Generate a different message for already removed elements (pythonGH-20483) closes bpo-29017: Update the bindings for Qt information with PySide2 (pythonGH-20149) bpo-39885: Make IDLE context menu cut and copy work again (pythonGH-18951) bpo-29882: Add an efficient popcount method for integers (python#771) Further de-linting of zoneinfo module (python#20499) bpo-40780: Fix failure of _Py_dg_dtoa to remove trailing zeros (pythonGH-20435) Indicate that abs() method accept argument that implement __abs__(), just like call() method in the docs (pythonGH-20509) bpo-39040: Fix parsing of email mime headers with whitespace between encoded-words. (pythongh-17620) bpo-40784: Fix sqlite3 deterministic test (pythonGH-20448) bpo-30064: Properly skip unstable loop.sock_connect() racing test (pythonGH-20494) Note the output ordering of combinatoric functions (pythonGH-19732) bpo-40474: Updated coverage.yml to better report coverage stats (python#19851) bpo-40806: Clarify that itertools.product immediately consumes its inpt (pythonGH-20492) bpo-1294959: Try to clarify the meaning of platlibdir (pythonGH-20332) bpo-37878: PyThreadState_DeleteCurrent() was not removed (pythonGH-20489) bpo-40777: Initialize PyDateTime_IsoCalendarDateType.tp_base at run-time (pythonGH-20493) bpo-40755: Add missing multiset operations to Counter() (pythonGH-20339) bpo-25920: Remove socket.getaddrinfo() lock on macOS (pythonGH-20177) bpo-40275: Fix test.support.threading_helper (pythonGH-20488) ...
codeimport random
import timeit
lens = range(1, 10000)
maxs = [2**i for i in lens]
# array with random numbers in [0, maxs[i]-1]
randos = [max + random.randrange(max) for max in maxs]
# time bin().count('1') vs bit_count() for each random number
bin_times = [timeit.timeit(lambda: bin(rando).count('1'), number=100) for rando in randos]
bit_count_times = [timeit.timeit(lambda: rando.bit_count(), number=100) for rando in randos]
# create plot
from matplotlib import pyplot as plt
plt.loglog(lens, bin_times, label='bin().count(\'1\')')
plt.loglog(lens, bit_count_times, label='bit_count()')
plt.legend()
plt.show()
|
Sorry, something went wrong.
|
@HannesGitH Why the comment? Is there some action that needs to be taken? |
Sorry, something went wrong.
nope, all fine, sorry just wanted to add the speedup graph for anyone who stumbles upon this and is curious |
Sorry, something went wrong.

https://bugs.python.org/issue29882
https://bugs.python.org/issue29882