bpo-30102: Improve libssl performance on POWER8 for, e.g sha256#1181
bpo-30102: Improve libssl performance on POWER8 for, e.g sha256#1181gut wants to merge 7 commits into
Conversation
To correctly pick the best algorithm for the current architecture, libssl needs to have OPENSSL_config(NULL) called as described on: https://wiki.openssl.org/index.php/Libcrypto_API This short change lead to a speedup of 50% on POWER8 when using hashlib.sha256 digest functionality as it now uses a SIMD approach that was already existing but not used by cpython.
|
@gut, thanks for your PR! By analyzing the history of the files in this pull request, we identified @tiran, @loewis and @teoliphant to be potential reviewers. |
Sorry, something went wrong.
|
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. Thanks again to your contribution and we look forward to looking at it! |
Sorry, something went wrong.
|
Can anybody support me on this AppVeyor build fail? It looks like on windows build it's not linking because it doesn't find |
Sorry, something went wrong.
|
Apparently Perhaps @zware or @pfmoore can chime in on whether our current OpenSSL version on Windows is subject to this issue. Regardless, the aforelinked changeset shows how to replace |
Sorry, something went wrong.
|
@pitrou : I found a useful documentation talking about the deprecated interface: However I didn't have the same results when using |
Sorry, something went wrong.
Both solutions lead to a performance improvement on POWER8 when using sha digest
|
BTW, is there a deinitialization hook in order to call |
Sorry, something went wrong.
|
Another question: I thought it would fix windows build but now some other symbols are not being found at link time but I don't see it related to my change. Does anybody have a hint? |
Sorry, something went wrong.
|
The requirement to call |
Sorry, something went wrong.
Those seem to be certificate handling function symbols but the module being built is hashlib, why is hashlib using those functions? |
Sorry, something went wrong.
|
As for the Windows issue, perhaps you can try applying the following diff: If it doesn't work, perhaps @zware, @pfmoore or @zooba (our resident Windows experts) can give a hand to debug this. |
Sorry, something went wrong.
it worked! Thanks. |
Sorry, something went wrong.
Sorry, something went wrong.
|
@gut, do you know whether this issue only applies to POWER8, or does it also affect e.g. AES performance on x86? |
Sorry, something went wrong.
|
@pitrou This would be a good question to an OpenSSL expert. Well, I'll try some benchmarking on X86 and come back to you. Hang on... |
Sorry, something went wrong.
|
@pitrou I am more focused on hashing and I didn't notice different instructions or performance related to python or from openssl directly on X86. |
Sorry, something went wrong.
Sorry, something went wrong.
|
@gut, I am seeing around 760 MB/s here (Core i5-2500K), with and without the patch. |
Sorry, something went wrong.
|
@pitrou Hi. On my POWER8 I also noticed ~750 MB/s with and without patch. Interestingly the python version 3.5.3 found on Ubuntu 17.04 outputs ~860 MB/s. Maybe it's just my build but I didn't expect a performance drop. Thanks for the script |
Sorry, something went wrong.
|
Did you test your patch with OpenSSL 1.1.0? With 1.1.0, OpenSSL should automatically initialize all subsystems including the engines, https://www.openssl.org/docs/man1.1.0/crypto/OPENSSL_init_crypto.html |
Sorry, something went wrong.
|
Very interesting. I was testing with However besides init and deinit performed automatically as stated on: It also states that some initialization are only performed explicitly. E.g: (emphasis added by me) Which looks like what I'm trying to initialize on this PR. I'd suggest to update these initialization interfaces of OpenSSL 1.1.0 on another PR, what about it? It looks like there are other initialization aspects to be considered. |
Sorry, something went wrong.
|
I don't think that I don't trust |
Sorry, something went wrong.
|
BTW, I was testing the new openssl v1.1 and I ran a short example without any of this special initialization (engines) and it still used the optimized version just like I wanted! So don't need to init an engine with As for prior OpenSSL version, I'll check how can I avoid loading all engines and still having the optimized code for, e.g. sha256. |
Sorry, something went wrong.
|
Awesome! Thanks for you hard work! :) I'm sorry if my reviews sound harsh to you. Crypto and security is a mine field. OpenSSL is a pretty nasty and highly explosive mine field. As you have noticed, it is getting better with OpenSSL >= 1.1.0. |
Sorry, something went wrong.
Replace ENGINE_load_builtin_engines to OPENSSL_cpuid_setup. It leads to the performance gain on hashing on POWER8 platform. I don't like the explicit function declaration but I didn't find it on OpenSSL headers.
|
@tiran Don't worry. I see your reasons for being picky. I analysed OpenSSL closely and I noticed that only the platform detection is important for me. I'll quote the related code below:
One drawback though is that I don't like to declare the function |
Sorry, something went wrong.
|
Hi. I just got my CLA approved (check on http://bugs.python.org/issue30102 , there is an I hope we can continue reviewing this now. I didn't ping before as I wanted to sign the CLA first. Thanks in advance |
Sorry, something went wrong.
|
just solved the conflict: now the Is there anything else missing? @tiran : please comment my previous reply to your review. Is that enough? |
Sorry, something went wrong.
|
I guess we can close this PR as I just confirmed that the approved #3112 solves the issue I'm interested on. |
Sorry, something went wrong.
To correctly pick the best algorithm for the current architecture,
libssl needs to have OPENSSL_config(NULL) called as described on:
https://wiki.openssl.org/index.php/Libcrypto_API
This short change lead to a speedup of 50% on POWER8 when using
hashlib.sha256 digest functionality as it now uses a SIMD approach that
was already existing but not used by cpython.
https://bugs.python.org/issue30102