Issue 11771: hashlib object cannot be pickled
Created on 2011-04-05 11:49 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (10) | |||
|---|---|---|---|
| msg133021 | Author: STINNER Victor (vstinner) * | Date: 2011-04-05 11:49 | |
$ ./python
Python 3.3a0 (default:76ed6a061ebe, Apr 5 2011, 12:25:00)
>>> import hashlib, pickle
>>> hash=hashlib.new('md5')
>>> pickle.dumps(hash)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
_pickle.PicklingError: Can't pickle <class '_hashlib.HASH'>: attribute lookup _hashlib.HASH failed
The problem is that _hashlib.HASH is not accessible at Python level. There is a C define to make it accessible, but it is disabled by default: "#if HASH_OBJ_CONSTRUCTOR". This test is as old as the _hashlib module (#1121611, 624918e1c1b2).
|
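The failure above can be checked on a current interpreter with a small sketch like the one below. The helper name `try_pickle_hash` is made up for illustration, and `sha256` is used instead of `md5` only because some builds restrict `md5`. The exact exception type is not fixed, so both `PicklingError` (as in the traceback above) and `TypeError` are caught.

```python
import hashlib
import pickle


def try_pickle_hash(name="sha256"):
    """Return the name of the exception raised when pickling a hash object.

    Returns None if pickling unexpectedly succeeds. `try_pickle_hash` is a
    hypothetical helper, not part of hashlib or pickle.
    """
    h = hashlib.new(name)
    try:
        pickle.dumps(h)
    except (pickle.PicklingError, TypeError) as exc:
        # The exception type depends on the interpreter version, so both
        # are accepted here.
        return type(exc).__name__
    return None


print(try_pickle_hash())
```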
|||
| msg133022 | Author: STINNER Victor (vstinner) * | Date: 2011-04-05 12:13 | |
Oh, I don't know if it is possible to serialize an OpenSSL hash object (EVP_MD_CTX)... |
|||
| msg133030 | Author: Antoine Pitrou (pitrou) * | Date: 2011-04-05 14:02 | |
Why on Earth would you want to serialize a hashlib object? It makes as much sense as serializing, say, a JSONEncoder. |
|||
| msg133192 | Author: Gregory P. Smith (gregory.p.smith) * | Date: 2011-04-07 06:08 | |
Heh, yeah. While all hash functions do have internal state and someone could conceivably want to store such a state (it basically amounts to a queued-up partial block of input data, if any, and the current starting IV), there are no consistent APIs to expose that, and I really don't see why it'd be worth trying to find them. Remember, hashlib doesn't have to be OpenSSL. There are non-OpenSSL libtomcrypt-based versions, and someone nice should write a libnss-based version someday. I'd mark this "won't fix." :) -Greg |
|||
| msg133193 | Author: Raymond Hettinger (rhettinger) * | Date: 2011-04-07 06:17 | |
I also recommend closing this one. |
|||
| msg222036 | Author: Klaus Wolf (approximately) | Date: 2014-07-01 14:31 | |
Please reopen this bug. To answer the question "Why on Earth would you want to serialize a hashlib object?": multiprocessing.connection.ForkingPickler wants to. That is, if you want to parallelize your hash calculations, this will obstruct your efforts. |
|||
| msg222042 | Author: Gregory P. Smith (gregory.p.smith) * | Date: 2014-07-01 15:04 | |
Do you honestly have a situation where you need to share a computationally significant amount of hashing state only to want to finish the computation N different times with alternate computationally significant ending data that multiprocessing would actually help with where you cannot use threads? Hashlib releases the GIL during nontrivial hash computations. |
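A rough sketch of the thread-based alternative implied here: hash the shared data once, then use the hash object's `copy()` method to finish the computation several times in worker threads, with no pickling involved. The function name `finish_with_suffixes` and the choice of `sha256` are assumptions for illustration only.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def finish_with_suffixes(prefix, suffixes, name="sha256"):
    """Hash `prefix` once, then finish it with each suffix in a thread.

    Hypothetical helper: returns one hex digest per suffix, in order.
    """
    base = hashlib.new(name)
    base.update(prefix)  # the shared, potentially expensive part

    # Take one independent copy of the partial state per suffix up front,
    # so each worker thread owns its hash object.
    jobs = [(base.copy(), suffix) for suffix in suffixes]

    def finish(job):
        h, suffix = job
        h.update(suffix)  # per the message above, large updates release the GIL
        return h.hexdigest()

    with ThreadPoolExecutor() as pool:
        return list(pool.map(finish, jobs))
```

Because threads share memory, the partial state never has to cross a process boundary, which is why pickling does not come up in this variant.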
|||
| msg222044 | Author: Klaus Wolf (approximately) | Date: 2014-07-01 15:10 | |
You want to say: It doesn't work, but it is somehow intentional because you never used it, correct? |
|||
| msg222050 | Author: Gregory P. Smith (gregory.p.smith) * | Date: 2014-07-01 17:29 | |
Please be constructive. There is no way to implement generic pickling for hash objects that would work across all implementations. The underlying code implementing each function is free to store its internal state however it wants and does not provide an API to get at it or any standard representation of it. Sure, you could hack things up and allow a specific version and build of openssl's EVP hashes to dump their state and restore it for use in another process running that same specific version and build of openssl (as would likely be the case for multiprocessing use) just as you could for any other implementation of a hash function such as the builtin libtomcrypt versions. But this is not portable between compilations using different implementations of the hash algorithm. That is not what someone using pickle would ever expect. Public APIs to access the internal state of hash functions do not exist because it is not a common thing for people to do. hashlib isn't going to support this unless someone contributes a very solid patch with tests that handles all of the compatibility issues in a friendly maintainable manner. |
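One portable way to sidestep the problem described above, sketched under stated assumptions: instead of dumping the backend's internal state, a wrapper can remember the algorithm name and the bytes fed so far, and replay them after unpickling. `ReplayableHash` is a hypothetical class, not part of hashlib, and it keeps all input in memory, so it only suits small inputs.

```python
import hashlib


class ReplayableHash:
    """Hypothetical wrapper that pickles its input, not the hash's state."""

    def __init__(self, name, data=b""):
        self.name = name
        self._chunks = []               # every chunk fed so far, kept for replay
        self._hash = hashlib.new(name)  # backend object; never pickled
        if data:
            self.update(data)

    def update(self, data):
        data = bytes(data)
        self._chunks.append(data)
        self._hash.update(data)

    def hexdigest(self):
        return self._hash.hexdigest()

    def __getstate__(self):
        # Only portable values are stored: the algorithm name and raw input.
        return {"name": self.name, "data": b"".join(self._chunks)}

    def __setstate__(self, state):
        # Rebuild by replaying the input through whatever backend the
        # receiving process provides.
        self.__init__(state["name"], state["data"])
```

With the class defined in an importable module, `pickle.loads(pickle.dumps(h))` should yield an object that continues hashing where `h` left off. The cost is that the whole input is re-hashed on load, which is the price of not depending on any one implementation's state layout.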
|||
| msg298350 | Author: Andrey Kislyuk (Andrey.Kislyuk) * | Date: 2017-07-14 12:14 | |
For anyone else looking for a solution to this, I wrote a library: https://github.com/kislyuk/rehash |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:15 | admin | set | github: 55980 |
| 2017-07-14 12:14:45 | Andrey.Kislyuk | set | nosy: + Andrey.Kislyuk; messages: + msg298350 |
| 2014-07-01 17:29:48 | gregory.p.smith | set | type: enhancement; messages: + msg222050; stage: resolved |
| 2014-07-01 15:10:40 | approximately | set | messages: + msg222044 |
| 2014-07-01 15:04:06 | gregory.p.smith | set | messages: + msg222042 |
| 2014-07-01 14:31:59 | approximately | set | nosy: + approximately; messages: + msg222036 |
| 2011-04-07 08:17:54 | vstinner | set | status: open -> closed; resolution: wont fix |
| 2011-04-07 06:17:47 | rhettinger | set | nosy: + rhettinger; messages: + msg133193 |
| 2011-04-07 06:08:22 | gregory.p.smith | set | messages: + msg133192 |
| 2011-04-05 14:02:07 | pitrou | set | nosy: + gregory.p.smith, pitrou; messages: + msg133030 |
| 2011-04-05 12:13:41 | vstinner | set | messages: + msg133022 |
| 2011-04-05 11:49:41 | vstinner | create | |
