Issue 4768: email.generator.Generator object bytes/str crash - b64encode() bug?
Created on 2008-12-29 17:21 by beazley, last changed 2022-04-11 14:56 by admin. This issue is now closed.
Messages (10)
msg78464 - (view)
Author: David M. Beazley (beazley)
Date: 2008-12-29 17:21
The email.generator.Generator class does not work correctly with message
objects created with binary data (MIMEImage, MIMEAudio, MIMEApplication,
etc.). For example:
>>> from email.mime.image import MIMEImage
>>> data = open("IMG.jpg","rb").read()
>>> m = MIMEImage(data,'jpeg')
>>> s = m.as_string()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/lib/python3.0/email/message.py", line 136, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/tmp/lib/python3.0/email/generator.py", line 76, in flatten
    self._write(msg)
  File "/tmp/lib/python3.0/email/generator.py", line 101, in _write
    self._dispatch(msg)
  File "/tmp/lib/python3.0/email/generator.py", line 127, in _dispatch
    meth(msg)
  File "/tmp/lib/python3.0/email/generator.py", line 155, in _handle_text
    raise TypeError('string payload expected: %s' % type(payload))
TypeError: string payload expected: <class 'bytes'>
>>>
The source of the problem is rather complicated, but here is the gist of
it.
1. Classes such as MIMEAudio and MIMEImage accept raw binary data as
input. This data is going to be in the form of bytes.
2. These classes immediately encode the data using a base64 encoder.
This encoder uses the library function base64.b64encode().
3. base64.b64encode() takes a byte string as input and returns a byte
string as output. So, even after encoding, the payload of the message
is of type 'bytes'.
4. When messages are generated, the method Generator._dispatch() is
used. It looks at the MIME main type and subtype and tries to dispatch
message processing to a handler method of the form
'_handle_type_subtype'. If it can't find such a handler, it defaults
to a method _writeBody(). For image and audio types, this is what
happens.
5. _writeBody() is an alias for _handle_text().
6. _handle_text() crashes because it's not expecting a payload of type
'bytes'.
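The type mismatch described in steps 3 to 6 can be reproduced without the email package. This is only a minimal sketch: the four input bytes and the header string are arbitrary stand-ins chosen for illustration, not values taken from the email module.

```python
import base64

# Arbitrary bytes standing in for the contents of an image file (assumption).
raw = b"\xff\xd8\xff\xe0"

# b64encode() takes bytes and returns bytes, even though the result is pure ASCII.
encoded = base64.b64encode(raw)
print(type(encoded))

# Headers are str, so concatenating them with the bytes payload fails.
header = "Content-Transfer-Encoding: base64\n\n"
try:
    header + encoded
    mixed_ok = True
except TypeError:
    mixed_ok = False

# Decoding the ASCII-only result yields a str that can be joined with the headers.
text = encoded.decode("ascii")
message = header + text
```

Decoding with the "ascii" codec is safe here because base64 output contains only ASCII characters.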
Suggested fix:
I think the library function base64.b64encode() should return a string,
not bytes. The whole point of base64 encoding is to take binary data
and encode it into characters safe for inclusion in text strings.
Other fixes:
Modify the Generator class in email.generator to properly detect bytes
and use a different _handle function for it. For instance, maybe add a
_handle_binary() method.
msg78744 - (view)
Author: STINNER Victor (vstinner) *
Date: 2009-01-02 01:21
> I think the library function base64.b64encode() should return
> a string, not bytes.
Yes, in the email module, the payload is a Unicode string, not a bytes string. We have to be able to concatenate headers (e.g. "Content-Type: image/fish\nMIME-Version: 1.0\nContent-Transfer-Encoding: base64\n") and encoded data (e.g. "R0lGO").
Attached patch implements this fix: encode_base64() returns str (and not bytes). The patch fixes the unit tests and adds a new regression test for MIMEImage.as_string().
msg103596 - (view)
Author: Stac (stac)
Date: 2010-04-19 13:59
Hello,
This patch has never been committed. I tested today with the 3.1 branch (and checked in the lib code).
Is there a better way to attach images in an email?
Thanks in advance for your help,
Regards,
Stac
msg105381 - (view)
Author: (garazi111)
Date: 2010-05-09 11:54
Hi,
I think the bug is also present in the function encode_quopri, which should look like this:
def encode_quopri(msg):
    """Encode the message's payload in quoted-printable.

    Also, add an appropriate Content-Transfer-Encoding header.
    """
    orig = msg.get_payload()
    encdata = _qencode(orig)
    data = str(encdata, "ASCII")
    msg.set_payload(data)
    msg['Content-Transfer-Encoding'] = 'quoted-printable'
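The same bytes-to-str step can be tried in isolation. The sketch below is an illustration only: it assumes the public quopri.encodestring() as a stand-in for the private _qencode helper used above, and the sample input is made up.

```python
import quopri

# Sample payload containing one non-ASCII byte (assumption, for illustration).
raw = "café = ok".encode("latin-1")

# Like base64.b64encode(), quopri.encodestring() returns bytes.
encoded = quopri.encodestring(raw, quotetabs=True)

# Quoted-printable output is ASCII-only, so it can be decoded to a str payload.
text = encoded.decode("ascii")

# Round trip back to the original bytes.
decoded = quopri.decodestring(text.encode("ascii"))
```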
msg106546 - (view)
Author: STINNER Victor (vstinner) *
Date: 2010-05-26 17:09
I wrote a patch for base64.b64encode() to accept str (str is encoded to utf-8): patch attached to #4768. It should fix this issue, but we can add the tests of email_base64_bytes.patch.
msg107063 - (view)
Author: Forest Bond (forest_atq) *
Date: 2010-06-04 15:01
Attaching patch from reported duplicate issue8896.
msg107065 - (view)
Author: Forest Bond (forest_atq) *
Date: 2010-06-04 15:03
Note that my patch is roughly the same as the original posted by haypo.
msg107073 - (view)
Author: R. David Murray (r.david.murray) *
Date: 2010-06-04 16:17
Yes, but yours was better formatted, so I used it :)
Thanks for the patch. Applied in r81685 to py3k, and r81686.
msg107075 - (view)
Author: R. David Murray (r.david.murray) *
Date: 2010-06-04 16:45
@garazi111: if you have an example where quopri fails, please open a new issue for it. I suspect you are right that there is a problem there.
msg192010 - (view)
Author: R. David Murray (r.david.murray) *
Date: 2013-06-28 19:05
For the record, encode_quopri was fixed in #14360.
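On an interpreter that includes the fix applied above, the original failure mode can be checked with a short script. This is a sketch under stated assumptions: it uses MIMEApplication with synthetic bytes instead of MIMEImage with a real JPEG file, so that it needs no external file.

```python
from email.mime.application import MIMEApplication

# Synthetic binary payload covering every byte value (assumption, not real image data).
payload = bytes(range(256))

msg = MIMEApplication(payload, "octet-stream")

# With encode_base64() storing a str payload, flattening no longer raises TypeError.
flattened = msg.as_string()

# The stored payload is base64 text; decode=True recovers the original bytes.
stored = msg.get_payload()
restored = msg.get_payload(decode=True)
```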
History
Date                 User               Action  Args
2022-04-11 14:56:43  admin              set     github: 49018
2013-06-28 19:05:06  r.david.murray     set     messages: + msg192010
2011-07-06 04:59:21  anacrolix          set     nosy: + anacrolix
2010-06-26 02:10:51  r.david.murray     link    issue9040 superseder
2010-06-04 16:45:30  r.david.murray     set     messages: + msg107075
2010-06-04 16:17:20  r.david.murray     set     status: open -> closed
                                                resolution: accepted -> fixed
                                                messages: + msg107073
                                                stage: patch review -> resolved
2010-06-04 15:03:23  forest_atq         set     messages: + msg107065
2010-06-04 15:01:47  forest_atq         set     files: + python-email-encoders-base64-str.patch
                                                nosy: + forest_atq
                                                messages: + msg107063
2010-06-04 14:57:37  r.david.murray     link    issue8896 superseder
2010-05-26 17:09:16  vstinner           set     messages: + msg106546
2010-05-10 19:06:51  r.david.murray     set     priority: high -> critical
2010-05-09 11:54:44  garazi111          set     nosy: + garazi111
                                                messages: + msg105381
2010-04-23 01:55:23  r.david.murray     set     type: crash -> behavior
                                                resolution: accepted
                                                assignee: barry -> r.david.murray
2010-04-22 22:44:53  pitrou             set     priority: high
                                                stage: patch review
                                                versions: + Python 3.1, Python 3.2, - Python 3.0
2010-04-22 22:25:14  eric.araujo        set     nosy: + eric.araujo
2010-04-19 19:19:02  l0nwlf             set     nosy: + l0nwlf
2010-04-19 19:01:14  l0nwlf             set     nosy: + r.david.murray
2010-04-19 13:59:19  stac               set     nosy: + stac
                                                messages: + msg103596
2009-01-02 01:22:12  vstinner           set     files: + email_base64_bytes.patch
                                                keywords: + patch
2009-01-02 01:21:59  vstinner           set     nosy: + vstinner
                                                messages: + msg78744
2009-01-01 08:38:25  brotchie           set     nosy: + brotchie
2008-12-29 17:28:30  benjamin.peterson  set     assignee: barry
                                                nosy: + barry
2008-12-29 17:21:40  beazley            create