Email MIME `base64` parser stops at the end of the first padding
Bug report
Bug description:
Hi CPython developers,
I was testing and comparing different email parsers, and found a parsing discrepancy that seems to be a problem.
MIME-Version: 1.0
Content-Type: application/zip
Content-Disposition: attachment; filename=archive.zip
Content-Transfer-Encoding: base64

UEsDBBQAAAAIAA==
emVkIGZpbGUgY29udGVudA==
With Python's `email` package, `get_payload(decode=True)` returns content that stops at the first "==", which appears to be the default behavior of `base64.b64decode`.
Meanwhile, peer implementations (e.g. apache.commons.mail (Java), MimeKit (C#), PhpMimeMailParser (PHP)) return the whole content.
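For reference, the silent truncation can at least be made visible with the standard library. This is a minimal sketch, assuming `base64.b64decode` with `validate=True` raises `binascii.Error` when data continues after the padding instead of returning only the first chunk. The helper name `is_single_base64_run` is hypothetical.

```python
import base64
import binascii


def is_single_base64_run(data: str) -> bool:
    """Return True if `data` is one well-formed base64 run.

    Hypothetical helper: with validate=True, b64decode raises
    binascii.Error for characters outside the base64 alphabet and
    for data that continues after the padding.
    """
    try:
        base64.b64decode(data, validate=True)
    except binascii.Error:
        return False
    return True


# A single padded chunk versus two padded chunks glued together.
print(is_single_base64_run("UEsDBBQAAAAIAA=="))
print(is_single_base64_run("UEsDBBQAAAAIAA==emVkIGZpbGUgY29udGVudA=="))
```

Note that line breaks are also outside the base64 alphabet, so a multi-line MIME body would need its whitespace removed before a check like this.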
Below is a runnable example in Python.
import base64
import email

# Parsing the MIME format
request = """MIME-Version: 1.0
Content-Type: application/zip
Content-Disposition: attachment; filename=archive.zip
Content-Transfer-Encoding: base64

UEsDBBQAAAAIAA==
emVkIGZpbGUgY29udGVudA==
"""

msg = email.message_from_string(request)
print("Part content:", repr(msg.get_payload(decode=True)))
print()

# Examples of base64
contents = [
    "UEsDBBQAAAAIAA==\nemVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAA==emVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAA=emVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAAemVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAA==",
    "emVkIGZpbGUgY29udGVudA==",
]

for content in contents:
    decoded_bytes = base64.b64decode(content)
    print(repr(content), " ->")
    print(" ", decoded_bytes)
Output:
Part content: b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'UEsDBBQAAAAIAA==\nemVkIGZpbGUgY29udGVudA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'UEsDBBQAAAAIAA==emVkIGZpbGUgY29udGVudA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'UEsDBBQAAAAIAA=emVkIGZpbGUgY29udGVudA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00\x07\xa6VB\x06f\x96\xc6R\x066\xf6\xe7FV\xe7@'
'UEsDBBQAAAAIAAemVkIGZpbGUgY29udGVudA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00\x07\xa6VB\x06f\x96\xc6R\x066\xf6\xe7FV\xe7@'
'UEsDBBQAAAAIAA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'emVkIGZpbGUgY29udGVudA==' ->
b'zed file content'
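One possible workaround, until the behavior is settled, is to decode each padded run separately and join the results. The following is only a sketch under that assumption. It is not how the `email` package or the peer parsers are implemented, and `decode_padded_runs` is a hypothetical name.

```python
import base64
import email
import re

# A run of base64 alphabet characters followed by optional padding.
_RUN = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def decode_padded_runs(payload: str) -> bytes:
    """Decode every padded base64 run in `payload` and join the results."""
    compact = "".join(payload.split())  # drop line breaks and other whitespace
    decoded = bytearray()
    for match in _RUN.finditer(compact):
        run = match.group(0).rstrip("=")
        run += "=" * (-len(run) % 4)  # restore canonical padding
        decoded += base64.b64decode(run)
    return bytes(decoded)


raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: application/zip\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "UEsDBBQAAAAIAA==\n"
    "emVkIGZpbGUgY29udGVudA==\n"
)
msg = email.message_from_string(raw)
# get_payload() without decode=True returns the still-encoded text.
print(decode_padded_runs(msg.get_payload()))
```

For the message in this report, the result is the two individually decoded pieces shown in the output above, concatenated. A run whose length is one more than a multiple of four still raises `binascii.Error`, so callers may want to handle that case.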
Thank you,
Wei-Cheng
CPython versions tested on:
3.15
Operating systems tested on:
Linux