gh-83938, gh-122476: Stop incorrectly RFC 2047 encoding non-ASCII email addresses#122540
gh-83938, gh-122476: Stop incorrectly RFC 2047 encoding non-ASCII email addresses#122540bitdancer merged 20 commits into
Conversation
d1f0bdc to
2e0696c
Compare
August 1, 2024 00:43
|
This is based on #81074 (comment):
Checking An alternative would be to introduce a new Also, I think |
Sorry, something went wrong.
2e0696c to
cbedf5d
Compare
August 1, 2024 01:11
Email generators had been incorrectly flattening non-ASCII email addresses to RFC 2047 encoded-word format, leaving them undeliverable. (RFC 2047 prohibits use of encoded-word in an addr-spec.) This change raises a ValueError when attempting to flatten an EmailMessage with a non-ASCII addr-spec and a policy with utf8=False. (Exception: If the non-ASCII address originated from parsing a message, it will be flattened as originally parsed, without error.) Non-ASCII email addresses are supported when using a policy with utf8=True (such as email.policy.SMTPUTF8) under RFCs 6531 and 6532. Non-ASCII email address domains (but not localparts) can also be used with non-SMTPUTF8 policies by encoding the domain as an IDNA A-label. (The email package does not perform this encoding, because it cannot know whether the caller wants IDNA 2003, IDNA 2008, or some other variant such as UTS python#46.)
cbedf5d to
faa4006
Compare
August 1, 2024 01:36
bitdancer
left a comment
There was a problem hiding this comment.
Your analysis is solid and the fix looks great. We'll need a follow on PR to have smtplib handle the new error, but that should be a trivial PR.
Sorry, something went wrong.
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase And if you don't make the requested changes, you will be poked with soft cushions! |
Sorry, something went wrong.
- Incorporate PR review feedback - Improve docs
5d60c1c to
bd6845d
Compare
April 1, 2025 20:14
|
I have made the requested changes; please review again |
Sorry, something went wrong.
|
Thanks for making the requested changes! @bitdancer: please review the changes made to this pull request. |
Sorry, something went wrong.
bitdancer
left a comment
There was a problem hiding this comment.
Sorry it took me so long to get back to this.
Sorry, something went wrong.
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Sorry, something went wrong.
|
Looking at the code more closely, it is supposed to already be handing this case. The fact that it isn't indicates there's a bug there, and there were certainly missing tests. I will poke at this. |
Sorry, something went wrong.
The mime parameter folder doesn't make use of the encoding check done be the code that is now below it, it does its own. So it makes more sense to take that branch first. This will simplify subsequent changes.
This is a more complete fix, covering any syntax part where encoded words are not permitted, and the doc changes are adjusted accordingly. There is also no need for a new exception, since HeaderWriteError already exists. The fix itself is to use a separate code loop to fold parts that may not have encoded words, guaranteeing that we do not do incorrect encoding. This opens a door to simplifying the main folding loop, but that is a much bigger refactoring job better left for another time.
Behavior when folding in parts versus rendering on one line takes different code paths, so make sure both work.
|
I've crafted a more complete fix for the bug. If you could review it, @medmunds, that would be great. Since I switched to using the existing HeaderWriteError, I'm wondering if we need the versionchanged at all at this point, if anyone has input on that. |
Sorry, something went wrong.
This comment was marked as outdated.
This comment was marked as outdated.
Fix typo; bump versionchanged.
Documentation build overview
65 files changed ·
|
Sorry, something went wrong.
medmunds
left a comment
There was a problem hiding this comment.
@bitdancer this LGTM. I pushed a couple of fixes to the docs.
I'd be inclined to keep the versionchanged note. (Unless there's some style guide policy against that for bug fixes.) There seems to be a popular misconception—in many email libraries and systems—that any MIME header with non-ASCII chars can just be run through rfc2047. The versionchanged note makes it clear dropping rfc2047 here was a deliberate fix in Python's email package, and might help discourage future requests to add it back in.
Sorry, something went wrong.
|
Since this is a bug fix that isn't going to get backported because of backward compatibility concerns, we'll keep the versionchanged. We can always delete it if someone objects. |
Sorry, something went wrong.
|
It occurs to me that if it is worth a versionchanged there should also be a What's New entry. Do you want to write one? |
Sorry, something went wrong.
Something like this? [edited] Index: Doc/whatsnew/3.15.rst
<+>UTF-8
===================================================================
diff --git a/Doc/whatsnew/3.15.rst b/Doc/whatsnew/3.15.rst
--- a/Doc/whatsnew/3.15.rst (revision 71636d347e37db41a11a21419137fd7eb3423f74)
+++ b/Doc/whatsnew/3.15.rst (date 1777572899407)
@@ -914,6 +914,16 @@
(Contributed by Eric Froemling in :gh:`149085`.)
+email
+-----
+
+* Email generators now raise an error when an :class:`.EmailMessage` cannot be
+ accurately flattened due to a non-ASCII email address (mailbox) in an address
+ header. :attr:`.EmailPolicy.utf8` offers options for supporting Email Address
+ Internationalization (EAI).
+ (Contributed by R David Murray and Mike Edmunds in :gh:`122540`.)
+
+
functools
---------
[Edited: trying to capture the attention of users who might be affected and direct them to the docs that have complete details.] |
Sorry, something went wrong.
|
Yeah, that looks good. Though I'd say "The documentation for...offers" just to make it clear that it isn't the policy option itself that provides the support. |
Sorry, something went wrong.
|
Although I guess technically it does...maybe your wording is correct. |
Sorry, something went wrong.
d96ffc1
into
python:main
May 1, 2026
…122477) The email.headerregistry.Address constructor raised an error if addr_spec contained a non-ASCII character. (But it fully supports non-ASCII in the separate username and domain args.) This change removes the error for a non-ASCII addr_spec, as well as the Defect that triggered it. In the unicode era non-ascii is not a defect, though it is an error when an attempt is made to serialize it to ascii. The serialization issue was handled in #122540.
This PR fixes gh-83938 and fixes gh-122476, which have the same underlying issue.
Email generators had been incorrectly flattening non-ASCII email addresses to RFC 2047 encoded-word format, leaving them undeliverable. (RFC 2047 prohibits use of encoded-word in an addr-spec.) This change raises a ValueError when attempting to flatten an EmailMessage with a non-ASCII addr-spec and a policy with
utf8=False. (Exception: If the non-ASCII address originated from parsing a message, it will be flattened as originally parsed, without error.)Non-ASCII email addresses are supported when using a policy with
utf8=True(such as email.policy.SMTPUTF8) under RFCs 6531 and 6532.Non-ASCII email address domains (but not localparts) can also be used with non-SMTPUTF8 policies by encoding the domain as an IDNA A-label. (The email package does not perform this encoding, because it cannot know whether the caller wants IDNA 2003, IDNA 2008, or some other variant such as UTS-46.)
📚 Documentation preview 📚: https://cpython-previews--122540.org.readthedocs.build/