[3.15] gh-128110: Fix rfc2047 whitespace handling in email parser address headers (GH-130749) by miss-islington · Pull Request #149787 · python/cpython
…ess headers (pythonGH-130749) RFC 2047 Section 6.2 requires that "any 'linear-white-space' that separates a pair of adjacent 'encoded-word's is ignored." The modern header value parser correctly implements that for unstructured headers, but had missed a case in structured headers. This could cause a parsed address header to include extraneous spaces in a display-name. Switch to @bitdancer's fix from review feedback. Recharacterize space between ews as fws after parsing in get_phrase. RDM: This fix is dependent on the fact that "subsequent" atoms will never have leading whitespace because that's been consumed already. I don't think it's worth adding extra code for the possibility of leading whitespace because the parser won't produce it. It's a bit of parser fragility in the face of code changes, but I think that's a minor concern given the parser design (which is that it consumes whitespace greedily) (cherry picked from commit 7a4c6df) Co-authored-by: Mike Edmunds <medmunds@gmail.com> Co-authored-by: R David Murray <rdmurray@bitdance.com>
This was referenced
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters