◐ Shell
clean mode source ↗

gh-121650: Encode newlines in headers, and verify headers are sound by encukou · Pull Request #122233 · python/cpython

and others added 2 commits

July 24, 2024 15:30
This should fail for custom fold() implementations that aren't careful
about newlines.
Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

---

Credit for an earlier attempt:

Co-Authored-By: Bas Bloemsaat <bas@bloemsaat.org>

@encukou

I'm not touching other instances in this file, since this PR might
be backported to very old versions.

@encukou encukou marked this pull request as ready for review

July 29, 2024 13:18

@encukou

serhiy-storchaka

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

serhiy-storchaka

ambv pushed a commit to ambv/cpython that referenced this pull request

Aug 2, 2024
… are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

Yhg1s pushed a commit that referenced this pull request

Aug 6, 2024
…sound (GH-122233) (#122484)

gh-121650: Encode newlines in headers, and verify headers are sound (GH-122233)

GH-GH- Encode header parts that contain newlines

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

GH-GH- Verify that email headers are well-formed

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

Yhg1s pushed a commit that referenced this pull request

Aug 6, 2024
…sound (GH-122233) (#122599)

* gh-121650: Encode newlines in headers, and verify headers are sound (GH-122233)

- Encode header parts that contain newlines

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

- Verify that email headers are well-formed

This should fail for custom fold() implementations that aren't careful
about newlines.

Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
(cherry picked from commit 0976339)

* Document changes as made in 3.12.5

hroncok pushed a commit to fedora-python/cpython that referenced this pull request

Aug 6, 2024
…s are sound

pythongh-121650: Encode newlines in headers, and verify headers are sound (pythonGH-122233)

Encode header parts that contain newlines

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

Verify that email headers are well-formed

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

frenzymadness pushed a commit to frenzymadness/cpython that referenced this pull request

Aug 13, 2024
…s are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

frenzymadness pushed a commit to fedora-python/cpython that referenced this pull request

Aug 15, 2024
…s are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

stratakis pushed a commit to stratakis/cpython that referenced this pull request

Aug 15, 2024
…s are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

hrnciar added a commit to hrnciar/cpython that referenced this pull request

Aug 16, 2024
 headers are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

This patch also contains modified commit cherry picked from
c5bba85.

This commit was backported to simplify the backport of the other commit
fixing CVE. The only modification is a removal of one test case which
tests multiple changes in Python 3.7 and it wasn't working properly
with Python 3.6 where we backported only one change.

Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

hrnciar added a commit to fedora-python/cpython that referenced this pull request

Aug 16, 2024
 headers are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

This patch also contains modified commit cherry picked from
c5bba85.

This commit was backported to simplify the backport of the other commit
fixing CVE. The only modification is a removal of one test case which
tests multiple changes in Python 3.7 and it wasn't working properly
with Python 3.6 where we backported only one change.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

hrnciar added a commit to fedora-python/cpython that referenced this pull request

Aug 20, 2024
 headers are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

This patch also contains modified commit cherry picked from
c5bba85.

This commit was backported to simplify the backport of the other commit
fixing CVE. The only modification is a removal of one test case which
tests multiple changes in Python 3.7 and it wasn't working properly
with Python 3.6 where we backported only one change.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

blhsing pushed a commit to blhsing/cpython that referenced this pull request

Aug 22, 2024
…ound (pythonGH-122233)

## Encode header parts that contain newlines

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.


## Verify that email headers are well-formed

This should fail for custom fold() implementations that aren't careful
about newlines.


Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

ambv added a commit that referenced this pull request

Sep 4, 2024
…ound (GH-122233) (#122611)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

ambv added a commit that referenced this pull request

Sep 4, 2024
…sound (GH-122233) (#122608)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

Verify that email headers are well-formed.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

ambv added a commit that referenced this pull request

Sep 4, 2024
…sound (GH-122233) (#122609)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

ambv added a commit that referenced this pull request

Sep 4, 2024
…ound (GH-122233) (#122610)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>

brainhoard-github pushed a commit to distro-core-curated-mirrors/poky-contrib that referenced this pull request

Sep 16, 2024

hrnciar added a commit to fedora-python/cpython that referenced this pull request

Apr 23, 2025
…s are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

This patch also contains modified commit cherry picked from
c5bba85.

This commit was backported to simplify the backport of the other commit
fixing CVE. The only modification is a removal of one test case which
tests multiple changes in Python 3.7 and it wasn't working properly
with Python 3.6 where we backported only one change.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

hroncok pushed a commit to fedora-python/cpython that referenced this pull request

Jul 4, 2025
…s are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

This patch also contains modified commit cherry picked from
c5bba85.

This commit was backported to simplify the backport of the other commit
fixing CVE. The only modification is a removal of one test case which
tests multiple changes in Python 3.7 and it wasn't working properly
with Python 3.6 where we backported only one change.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

frenzymadness pushed a commit to fedora-python/cpython that referenced this pull request

Aug 12, 2025
…s are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

This patch also contains modified commit cherry picked from
c5bba85.

This commit was backported to simplify the backport of the other commit
fixing CVE. The only modification is a removal of one test case which
tests multiple changes in Python 3.7 and it wasn't working properly
with Python 3.6 where we backported only one change.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>

sethmlarson pushed a commit to sethmlarson/cpython that referenced this pull request

Jan 21, 2026
pythonGH-122233 added an implementation to `Generator`
to refuse to serialize (write) headers that
are unsafely folded or delimited.

This revision adds the same implementation
to `BytesGenerator`, so it gets the same safety protections
for unsafely folded or delimited headers

Co-authored-by: Denis Ledoux <5822488+beledouxdenis@users.noreply.github.com>
Co-authored-by: Petr Viktorin <302922+encukou@users.noreply.github.com>
Co-authored-by: Bas Bloemsaat <1586868+basbloemsaat@users.noreply.github.com>

sethmlarson pushed a commit to sethmlarson/cpython that referenced this pull request

Jan 21, 2026
pythonGH-122233 added an implementation to `Generator`
to refuse to serialize (write) headers that
are unsafely folded or delimited.

This revision adds the same implementation
to `BytesGenerator`, so it gets the same safety protections
for unsafely folded or delimited headers

Co-authored-by: Denis Ledoux <5822488+beledouxdenis@users.noreply.github.com>
Co-authored-by: Petr Viktorin <302922+encukou@users.noreply.github.com>
Co-authored-by: Bas Bloemsaat <1586868+basbloemsaat@users.noreply.github.com>

sethmlarson pushed a commit to sethmlarson/cpython that referenced this pull request

Jan 21, 2026
pythonGH-122233 added an implementation to `Generator`
to refuse to serialize (write) headers that
are unsafely folded or delimited.

This revision adds the same implementation
to `BytesGenerator`, so it gets the same safety protections
for unsafely folded or delimited headers

Co-authored-by: Denis Ledoux <5822488+beledouxdenis@users.noreply.github.com>
Co-authored-by: Petr Viktorin <302922+encukou@users.noreply.github.com>
Co-authored-by: Bas Bloemsaat <1586868+basbloemsaat@users.noreply.github.com>

hroncok pushed a commit to fedora-python/cpython that referenced this pull request

Feb 3, 2026
…s are sound (pythonGH-122233)

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "quoted-word" scheme is a valid way to include
a newline character in a header value, just like we already allow
undecodable bytes or control characters.
They do need to be properly quoted when serialized to text, though.

This should fail for custom fold() implementations that aren't careful
about newlines.

(cherry picked from commit 0976339)

This patch also contains modified commit cherry picked from
c5bba85.

This commit was backported to simplify the backport of the other commit
fixing CVE. The only modification is a removal of one test case which
tests multiple changes in Python 3.7 and it wasn't working properly
with Python 3.6 where we backported only one change.

Co-authored-by: Petr Viktorin <encukou@gmail.com>
Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: bsiem <52461103+bsiem@users.noreply.github.com>