[bpo-28414] Make all hostnames in SSL module IDN A-labels by tiran · Pull Request #5128 · python/cpython

Jan 19, 2018

Historically, our handling of international domain names (IDNs) in the
ssl module has been very broken. The flow went like:

1. User passes server_hostname= to the SSLSocket/SSLObject
   constructor. This gets normalized to an A-label by using the
   PyArg_Parse "et" mode: bytes objects get passed through
   unchanged (assumed to already be A-labels); str objects get run
   through .encode("idna") to convert them into A-labels.

2. newPySSLSocket takes this A-label, and for some reason decodes
   it *back* to a U-label, and stores that as the object's
   server_hostname attribute.

3. Later, this U-label server_hostname attribute gets passed to
   match_hostname, to compare against the hostname seen in the
   certificate. But certificates contain A-labels, and match_hostname
   expects to be passed an A-label, so this doesn't work at all.
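
The three steps above can be reproduced with nothing but the standard library's "idna" codec. This is only an illustrative sketch: the variable names are made up, the certificate entry is assumed, and plain string equality stands in for the real match_hostname logic (which also handles wildcards and other cases).

```python
# Illustrative sketch of steps 1-3; not the ssl module's actual code.
user_input = "pyth\u00f6n.org"  # "pythön.org", written with an escape for clarity

# Step 1: a str server_hostname is encoded to an A-label.
a_label = user_input.encode("idna")

# Step 2 (the bug): the A-label is decoded back to a U-label and stored.
stored_hostname = a_label.decode("idna")

# Step 3: certificates carry A-labels, so comparing against the stored
# U-label fails, while comparing against the A-label succeeds.
cert_dns_name = "xn--pythn-mua.org"  # assumed certificate entry
old_match = stored_hostname == cert_dns_name
new_match = a_label.decode("ascii") == cert_dns_name

print(a_label, stored_hostname, old_match, new_match)
```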

This PR fixes the problem by removing the pointless decoding at step
2, so that internally we always use A-labels, which matches how
internet protocols are designed in general: A-labels are used
everywhere internally and on-the-wire, and U-labels are basically just
for user interfaces.

This also matches the general advice to handle encoding/decoding once
at the edges, though for backwards-compatibility we continue to use
'str' objects to store A-labels, even though they're now always
ASCII. Technically there is a minor compatibility break here: if a
user examines the .server_hostname attribute of an ssl-wrapped socket,
then previously they would have seen a U-label like "pythön.org", and
now they'll see an A-label like "xn--pythn-mua.org". But this only
affects non-ASCII domain names, which have never worked in the first
place, so it seems unlikely that anyone is relying on the old
behavior.
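
Code that wants to show the hostname to a user can convert the A-label back to a U-label at the display edge. A minimal sketch, assuming the built-in "idna" codec is good enough for the caller; to_display_name is a hypothetical helper, not part of the ssl module.

```python
def to_display_name(server_hostname: str) -> str:
    """Convert an A-label hostname (an ASCII str) to a U-label for display.

    Hypothetical helper; uses the stdlib "idna" codec.
    """
    return server_hostname.encode("ascii").decode("idna")


print(to_display_name("xn--pythn-mua.org"))
print(to_display_name("python.org"))  # plain ASCII names pass through unchanged
```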

This PR also adds an end-to-end test for IDN hostname
validation. Previously there were no tests for this functionality.

Fixes bpo-28414.
All test certs must be generated by CPython's own test helper.

Signed-off-by: Christian Heimes <christian@python.org>

@tiran changed the title from "[bpo-28414][WIP] Make all hostnames in SSL module IDN A-labels" to "[bpo-28414] Make all hostnames in SSL module IDN A-labels"

Feb 22, 2018

Drop extra code for PEP 543 future compatibility in sni callback

Use callable() and encode_hostname in shim for old SNI callback.

Signed-off-by: Christian Heimes <christian@python.org>

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request

Feb 24, 2018
)

Previously, the ssl module stored international domain names (IDNs)
as U-labels. This is problematic for a number of reasons -- for
example, it made it impossible for users to use a different version
of IDNA than the one built into Python.

After this change, we always convert to A-labels as soon as possible,
and use them for all internal processing. In particular, the
server_hostname attribute is now an A-label, and on the server side
there's a new sni_callback that receives the SNI servername as an
A-label rather than a U-label.
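
As a rough sketch of how a server might use the new callback: the lookup table, pick_context, and the callback name below are hypothetical, and the contexts have no certificates loaded, so this only shows the A-label keyed dispatch rather than a working server.

```python
import ssl

# Contexts without certificates; a real server would call load_cert_chain().
default_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
idn_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

# Keys are A-labels, because that is what the callback receives.
contexts_by_a_label = {"xn--pythn-mua.org": idn_ctx}


def pick_context(server_name):
    """Return the context for an SNI name (A-label str, or None if absent)."""
    if server_name is None:
        return default_ctx
    return contexts_by_a_label.get(server_name.lower(), default_ctx)


def choose_context(ssl_socket, server_name, ssl_context):
    """SNI callback: swap in the context that matches the requested name."""
    ssl_socket.context = pick_context(server_name)
    return None  # None lets the handshake continue


default_ctx.sni_callback = choose_context
```

Keying the table by A-label means no IDNA conversion happens inside the callback.
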
(cherry picked from commit 11a1493)

Co-authored-by: Christian Heimes <christian@python.org>

njsmith pushed a commit that referenced this pull request

Feb 24, 2018
…H-5843)
