
[bpo-28414] In SSL module, store server_hostname as an A-label by njsmith · Pull Request #3010 · python/cpython

@njsmith

Historically, our handling of internationalized domain names (IDNs) in
the ssl module has been very broken. The flow went like this:

1. User passes server_hostname= to the SSLSocket/SSLObject
   constructor. This gets normalized to an A-label by using the
   PyArg_Parse "et" mode: bytes objects get passed through
   unchanged (assumed to already be A-labels); str objects get run
   through .encode("idna") to convert them into A-labels.

2. newPySSLSocket takes this A-label, and for some reason decodes
   it *back* to a U-label, and stores that as the object's
   server_hostname attribute.

3. Later, this U-label server_hostname attribute gets passed to
   match_hostname, to compare against the hostname seen in the
   certificate. But certificates contain A-labels, and match_hostname
   expects to be passed an A-label, so this doesn't work at all.
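The three steps above can be sketched in pure Python using the built-in
"idna" codec. This is only an illustration, not the actual C code in
_ssl.c: the helper name `to_a_label` is hypothetical, and the certificate
check is reduced to a plain string comparison rather than a real
match_hostname call.

```python
def to_a_label(hostname):
    """Hypothetical stand-in for step 1 (the "et" argument parsing)."""
    if isinstance(hostname, (bytes, bytearray)):
        # bytes are assumed to already be A-labels and pass through unchanged
        return bytes(hostname)
    # str objects are converted to A-labels via the "idna" codec
    return hostname.encode("idna")


# Step 1: normalize the user-supplied hostname to an A-label.
a_label = to_a_label("pyth\u00f6n.org")

# Step 2 (old behavior): decode the A-label back to a U-label and store that.
stored_old = a_label.decode("idna")

# Step 2 (this PR): keep the A-label, stored as an ASCII-only str.
stored_new = a_label.decode("ascii")

# Step 3: certificates carry A-labels, so only the A-label form can match.
# A plain equality check stands in for the real hostname matching here.
cert_name = "xn--pythn-mua.org"
old_matches = stored_old == cert_name
new_matches = stored_new == cert_name
```

With the old flow the stored value is the U-label, so the comparison
against the certificate name fails; with the decoding removed, both sides
are A-labels and the comparison succeeds.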

This PR fixes the problem by removing the pointless decoding at step
2, so that internally we always use A-labels, which matches how
internet protocols are designed in general: A-labels are used
everywhere internally and on-the-wire, and U-labels are basically just
for user interfaces.

This also matches the general advice to handle encoding/decoding once
at the edges, though for backwards-compatibility we continue to use
'str' objects to store A-labels, even though they're now always
ASCII. Technically there is a minor compatibility break here: if a
user examines the .server_hostname attribute of an ssl-wrapped socket,
then previously they would have seen a U-label like "pythön.org", and
now they'll see an A-label like "xn--pythn-mua.org". But this only
affects non-ASCII domain names, which have never worked in the first
place, so it seems unlikely that anyone is relying on the old
behavior.
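For code that does want to show the U-label to a user, the A-label can
be converted back at the display edge. A minimal sketch, assuming the
attribute value is the ASCII str described above; the helper name
`for_display` is hypothetical and the value is passed in directly
rather than read from a live socket:

```python
def for_display(server_hostname):
    """Hypothetical helper: turn a stored A-label into a U-label for UIs."""
    if server_hostname is None:
        # No hostname was given when the socket was wrapped.
        return None
    # The stored value is ASCII-only, so encode to bytes and decode via "idna".
    return server_hostname.encode("ascii").decode("idna")


# Stand-in for the new value of sock.server_hostname.
shown = for_display("xn--pythn-mua.org")

# Plain ASCII hostnames round-trip unchanged.
plain = for_display("python.org")
```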

This PR also adds an end-to-end test for IDN hostname
validation. Previously there were no tests for this functionality.

Fixes bpo-28414.