◐ Shell
clean mode source ↗

gh-95494: Fix transport EOF handling in OpenSSL 3.0 by davidben · Pull Request #95495 · python/cpython

pythonGH-25309 enabled SSL_OP_IGNORE_UNEXPECTED_EOF by default, with a comment
that it restores OpenSSL 1.1.1 behavior, but this wasn't quite right.
That option causes OpenSSL to treat transport EOF as the same as
close_notify (i.e. SSL_ERROR_ZERO_RETURN), whereas Python actually has
distinct SSLEOFError and SSLZeroReturnError exceptions. (The latter is
usually mapped to a zero return from read.) In OpenSSL 1.1.1, the ssl
module would raise them for transport EOF and close_notify,
respectively. In OpenSSL 3.0, both act like close_notify.

Fix this by, instead, just detecting SSL_R_UNEXPECTED_EOF_WHILE_READING
and mapping that to the other exception type.

There doesn't seem to have been any unit test of this error, so fill in
the missing one. This had to be done with the BIO path because it's
actually slightly tricky to simulate a transport EOF with Python's fd
based APIs. (If you instruct the server to close the socket, it gets
confused, probably because the server's SSL object is still referencing
the now dead fd?)

@davidben

@davidben

@davidben

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request

Mar 24, 2023
…5495)

pythonGH-25309 enabled SSL_OP_IGNORE_UNEXPECTED_EOF by default, with a comment
that it restores OpenSSL 1.1.1 behavior, but this wasn't quite right.
That option causes OpenSSL to treat transport EOF as the same as
close_notify (i.e. SSL_ERROR_ZERO_RETURN), whereas Python actually has
distinct SSLEOFError and SSLZeroReturnError exceptions. (The latter is
usually mapped to a zero return from read.) In OpenSSL 1.1.1, the ssl
module would raise them for transport EOF and close_notify,
respectively. In OpenSSL 3.0, both act like close_notify.

Fix this by, instead, just detecting SSL_R_UNEXPECTED_EOF_WHILE_READING
and mapping that to the other exception type.

There doesn't seem to have been any unit test of this error, so fill in
the missing one. This had to be done with the BIO path because it's
actually slightly tricky to simulate a transport EOF with Python's fd
based APIs. (If you instruct the server to close the socket, it gets
confused, probably because the server's SSL object is still referencing
the now dead fd?)
(cherry picked from commit 420bbb7)

Co-authored-by: David Benjamin <davidben@google.com>

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request

Mar 24, 2023
…5495)

pythonGH-25309 enabled SSL_OP_IGNORE_UNEXPECTED_EOF by default, with a comment
that it restores OpenSSL 1.1.1 behavior, but this wasn't quite right.
That option causes OpenSSL to treat transport EOF as the same as
close_notify (i.e. SSL_ERROR_ZERO_RETURN), whereas Python actually has
distinct SSLEOFError and SSLZeroReturnError exceptions. (The latter is
usually mapped to a zero return from read.) In OpenSSL 1.1.1, the ssl
module would raise them for transport EOF and close_notify,
respectively. In OpenSSL 3.0, both act like close_notify.

Fix this by, instead, just detecting SSL_R_UNEXPECTED_EOF_WHILE_READING
and mapping that to the other exception type.

There doesn't seem to have been any unit test of this error, so fill in
the missing one. This had to be done with the BIO path because it's
actually slightly tricky to simulate a transport EOF with Python's fd
based APIs. (If you instruct the server to close the socket, it gets
confused, probably because the server's SSL object is still referencing
the now dead fd?)
(cherry picked from commit 420bbb7)

Co-authored-by: David Benjamin <davidben@google.com>

Fidget-Spinner pushed a commit to Fidget-Spinner/cpython that referenced this pull request

Mar 27, 2023
…5495)

pythonGH-25309 enabled SSL_OP_IGNORE_UNEXPECTED_EOF by default, with a comment
that it restores OpenSSL 1.1.1 behavior, but this wasn't quite right.
That option causes OpenSSL to treat transport EOF as the same as
close_notify (i.e. SSL_ERROR_ZERO_RETURN), whereas Python actually has
distinct SSLEOFError and SSLZeroReturnError exceptions. (The latter is
usually mapped to a zero return from read.) In OpenSSL 1.1.1, the ssl
module would raise them for transport EOF and close_notify,
respectively. In OpenSSL 3.0, both act like close_notify.

Fix this by, instead, just detecting SSL_R_UNEXPECTED_EOF_WHILE_READING
and mapping that to the other exception type.

There doesn't seem to have been any unit test of this error, so fill in
the missing one. This had to be done with the BIO path because it's
actually slightly tricky to simulate a transport EOF with Python's fd
based APIs. (If you instruct the server to close the socket, it gets
confused, probably because the server's SSL object is still referencing
the now dead fd?)

ambv pushed a commit that referenced this pull request

Mar 27, 2023
…#103006)

GH-25309 enabled SSL_OP_IGNORE_UNEXPECTED_EOF by default, with a comment
that it restores OpenSSL 1.1.1 behavior, but this wasn't quite right.
That option causes OpenSSL to treat transport EOF as the same as
close_notify (i.e. SSL_ERROR_ZERO_RETURN), whereas Python actually has
distinct SSLEOFError and SSLZeroReturnError exceptions. (The latter is
usually mapped to a zero return from read.) In OpenSSL 1.1.1, the ssl
module would raise them for transport EOF and close_notify,
respectively. In OpenSSL 3.0, both act like close_notify.

Fix this by, instead, just detecting SSL_R_UNEXPECTED_EOF_WHILE_READING
and mapping that to the other exception type.

There doesn't seem to have been any unit test of this error, so fill in
the missing one. This had to be done with the BIO path because it's
actually slightly tricky to simulate a transport EOF with Python's fd
based APIs. (If you instruct the server to close the socket, it gets
confused, probably because the server's SSL object is still referencing
the now dead fd?)
(cherry picked from commit 420bbb7)

Co-authored-by: David Benjamin <davidben@google.com>

ambv pushed a commit that referenced this pull request

Mar 27, 2023
…#103007)

GH-25309 enabled SSL_OP_IGNORE_UNEXPECTED_EOF by default, with a comment
that it restores OpenSSL 1.1.1 behavior, but this wasn't quite right.
That option causes OpenSSL to treat transport EOF as the same as
close_notify (i.e. SSL_ERROR_ZERO_RETURN), whereas Python actually has
distinct SSLEOFError and SSLZeroReturnError exceptions. (The latter is
usually mapped to a zero return from read.) In OpenSSL 1.1.1, the ssl
module would raise them for transport EOF and close_notify,
respectively. In OpenSSL 3.0, both act like close_notify.

Fix this by, instead, just detecting SSL_R_UNEXPECTED_EOF_WHILE_READING
and mapping that to the other exception type.

There doesn't seem to have been any unit test of this error, so fill in
the missing one. This had to be done with the BIO path because it's
actually slightly tricky to simulate a transport EOF with Python's fd
based APIs. (If you instruct the server to close the socket, it gets
confused, probably because the server's SSL object is still referencing
the now dead fd?)
(cherry picked from commit 420bbb7)

Co-authored-by: David Benjamin <davidben@google.com>

warsaw pushed a commit to warsaw/cpython that referenced this pull request

Apr 11, 2023
…5495)

pythonGH-25309 enabled SSL_OP_IGNORE_UNEXPECTED_EOF by default, with a comment
that it restores OpenSSL 1.1.1 behavior, but this wasn't quite right.
That option causes OpenSSL to treat transport EOF as the same as
close_notify (i.e. SSL_ERROR_ZERO_RETURN), whereas Python actually has
distinct SSLEOFError and SSLZeroReturnError exceptions. (The latter is
usually mapped to a zero return from read.) In OpenSSL 1.1.1, the ssl
module would raise them for transport EOF and close_notify,
respectively. In OpenSSL 3.0, both act like close_notify.

Fix this by, instead, just detecting SSL_R_UNEXPECTED_EOF_WHILE_READING
and mapping that to the other exception type.

There doesn't seem to have been any unit test of this error, so fill in
the missing one. This had to be done with the BIO path because it's
actually slightly tricky to simulate a transport EOF with Python's fd
based APIs. (If you instruct the server to close the socket, it gets
confused, probably because the server's SSL object is still referencing
the now dead fd?)

WillChilds-Klein added a commit to aws/aws-lc that referenced this pull request

Jan 19, 2024
# Notes

This change implements `SSL_MODE_AUTO_RETRY`, which was previously ignored. When this option (off by default) is set, `SSL_get_error`s behavior hews a little more closely to OpenSSL when processing a return code of 0 (i.e. empty I/O). With this option set, we allow error processing for empty reads if we're in a retryable state. With this option unset, we treat (most) 0-valued return codes as `SSL_ERROR_SYSCALL`, using it as a sort-of alias for EOF in the underlying transport. Internal calls to `SSL_get_error` (such as [those in `bio_ssl.cc`](https://github.com/aws/aws-lc/blob/main/ssl/bio_ssl.cc#L29)) then observe the  `SSL_ERROR_SYSCALL` and fail to process potential retries.

OpenSSL implements this mode differently, only checking its value in its [main read loop](https://github.com/openssl/openssl/blob/398011848468c7e8e481b295f7904afc30934217/ssl/record/rec_layer_s3.c#L992). Neither OpenSSL [1.1.1][6] nor [OpenSSL 3.x][10] return `SSL_ERROR_SYSCALL` early for empty reads, instead allowing them to be processed for retryable state. BoringSSL [appears to have diverged][7] from this behavior in 2015, but that commit was just a revert of a regression committed ~1 month prior. Their [previous implementation][18] closely resembles [that of OpenSSL 1.0.2][13], guarding retryable error processing with `i < 0` checks. `SSL_get_error` in [OpenSSL 1.1.1][12] and [OpenSSL 3.0][11], however, have no such guards. This change resembles the latter two. 

Why is it important that we process 0-valued return codes more fully? The recommended SSL IO functions (`SSL_read`, `SSL_write`, etc.) are, by contract, allowed to return negative numbers to indicate an error or non-success condition. However, their `size_t`-clean counterparts (`SSL_read_ex`, `SSL_write_ex`, etc.) must, by contract, only return 0 or 1. This means that passing the return code from the various `_ex` functions to `SSL_get_error` loses error reporting information, resulting in inconsistent handling of otherwise identical error cases between the two function families.

Another solution to the `_ex` functions' loss of error reporting information would be to change the contract of those functions to allow negative return values, directly returning the return code from internal calls to `SSL_read` or `SSL_write`. I decided not to take this approach as it would break callers who do not account for negative return values in their error handling.

# Relevant Links

- [OpenSSL 1.0.2 documentation for `SSL_get_error`][3]
- [OpenSSL 1.1.1 documentation for `SSL_get_error`][4]
- [OpenSSL 3.0 documentation for `SSL_get_error`][5]
- [OpenSSL 1.0.2 implementation for `SSL_get_error`][13]
- [OpenSSL 1.1.1 implementation for `SSL_get_error`][12]
- [OpenSSL 3.0 implementation for `SSL_get_error`][11]
- [OpenSSL master (3.2) implementation for `SSL_get_error`][10]
- [OpenSSL 1.1.1 commit dropping `ret < 0` guards for hint processing][6]
- [Unresolved OpenSSL issue for distinguishing EOF from failure in `BIO_read_ex`][1]
- [BoringSSL commit introducing `SSL_ERROR_SYSCALL` for `ret == 0` case][7]
- [BoringSSL Issue 503: Stop preserving arbitrary BIO error return values][2]
- [BoringSSL Issue 24: Sanitize SSL and BIO return codes.][15]
- [OpenSSL 3.0 commit returning `SSL_ERROR_SSL`, add reason to err stack on EOF][14]
- [OpenSSL 3.0 commit introducing `SSL_OP_IGNORE_UNEXPECTED_EOF`  option][8]
- [CPython commit from @davidben removing CPython's use of `SSL_OP_IGNORE_UNEXPECTED_EOF`][9]
- [CPython commit switching from `SSL_read` and `SSL_write` to `SSL_read_ex` and `SSL_write_ex`][16]
- [OpenSSL 1.1.1 documentation for `SSL_MODE_AUTO_RETRY`][17]



[1]: openssl/openssl#8208
[2]: https://bugs.chromium.org/p/boringssl/issues/detail?id=503
[3]: https://www.openssl.org/docs/man1.0.2/man3/SSL_get_error.html
[4]: https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html
[5]: https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html
[6]: openssl/openssl@8051ab2
[7]: google/boringssl@9a38e92
[8]: openssl/openssl@09b90e0
[9]: python/cpython#95495
[10]: https://github.com/openssl/openssl/blob/master/ssl/ssl_lib.c#L4601
[11]: https://github.com/openssl/openssl/blob/openssl-3.0/ssl/ssl_lib.c#L3839
[12]: https://github.com/openssl/openssl/blob/OpenSSL_1_1_1-stable/ssl/ssl_lib.c#L3617
[13]: https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/ssl/ssl_lib.c#L2709
[14]: openssl/openssl@d924dbf
[15]: https://bugs.chromium.org/p/boringssl/issues/detail?id=24
[16]: python/cpython@89d1550
[17]: https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_mode.html
[18]: google/boringssl@fcf2583#diff-38e1eecb6013c2fe8bf194e5020ce1afd6e31c371d7849e041464ad822fe745bL2271

WillChilds-Klein added a commit to WillChilds-Klein/aws-lc that referenced this pull request

Jan 19, 2024
This change implements `SSL_MODE_AUTO_RETRY`, which was previously ignored. When this option (off by default) is set, `SSL_get_error`s behavior hews a little more closely to OpenSSL when processing a return code of 0 (i.e. empty I/O). With this option set, we allow error processing for empty reads if we're in a retryable state. With this option unset, we treat (most) 0-valued return codes as `SSL_ERROR_SYSCALL`, using it as a sort-of alias for EOF in the underlying transport. Internal calls to `SSL_get_error` (such as [those in `bio_ssl.cc`](https://github.com/aws/aws-lc/blob/main/ssl/bio_ssl.cc#L29)) then observe the  `SSL_ERROR_SYSCALL` and fail to process potential retries.

OpenSSL implements this mode differently, only checking its value in its [main read loop](https://github.com/openssl/openssl/blob/398011848468c7e8e481b295f7904afc30934217/ssl/record/rec_layer_s3.c#L992). Neither OpenSSL [1.1.1][6] nor [OpenSSL 3.x][10] return `SSL_ERROR_SYSCALL` early for empty reads, instead allowing them to be processed for retryable state. BoringSSL [appears to have diverged][7] from this behavior in 2015, but that commit was just a revert of a regression committed ~1 month prior. Their [previous implementation][18] closely resembles [that of OpenSSL 1.0.2][13], guarding retryable error processing with `i < 0` checks. `SSL_get_error` in [OpenSSL 1.1.1][12] and [OpenSSL 3.0][11], however, have no such guards. This change resembles the latter two.

Why is it important that we process 0-valued return codes more fully? The recommended SSL IO functions (`SSL_read`, `SSL_write`, etc.) are, by contract, allowed to return negative numbers to indicate an error or non-success condition. However, their `size_t`-clean counterparts (`SSL_read_ex`, `SSL_write_ex`, etc.) must, by contract, only return 0 or 1. This means that passing the return code from the various `_ex` functions to `SSL_get_error` loses error reporting information, resulting in inconsistent handling of otherwise identical error cases between the two function families.

Another solution to the `_ex` functions' loss of error reporting information would be to change the contract of those functions to allow negative return values, directly returning the return code from internal calls to `SSL_read` or `SSL_write`. I decided not to take this approach as it would break callers who do not account for negative return values in their error handling.

- [OpenSSL 1.0.2 documentation for `SSL_get_error`][3]
- [OpenSSL 1.1.1 documentation for `SSL_get_error`][4]
- [OpenSSL 3.0 documentation for `SSL_get_error`][5]
- [OpenSSL 1.0.2 implementation for `SSL_get_error`][13]
- [OpenSSL 1.1.1 implementation for `SSL_get_error`][12]
- [OpenSSL 3.0 implementation for `SSL_get_error`][11]
- [OpenSSL master (3.2) implementation for `SSL_get_error`][10]
- [OpenSSL 1.1.1 commit dropping `ret < 0` guards for hint processing][6]
- [Unresolved OpenSSL issue for distinguishing EOF from failure in `BIO_read_ex`][1]
- [BoringSSL commit introducing `SSL_ERROR_SYSCALL` for `ret == 0` case][7]
- [BoringSSL Issue 503: Stop preserving arbitrary BIO error return values][2]
- [BoringSSL Issue 24: Sanitize SSL and BIO return codes.][15]
- [OpenSSL 3.0 commit returning `SSL_ERROR_SSL`, add reason to err stack on EOF][14]
- [OpenSSL 3.0 commit introducing `SSL_OP_IGNORE_UNEXPECTED_EOF`  option][8]
- [CPython commit from @davidben removing CPython's use of `SSL_OP_IGNORE_UNEXPECTED_EOF`][9]
- [CPython commit switching from `SSL_read` and `SSL_write` to `SSL_read_ex` and `SSL_write_ex`][16]
- [OpenSSL 1.1.1 documentation for `SSL_MODE_AUTO_RETRY`][17]

[1]: openssl/openssl#8208
[2]: https://bugs.chromium.org/p/boringssl/issues/detail?id=503
[3]: https://www.openssl.org/docs/man1.0.2/man3/SSL_get_error.html
[4]: https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html
[5]: https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html
[6]: openssl/openssl@8051ab2
[7]: google/boringssl@9a38e92
[8]: openssl/openssl@09b90e0
[9]: python/cpython#95495
[10]: https://github.com/openssl/openssl/blob/master/ssl/ssl_lib.c#L4601
[11]: https://github.com/openssl/openssl/blob/openssl-3.0/ssl/ssl_lib.c#L3839
[12]: https://github.com/openssl/openssl/blob/OpenSSL_1_1_1-stable/ssl/ssl_lib.c#L3617
[13]: https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/ssl/ssl_lib.c#L2709
[14]: openssl/openssl@d924dbf
[15]: https://bugs.chromium.org/p/boringssl/issues/detail?id=24
[16]: python/cpython@89d1550
[17]: https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_mode.html
[18]: google/boringssl@fcf2583#diff-38e1eecb6013c2fe8bf194e5020ce1afd6e31c371d7849e041464ad822fe745bL2271

dougch pushed a commit to dougch/aws-lc that referenced this pull request

Jan 30, 2024
# Notes

This change implements `SSL_MODE_AUTO_RETRY`, which was previously ignored. When this option (off by default) is set, `SSL_get_error`s behavior hews a little more closely to OpenSSL when processing a return code of 0 (i.e. empty I/O). With this option set, we allow error processing for empty reads if we're in a retryable state. With this option unset, we treat (most) 0-valued return codes as `SSL_ERROR_SYSCALL`, using it as a sort-of alias for EOF in the underlying transport. Internal calls to `SSL_get_error` (such as [those in `bio_ssl.cc`](https://github.com/aws/aws-lc/blob/main/ssl/bio_ssl.cc#L29)) then observe the  `SSL_ERROR_SYSCALL` and fail to process potential retries.

OpenSSL implements this mode differently, only checking its value in its [main read loop](https://github.com/openssl/openssl/blob/398011848468c7e8e481b295f7904afc30934217/ssl/record/rec_layer_s3.c#L992). Neither OpenSSL [1.1.1][6] nor [OpenSSL 3.x][10] return `SSL_ERROR_SYSCALL` early for empty reads, instead allowing them to be processed for retryable state. BoringSSL [appears to have diverged][7] from this behavior in 2015, but that commit was just a revert of a regression committed ~1 month prior. Their [previous implementation][18] closely resembles [that of OpenSSL 1.0.2][13], guarding retryable error processing with `i < 0` checks. `SSL_get_error` in [OpenSSL 1.1.1][12] and [OpenSSL 3.0][11], however, have no such guards. This change resembles the latter two. 

Why is it important that we process 0-valued return codes more fully? The recommended SSL IO functions (`SSL_read`, `SSL_write`, etc.) are, by contract, allowed to return negative numbers to indicate an error or non-success condition. However, their `size_t`-clean counterparts (`SSL_read_ex`, `SSL_write_ex`, etc.) must, by contract, only return 0 or 1. This means that passing the return code from the various `_ex` functions to `SSL_get_error` loses error reporting information, resulting in inconsistent handling of otherwise identical error cases between the two function families.

Another solution to the `_ex` functions' loss of error reporting information would be to change the contract of those functions to allow negative return values, directly returning the return code from internal calls to `SSL_read` or `SSL_write`. I decided not to take this approach as it would break callers who do not account for negative return values in their error handling.

# Relevant Links

- [OpenSSL 1.0.2 documentation for `SSL_get_error`][3]
- [OpenSSL 1.1.1 documentation for `SSL_get_error`][4]
- [OpenSSL 3.0 documentation for `SSL_get_error`][5]
- [OpenSSL 1.0.2 implementation for `SSL_get_error`][13]
- [OpenSSL 1.1.1 implementation for `SSL_get_error`][12]
- [OpenSSL 3.0 implementation for `SSL_get_error`][11]
- [OpenSSL master (3.2) implementation for `SSL_get_error`][10]
- [OpenSSL 1.1.1 commit dropping `ret < 0` guards for hint processing][6]
- [Unresolved OpenSSL issue for distinguishing EOF from failure in `BIO_read_ex`][1]
- [BoringSSL commit introducing `SSL_ERROR_SYSCALL` for `ret == 0` case][7]
- [BoringSSL Issue 503: Stop preserving arbitrary BIO error return values][2]
- [BoringSSL Issue 24: Sanitize SSL and BIO return codes.][15]
- [OpenSSL 3.0 commit returning `SSL_ERROR_SSL`, add reason to err stack on EOF][14]
- [OpenSSL 3.0 commit introducing `SSL_OP_IGNORE_UNEXPECTED_EOF`  option][8]
- [CPython commit from @davidben removing CPython's use of `SSL_OP_IGNORE_UNEXPECTED_EOF`][9]
- [CPython commit switching from `SSL_read` and `SSL_write` to `SSL_read_ex` and `SSL_write_ex`][16]
- [OpenSSL 1.1.1 documentation for `SSL_MODE_AUTO_RETRY`][17]



[1]: openssl/openssl#8208
[2]: https://bugs.chromium.org/p/boringssl/issues/detail?id=503
[3]: https://www.openssl.org/docs/man1.0.2/man3/SSL_get_error.html
[4]: https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html
[5]: https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html
[6]: openssl/openssl@8051ab2
[7]: google/boringssl@9a38e92
[8]: openssl/openssl@09b90e0
[9]: python/cpython#95495
[10]: https://github.com/openssl/openssl/blob/master/ssl/ssl_lib.c#L4601
[11]: https://github.com/openssl/openssl/blob/openssl-3.0/ssl/ssl_lib.c#L3839
[12]: https://github.com/openssl/openssl/blob/OpenSSL_1_1_1-stable/ssl/ssl_lib.c#L3617
[13]: https://github.com/openssl/openssl/blob/OpenSSL_1_0_2-stable/ssl/ssl_lib.c#L2709
[14]: openssl/openssl@d924dbf
[15]: https://bugs.chromium.org/p/boringssl/issues/detail?id=24
[16]: python/cpython@89d1550
[17]: https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_set_mode.html
[18]: google/boringssl@fcf2583#diff-38e1eecb6013c2fe8bf194e5020ce1afd6e31c371d7849e041464ad822fe745bL2271