Rustls integration improvements#7946

im-0

Primary goal of this PR is to fix #7891

Changes:

Do not call import socket on each send()/recv() when using rustls
Implement reading of at most one TLS record from socket. Previous algorithm didn't take into account that recv() may return less data than requested even for blocking sockets
Remove special handling of rustls "buffer full" errors. One of that checks is a root cause of a hang with 100% CPU core utilization. This PR also adds a regression test for this particular problem.
Replace own TlsConnection with rustls::Connection
Dead code removed
rustls errors are no longer presented as as OSError(0, "Success")
Infinite loop "detection" removed from rustls handshake code. TLS handshake cannot be infinite. Any infinite loop here is a serious bug in implementation and should be fixed properly. Also, this code triggered in some cases (very short reads) with misleading ssl_error.SSLWantReadError: The operation did not complete (read).
Deduplicated verify flags / record-size constants
Used slightly larger "max encrypted TLS record length" from OpenSSL headers
Socket wait fixes:
- socket_wait() called from TLS glue code now allows threads (vm.allow_threads())
- socket_wait() called from TLS glue code now properly handles EINTR on *nix
- select()/poll() error conditions are checked
- poll() is used on *nix so socket descriptor values are not limited to <FD_SETSIZE

Please note that this is no way near complete and current implementation probably still has a ton of bugs.

Comments

(coderabbit) Looking up send/recv on socket.socket hard-wires base-class dispatch and skips overrides on socket subclasses or instance-attached methods. That changes I/O behavior for wrapped custom sockets compared to resolving the methods from args.sock.

It is a current behavior and it is not that easy to change. I tried to naively replace it with calls to sock.send()/sock.recv() and it resulted in deadlocks.

Summary by CodeRabbit

Bug Fixes
- More reliable TLS handshakes and close-notify behavior; improved handling of short/partial socket reads and connect timeouts.
- Improved timeout and error mapping for socket and TLS I/O to reduce spurious failures.
Refactor
- Consolidated TLS connection model and standardized socket readiness/waiting for consistent cross-platform I/O behavior.
- Simplified SSL/TLS state and buffering for more predictable reads/writes.
Tests
- Added regression tests for fragmented/misaligned TLS record receives and short-read scenarios.

coderabbitai

📝 Walkthrough

Walkthrough

Unifies socket waiting with SockWaitKind/sock_wait and deadline handling, updates PySocket call sites, migrates SSL code to rustls::Connection with TLS-record-aware socket reads and cached socket methods, tightens handshake/SNI flows, updates error mapping, and adds two regression tests for short/fragmented recv behavior.

Changes

Socket Wait API and SSL/TLS Integration

Layer / File(s)	Summary
host_env: remove PollKind/poll_socket `crates/host_env/src/socket.rs`	Remove Unix-only `PollKind` and `poll_socket`; adjust Unix imports.
SockWait core `crates/stdlib/src/socket.rs`	Introduce `SockWaitKind`, `sock_wait` and `sock_wait_deadline` with platform implementations (Unix poll_fds, Windows select) and re-export them.
Socket operations migration `crates/stdlib/src/socket.rs`	Update `PySocket` helpers and all socket operations (`connect`, `accept`, `recv`, `send`, `sendall`, `sendmsg`, `recvmsg`, etc.) to use `SockWaitKind` and deadline-based waiting; simplify connect completion to rely on `sock.take_error()`.
SSL imports, constants, and context layout `crates/stdlib/src/ssl.rs`	Switch imports to `rustls::Connection` and `SockWaitKind`; extend OpenSSL-compat sizing/constants; rename session server-name storage to `_server_name`; remove legacy `PySSLContext` fields and add cached socket methods, TLS record header buffer, `sni_callback_processed`, and `TLS_RECORD_HEADER_SIZE`.
Socket-mode TLS integration `crates/stdlib/src/ssl.rs`	Cache Python `socket.send`/`socket.recv`, add `sock_recv_at_most_one_tls_record` to preserve record boundaries, and switch handshake/flush/shutdown waits to `sock_wait` variants.
ssl/compat.rs: rustls::Connection migration `crates/stdlib/src/ssl/compat.rs`	Remove internal `TlsConnection` enum and drive rustls via `rustls::Connection`; add record-aware recv helpers, rewrite handshake loop termination and SNI restart gating, update BIO draining sizing, reorganize `ssl_read_tls_records`, and adjust IO→Python error mapping.
Small cleanups and tests `crates/stdlib/src/ssl/error.rs`, `crates/stdlib/src/ssl/cert.rs`, `extra_tests/snippets/*`	Remove redundant allow attributes, adjust cert import source, update error mapping (UnexpectedEof→EOF; raw_os_error None→PySSLError), and add two regression tests exercising short/fragmented recv conditions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

RustPython/RustPython#7604: Adds the host_env PollKind/poll_socket infrastructure that this PR removes and replaces with SockWaitKind/sock_wait.

Suggested reviewers

youknowone
ShaharNaveh

Poem

🐰 From polls and selects I hopped away,
SockWaitKind now waits both night and day.
One record at once, tidy bytes in line,
Handshakes restart once, SNI’s now fine.
Hooray — a small hop toward cleaner time!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	The title 'Rustls integration improvements' is vague and generic, using non-descriptive terms that don't convey the specific main objective of the changeset.	Consider using a more descriptive title that captures the primary fix, such as 'Fix rustls hang by handling short reads and socket waits properly' or similar.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	The pull request comprehensively addresses all coding requirements from issue `#7891`: fixes recv/read behavior with short read handling and one TLS record per socket read, removes problematic buffer-full error handling with regression tests, replaces custom TlsConnection with rustls::Connection, fixes handshake logic, improves socket wait semantics (thread allowance, EINTR handling, error checks, poll on Unix), and properly reports rustls errors.
Out of Scope Changes check	✅ Passed	All changes are directly related to rustls integration improvements and addressing issue `#7891`. No out-of-scope modifications detected.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py (2)
205-216: 💤 Low value

Document the -17 TLS overhead constant.

max(1, n - 17) derives plaintext sizes from the captured wire record sizes by subtracting an assumed 17 bytes of TLS 1.3 framing overhead (5-byte record header + 1-byte content type + 11-byte AEAD tag minus padding, depending on cipher). A short comment would make the intent obvious to future readers, since the magic number is otherwise opaque.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py` around lines 205
- 216, The code uses a magic constant "-17" when computing plaintext_sizes from
TLS_RECORD_BODY_SIZES (plaintext_sizes = [max(1, n - 17) ...]) which is unclear;
update the code by adding a brief inline comment next to plaintext_sizes (or
above its definition) that explains the origin of 17 (5‑byte TLS record header +
1‑byte content type + ~11‑byte AEAD tag for TLS 1.3 framing) and notes that this
is an approximation for the captured wire record sizes so future readers
understand the subtraction, referencing variables TLS_RECORD_BODY_SIZES and
plaintext_sizes.
175-193: 💤 Low value

Handshake/read loops don't detect peer closing the connection.

If conn.recv(65536) returns b"" (client disconnects mid-handshake or before sending \r\n\r\n), incoming.write(b"") is a no-op and the next tls.do_handshake()/tls.read() re-raises SSLWantReadError, looping forever until guard_timeout aborts the process. The guard_timeout safety net covers this, but the failure mode would be a 20-second hang plus a process abort rather than a clean test failure with a useful traceback. Consider breaking out on empty recv (e.g. raise/return) so a regression in the client surfaces directly.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py` around lines 175
- 193, The handshake and read loops using tls.do_handshake() and tls.read()
never detect a peer close because conn.recv(65536) returning b"" is ignored by
incoming.write(), causing infinite SSLWantReadError retries; modify both loops
(the block handling SSLWantReadError in the handshake loop and the
SSLWantReadError handler in the request/read loop) to check the result of
conn.recv(65536) and if it is b"" immediately break/raise an exception (or
return) so the function fails fast instead of looping until guard_timeout; keep
using drain_outgoing(outgoing, conn) as before but ensure you handle the
empty-recv case consistently to surface client disconnects cleanly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@extra_tests/snippets/stdlib_ssl_short_recv.py`:
- Line 2: Remove the unused top-level import "random" from the file (the import
statement referencing the random module in stdlib_ssl_short_recv.py) since it is
not referenced anywhere and causes a Flake8 F401; simply delete that import line
so the module no longer imports random unnecessarily.

In `@extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py`:
- Around line 144-148: The timeout guard in function guard_timeout prints the
wrong test filename; update the stderr message to reference the current file
name "stdlib_urllib_https_misaligned_recv.py" instead of
"stdlib_urllib_https_short_recv.py" so the printed line in guard_timeout
correctly identifies the failing test when it times out.

---

Nitpick comments:
In `@extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py`:
- Around line 205-216: The code uses a magic constant "-17" when computing
plaintext_sizes from TLS_RECORD_BODY_SIZES (plaintext_sizes = [max(1, n - 17)
...]) which is unclear; update the code by adding a brief inline comment next to
plaintext_sizes (or above its definition) that explains the origin of 17 (5‑byte
TLS record header + 1‑byte content type + ~11‑byte AEAD tag for TLS 1.3 framing)
and notes that this is an approximation for the captured wire record sizes so
future readers understand the subtraction, referencing variables
TLS_RECORD_BODY_SIZES and plaintext_sizes.
- Around line 175-193: The handshake and read loops using tls.do_handshake() and
tls.read() never detect a peer close because conn.recv(65536) returning b"" is
ignored by incoming.write(), causing infinite SSLWantReadError retries; modify
both loops (the block handling SSLWantReadError in the handshake loop and the
SSLWantReadError handler in the request/read loop) to check the result of
conn.recv(65536) and if it is b"" immediately break/raise an exception (or
return) so the function fails fast instead of looping until guard_timeout; keep
using drain_outgoing(outgoing, conn) as before but ensure you handle the
empty-recv case consistently to surface client disconnects cleanly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 36cac236-b9a6-408d-b2e0-d26dd9385242

📥 Commits

Reviewing files that changed from the base of the PR and between c845861 and 6dccca7.

📒 Files selected for processing (8)

crates/host_env/src/socket.rs
crates/stdlib/src/socket.rs
crates/stdlib/src/ssl.rs
crates/stdlib/src/ssl/cert.rs
crates/stdlib/src/ssl/compat.rs
crates/stdlib/src/ssl/error.rs
extra_tests/snippets/stdlib_ssl_short_recv.py
extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py

💤 Files with no reviewable changes (2)

crates/stdlib/src/ssl/error.rs
crates/host_env/src/socket.rs

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/stdlib/src/ssl.rs (1)

1894-1895: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Cache the bound socket methods from the wrapped object.

Looking up send/recv on socket.socket hard-wires base-class dispatch and skips overrides on socket subclasses or instance-attached methods. That changes I/O behavior for wrapped custom sockets compared to resolving the methods from args.sock.

🐛 Proposed fix

-            let socket_mod = vm.import("socket", 0)?;
-            let socket_class = socket_mod.get_attr("socket", vm)?;
-
             // Create _SSLSocket instance
             let ssl_socket = PySSLSocket {
                 sock: args.sock.clone(),
-                sock_send_method: socket_class.get_attr("send", vm)?,
-                sock_recv_method: socket_class.get_attr("recv", vm)?,
+                sock_send_method: args.sock.get_attr("send", vm)?,
+                sock_recv_method: args.sock.get_attr("recv", vm)?,
                 tls_record_header_buf: vm
                     .ctx
                     .new_bytearray(Vec::with_capacity(TLS_RECORD_HEADER_SIZE))
                     .into(),

-            let socket_mod = vm.import("socket", 0)?;
-            let socket_class = socket_mod.get_attr("socket", vm)?;
-
             // Create _SSLSocket instance with BIO mode
             let ssl_socket = PySSLSocket {
                 sock: vm.ctx.none(), // No socket in BIO mode
-                sock_send_method: socket_class.get_attr("send", vm)?,
-                sock_recv_method: socket_class.get_attr("recv", vm)?,
+                sock_send_method: vm.ctx.none(),
+                sock_recv_method: vm.ctx.none(),
                 tls_record_header_buf: vm.ctx.none(),

-            self.sock_recv_method
-                .call((self.sock.clone(), vm.ctx.new_int(size)), vm)
+            self.sock_recv_method.call((vm.ctx.new_int(size),), vm)
...
-            self.sock_send_method
-                .call((self.sock.clone(), vm.ctx.new_bytes(data.to_vec())), vm)
+            self.sock_send_method
+                .call((vm.ctx.new_bytes(data.to_vec()),), vm)

Also applies to: 1975-1976

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/stdlib/src/ssl.rs` around lines 1894 - 1895, The code currently caches
send/recv using socket_class.get_attr which forces base-class lookup and ignores
overrides on the wrapped instance; change the lookup to retrieve the bound
methods from the actual wrapped object (args.sock.get_attr("send", vm) and
args.sock.get_attr("recv", vm)) so instance-attached methods and subclass
overrides are honored, and apply the same change for the other occurrence that
sets sock_send_method/sock_recv_method; preserve the existing error handling
semantics when get_attr returns an error.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@crates/stdlib/src/ssl.rs`:
- Around line 1894-1895: The code currently caches send/recv using
socket_class.get_attr which forces base-class lookup and ignores overrides on
the wrapped instance; change the lookup to retrieve the bound methods from the
actual wrapped object (args.sock.get_attr("send", vm) and
args.sock.get_attr("recv", vm)) so instance-attached methods and subclass
overrides are honored, and apply the same change for the other occurrence that
sets sock_send_method/sock_recv_method; preserve the existing error handling
semantics when get_attr returns an error.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 629fa543-a407-4c56-9f0a-d36885bf549f

📥 Commits

Reviewing files that changed from the base of the PR and between 6dccca7 and 53692bf.

📒 Files selected for processing (5)

crates/stdlib/src/ssl.rs
crates/stdlib/src/ssl/cert.rs
crates/stdlib/src/ssl/compat.rs
extra_tests/snippets/stdlib_ssl_short_recv.py
extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py

✅ Files skipped from review due to trivial changes (1)

crates/stdlib/src/ssl/cert.rs

Use method references cached during socket creation.

Previous algorithm didn't take into account that recv() may return less data than requested even for blocking sockets.

First of all, existing code does not really work and this leads to an infinite loop: RustPython#7891 Second, this should never happen when rustls used properly (wrt wants_read() and wants_write()) and thus all such errors are implementation bugs that must be properly fixed.

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/stdlib/src/ssl.rs (1)

1842-1867: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Reuse validate_hostname() in _wrap_socket.

This path no longer enforces the 253-character limit that _wrap_bio() and set_server_hostname() apply, so oversized hostnames now fail later with a different error surface instead of the expected early ValueError.

Suggested fix

                 Some(hostname_str) => {
                     let hostname = hostname_str.as_str();
-
-                    // Validate hostname
-                    if hostname.is_empty() {
-                        return Err(vm.new_value_error("server_hostname cannot be an empty string"));
-                    }
-
-                    // Check if it starts with a dot
-                    if hostname.starts_with('.') {
-                        return Err(vm.new_value_error("server_hostname cannot start with a dot"));
-                    }
-
-                    // IP addresses are allowed
-                    // SNI will not be sent for IP addresses
-
-                    // Check for NULL bytes
-                    if hostname.contains('\0') {
-                        return Err(vm.new_type_error("embedded null character"));
-                    }
-
+                    validate_hostname(hostname, vm)?;
                     Some(hostname.to_string())
                 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/stdlib/src/ssl.rs` around lines 1842 - 1867, The hostname validation
in _wrap_socket currently duplicates checks but omits the 253-character limit
enforced by validate_hostname(); update the _wrap_socket path (the code handling
args.server_hostname in ssl.rs) to call the existing
validate_hostname(hostname_str) utility instead of reimplementing checks so the
same length, dot, empty, and null-byte validations are applied and the correct
ValueError is raised consistently (replace the manual checks in the
Some(hostname_str) branch with a call to validate_hostname and use its result to
produce the Some(String) or appropriate VM error).

🧹 Nitpick comments (1)

crates/stdlib/src/ssl/compat.rs (1)
756-758: ⚡ Quick win

Import VERIFY_X509_PARTIAL_CHAIN instead of redefining 0x80000.

This file already shares verification flags through _ssl; keeping a second literal copy here makes the two paths drift-prone the next time those constants move.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/stdlib/src/ssl/compat.rs` around lines 756 - 758, Remove the local
const VERIFY_X509_PARTIAL_CHAIN = 0x80000 and instead reference the shared
constant from the existing _ssl module; update the imports/usages so the
conditional uses _ssl::VERIFY_X509_PARTIAL_CHAIN (or whichever symbol is
exported from the _ssl shim) in the expression options.verify_flags &
VERIFY_X509_PARTIAL_CHAIN != 0 and keep the rest of the PartialChainVerifier
wrapping logic unchanged so the code relies on the single shared constant
instead of a duplicated literal.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@extra_tests/snippets/stdlib_ssl_short_recv.py`:
- Around line 55-56: Replace the overly broad except BaseException clause with
except Exception so system-level exceptions propagate: change "except
BaseException as exc:" to "except Exception as exc:" (or "except Exception:" if
you prefer not to reuse exc) in the try/except that appends to server_errors,
leaving the server_errors.append(exc) logic intact.

In `@extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py`:
- Around line 183-196: When handling ssl.SSLWantReadError in the client loop
(locations using tls.read, drain_outgoing, conn.recv and incoming.write), check
the return value of conn.recv(...) and if it returns b"" call
incoming.write_eof() instead of incoming.write(b"") so EOF is propagated into
the MemoryBIO and TLS can progress; keep calling drain_outgoing(outgoing, conn)
as before. Also replace the server-side catch-all "except BaseException as exc:"
with "except Exception as exc:" to satisfy the BLE001 lint warning.
- Around line 221-222: Replace the overly broad except BaseException handler
with except Exception to avoid catching SystemExit/KeyboardInterrupt; locate the
try/except that appends to server_errors (the lines containing "except
BaseException as exc" and "server_errors.append(exc)") and change it to "except
Exception as exc" so only regular exceptions from the test thread are captured.

---

Outside diff comments:
In `@crates/stdlib/src/ssl.rs`:
- Around line 1842-1867: The hostname validation in _wrap_socket currently
duplicates checks but omits the 253-character limit enforced by
validate_hostname(); update the _wrap_socket path (the code handling
args.server_hostname in ssl.rs) to call the existing
validate_hostname(hostname_str) utility instead of reimplementing checks so the
same length, dot, empty, and null-byte validations are applied and the correct
ValueError is raised consistently (replace the manual checks in the
Some(hostname_str) branch with a call to validate_hostname and use its result to
produce the Some(String) or appropriate VM error).

---

Nitpick comments:
In `@crates/stdlib/src/ssl/compat.rs`:
- Around line 756-758: Remove the local const VERIFY_X509_PARTIAL_CHAIN =
0x80000 and instead reference the shared constant from the existing _ssl module;
update the imports/usages so the conditional uses
_ssl::VERIFY_X509_PARTIAL_CHAIN (or whichever symbol is exported from the _ssl
shim) in the expression options.verify_flags & VERIFY_X509_PARTIAL_CHAIN != 0
and keep the rest of the PartialChainVerifier wrapping logic unchanged so the
code relies on the single shared constant instead of a duplicated literal.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 66de22b5-fbbb-4a02-9b0b-4db2e1ac1cb6

📥 Commits

Reviewing files that changed from the base of the PR and between 53692bf and e97e483.

📒 Files selected for processing (8)

crates/host_env/src/socket.rs
crates/stdlib/src/socket.rs
crates/stdlib/src/ssl.rs
crates/stdlib/src/ssl/cert.rs
crates/stdlib/src/ssl/compat.rs
crates/stdlib/src/ssl/error.rs
extra_tests/snippets/stdlib_ssl_short_recv.py
extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py

💤 Files with no reviewable changes (2)

crates/stdlib/src/ssl/error.rs
crates/host_env/src/socket.rs

✅ Files skipped from review due to trivial changes (1)

crates/stdlib/src/ssl/cert.rs

1) Ensure that socket_wait() called from TLS glue code allows threads 2) Ensure that socket_wait() called from TLS glue code properly handles EINTR on *nix 3) Ensure that select() or poll() error conditions are checked 4) Use poll() on *nix so socket descriptor values are not limited

TLS handshake cannot be infinite. Any infinite loop here is a serious bug in implementation and should be fixed properly. This code triggers in some cases (very short reads) with misleading `ssl_error.SSLWantReadError: The operation did not complete (read)`.

* Deduplicate verify flags / record-size constants * Larger "max encrypted TLS record length"

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/stdlib/src/ssl.rs (1)

4414-4433: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't treat EOF/reset as a successful close_notify.

This helper now returns true for empty reads and arbitrary socket errors, and it drops read_tls()/process_new_packets() failures. shutdown() breaks on that true, so a ragged EOF or TCP reset can complete unwrap() successfully instead of surfacing the protocol error.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/stdlib/src/ssl.rs` around lines 4414 - 4433, The helper
try_read_close_notify_socket currently treats an empty read and arbitrary socket
errors as a successful close_notify; instead, when
sock_recv_at_most_one_tls_record returns Ok(data) and data.is_empty() you must
NOT return true — return false (or propagate an error) so shutdown() does not
consider the TLS shutdown complete; likewise, do not ignore errors from
conn.read_tls() and conn.process_new_packets(): check their Results and if they
Err, propagate that error to the VM (or return false) instead of swallowing it;
finally, for Err(e) from sock_recv_at_most_one_tls_record, only return false for
blocking IO errors (is_blocking_io_error), but for other socket errors
propagate/raise the error rather than returning true. Ensure you update the
logic in try_read_close_notify_socket to examine conn.read_tls and
conn.process_new_packets results and to treat EOF/reset and non-blocking socket
errors as errors (not successful close_notify).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/stdlib/src/ssl/compat.rs`:
- Line 1081: The BIO-mode read in compat.rs currently calls
socket.sock_recv(SSL3_RT_MAX_PACKET_SIZE, vm) and assumes the entire chunk is
consumed, which can return multiple TLS records from MemoryBIO; update the code
that handles socket.sock_recv (and the surrounding BIO-path) to apply the same
record-boundary logic used for socket reads: parse the returned buffer for a
single TLS record, consume only the first record and preserve any unread suffix
bytes back into the MemoryBIO (or buffer) for future reads instead of discarding
them; ensure the fix references the same handling around SSL3_RT_MAX_PACKET_SIZE
and the MemoryBIO consumer so BIO callers maintain the one-record-per-read
invariant.

---

Outside diff comments:
In `@crates/stdlib/src/ssl.rs`:
- Around line 4414-4433: The helper try_read_close_notify_socket currently
treats an empty read and arbitrary socket errors as a successful close_notify;
instead, when sock_recv_at_most_one_tls_record returns Ok(data) and
data.is_empty() you must NOT return true — return false (or propagate an error)
so shutdown() does not consider the TLS shutdown complete; likewise, do not
ignore errors from conn.read_tls() and conn.process_new_packets(): check their
Results and if they Err, propagate that error to the VM (or return false)
instead of swallowing it; finally, for Err(e) from
sock_recv_at_most_one_tls_record, only return false for blocking IO errors
(is_blocking_io_error), but for other socket errors propagate/raise the error
rather than returning true. Ensure you update the logic in
try_read_close_notify_socket to examine conn.read_tls and
conn.process_new_packets results and to treat EOF/reset and non-blocking socket
errors as errors (not successful close_notify).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 6b063cdf-afc5-4a66-9280-e74cb3ff1b50

📥 Commits

Reviewing files that changed from the base of the PR and between e97e483 and 6c0e292.

📒 Files selected for processing (8)

crates/host_env/src/socket.rs
crates/stdlib/src/socket.rs
crates/stdlib/src/ssl.rs
crates/stdlib/src/ssl/cert.rs
crates/stdlib/src/ssl/compat.rs
crates/stdlib/src/ssl/error.rs
extra_tests/snippets/stdlib_ssl_short_recv.py
extra_tests/snippets/stdlib_urllib_https_misaligned_recv.py

💤 Files with no reviewable changes (2)

crates/stdlib/src/ssl/error.rs
crates/host_env/src/socket.rs

✅ Files skipped from review due to trivial changes (1)

crates/stdlib/src/ssl/cert.rs

ShaharNaveh

TYSM!

thank you for looking into this:)

youknowone

Thank you so much!

coderabbitai Bot reviewed May 21, 2026

View reviewed changes

im-0 force-pushed the fix-rustls branch from 6dccca7 to 53692bf Compare May 21, 2026 12:14

coderabbitai Bot reviewed May 21, 2026

View reviewed changes

Do not call import socket on each send()/recv() when using rustls …

ad3060c

Use method references cached during socket creation.

Implement reading of at most one TLS record from socket …

7fa757d

Previous algorithm didn't take into account that recv() may return less data than requested even for blocking sockets.

Replace own TlsConnection with rustls::Connection

023c754

im-0 force-pushed the fix-rustls branch from 53692bf to e97e483 Compare May 21, 2026 13:21

coderabbitai Bot reviewed May 21, 2026

View reviewed changes

im-0 added 7 commits May 21, 2026 16:18

Remove dead code from rustls glue

e9e359c

Do not present rustls errors as OSError(0, "Success")

7a10b53

Add test for 1-byte max recv in TLS client

fbe1f80

Add regression test for RustPython#7891

6905963

Fix constants in rustls glue code …

6c0e292

* Deduplicate verify flags / record-size constants * Larger "max encrypted TLS record length"

im-0 force-pushed the fix-rustls branch from e97e483 to 6c0e292 Compare May 21, 2026 14:19

coderabbitai Bot reviewed May 21, 2026

View reviewed changes

ShaharNaveh approved these changes May 21, 2026

View reviewed changes

youknowone approved these changes May 22, 2026

View reviewed changes

im-0 deleted the fix-rustls branch May 22, 2026 11:58

coderabbitai Bot mentioned this pull request Jun 1, 2026

[DO NOT MERGE] Almost complete rustls integration rewrite #8007

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rustls integration improvements#7946