gh-142352: Fix start_tls() to transfer buffered data from StreamReader#142354
Conversation
picnixz
left a comment
There was a problem hiding this comment.
Has this PR been generated using an LLM? if so, indicate which places were so and read https://devguide.python.org/getting-started/generative-ai/.
| self.assertEqual(msg2, b"hello world 2!\n") | ||
|
|
||
| def _run_test_start_tls_behind_proxy(self, send_combined): | ||
| """Test start_tls() when TLS ClientHello arrives with PROXY header. |
There was a problem hiding this comment.
Remove docstrings from tests. It would be shown on failure. Use regular comments and shorten it. We don't want to restate the entire issue so simply link the issue's number as well.
There was a problem hiding this comment.
Understood, will update the tests accordingly.
| test_message = b"hello world\n" | ||
| expected_response = reverse_message(test_message) | ||
|
|
||
| class TCPProxyServer: |
There was a problem hiding this comment.
Do we really need this class? don't we already have some helpers elsewhere? The test also looks overly complex. Why not just patching existing TCPServer? (if possible; I don't know if there isn't some class that could be used as is with some minimal modifications)
|
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
|
Also, I think reading and checking if we're compliant with https://datatracker.ietf.org/doc/html/rfc2817 should be done (even if the RFC is only proposed and not in the standard tracks) or whatever RFC should be considered (I do think it's correct to fix this but I don't know if we should only send the possible hello for tls or generally write whatever was buffered) |
Noted in PR. |
Lib/asyncio/base_events.py
Outdated
| ssl_protocol._incoming.write(buffer) | ||
| buffer.clear() |
There was a problem hiding this comment.
I wonder if this could better belong @
cpython/Lib/asyncio/sslproto.py
Line 322 in b2ed3c3
Technically, there could be something like
- self._incoming = ssl.MemoryBIO()
+ self._incoming = ssl.MemoryBIO()
+ try:
+ underlying_raw_stream_buffer = app_protocol._stream_reader._buffer
+ except AttributeError:
+ pass
+ else:
+ self._incoming.write(underlying_raw_stream_buffer)
+ underlying_raw_stream_buffer.clear()But I don't know how other places that initialize sslproto.SSLProtocol() would be affected. So it's just an idea to check out and see if other instances should have the same buffer transfer logic (looks like it's used in two other modules).
RFC 2817 isn’t relevant here; it covers HTTP/1.1 Upgrade rather than |
b2ed3c3 to
86d80bf
Compare
|
I have made the requested changes; please review again |
|
Thanks for making the requested changes! @picnixz: please review the changes made to this pull request. |
86d80bf to
c2cd20b
Compare
I agree that explicitly passing the bytes to be forwarded is the cleanest model, since only the application knows its buffer semantics. A parameter like |
The new regression test (test_start_tls_buffered_data) shows the trigger is broader than “ClientHello in the same TCP segment as some application flag.” The issue is any buffered TLS data that reaches StreamReader before start_tls() is called. That can happen even without proxy‑protocol or segment coalescing. |
|
Please resolve the Github comments once they are addressed, otherwise it makes reviewing difficult. |
|
Thanks @kasimov-maxim for the PR, and @kumaraditya303 for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14. |
…a from StreamReader (pythonGH-142354) (cherry picked from commit 0598f4a8999b96409e0a2bf9c480afc76a876860) Co-authored-by: Maksym Kasimov <39828623+kasimov-maxim@users.noreply.github.com> Co-authored-by: Kumar Aditya <kumaraditya@python.org>
|
Sorry, @kasimov-maxim and @kumaraditya303, I could not cleanly backport this to |
|
GH-145363 is a backport of this pull request to the 3.14 branch. |
|
GH-145364 is a backport of this pull request to the 3.13 branch. |
…red data from StreamReader (pythonGH-142354) (cherry picked from commit 0598f4a) Co-authored-by: Maksym Kasimov <39828623+kasimov-maxim@users.noreply.github.com> Co-authored-by: Kumar Aditya <kumaraditya@python.org>
…a from StreamReader (pythonGH-142354) (cherry picked from commit 0598f4a) Co-authored-by: Maksym Kasimov <39828623+kasimov-maxim@users.noreply.github.com> Co-authored-by: Kumar Aditya <kumaraditya@python.org>
…a from StreamReader (python#142354) Co-authored-by: Kumar Aditya <kumaraditya@python.org>
…ta from StreamReader (GH-142354) (#145363) gh-142352: Fix `asyncio` `start_tls()` to transfer buffered data from StreamReader (GH-142354) (cherry picked from commit 0598f4a) Co-authored-by: Kumar Aditya <kumaraditya@python.org> Co-authored-by: Maksym Kasimov <39828623+kasimov-maxim@users.noreply.github.com>
Fixes #142352
Summary
When using
StreamWriter.start_tls()to upgrade a connection to TLS mid-stream, any data already buffered in theStreamReaderwas lost. This commonly occurs when implementing servers that support the PROXY protocol.Changes
_transfer_buffered_data_to_ssl()method toBaseEventLoopthat transfers buffered data fromStreamReader._buffertoSSLProtocol._incomingbefore the TLS handshake beginsGenerative AI usage
Some parts of this PR were assisted by a generative AI tool. Specifically:
All functional logic and final edits were reviewed, verified, and authored by
me, in accordance with the Python devguide guidelines.
Test plan
test_streams.pymake patchcheckpasses