fix: refactor negotiate loop to fix issue with async callback#5641
Merged
maddeleine merged 12 commits intoaws:mainfrom Dec 8, 2025
Merged
fix: refactor negotiate loop to fix issue with async callback#5641maddeleine merged 12 commits intoaws:mainfrom
maddeleine merged 12 commits intoaws:mainfrom
Conversation
0308566 to
5510ab6
Compare
maddeleine
commented
Nov 26, 2025
| if (s2n_connection_is_quic_enabled(conn)) { | ||
| record_type = TLS_HANDSHAKE; | ||
| uint8_t message_type = 0; | ||
| POSIX_GUARD_RESULT(s2n_quic_read_handshake_message(conn, &message_type)); |
Contributor
Author
There was a problem hiding this comment.
Note, something I discovered while doing this refactor is that the message_type that gets parsed in this function is never used? It's essentially overwritten later in s2n_read_full_handshake_message. So that's weird...
fac4d6f to
0ccf070
Compare
CarolYeh910
approved these changes
Dec 5, 2025
jmayclin
approved these changes
Dec 5, 2025
kaukabrizvi
pushed a commit
to kaukabrizvi/s2n-tls
that referenced
this pull request
Dec 8, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Goal
Fix an issue in our async callback code that breaks when multiple handshake messages are sent in the same record. The root cause of this is essentially that we've never needed to exit the negotiate loop with handshake data still waiting to be read. Now that we have new async callbacks that trigger in between reading handshake messages, we are exiting the loop in between handshake messages and the unread handshake messages are being wiped.
Why
Our async callback code that triggers on reading messages will cause the handshake to fail if multiple handshake messages messages are sent in the same record. There are three issues that you run into:
s2n_connection_apply_error_blindingis supposed to do nothing if we fail with an async blocked error. Instead it wipes the conn->in stuffer. This wipes the remaining TLS messages in the stuffer if there are more, leading to a handshake failure.handle_retry_statefunction also wipes the conn->in stuffer after the message handler succeeds. This means that again, the remaining TLS messages in the stuffer will be wiped, and the handshake will fail.How
Now the handle_retry_state function will process the remaining messages in the conn->in stuffer before going back to the normal negotiate loop.
Callouts
I think its probably not good that our code has duplicate handshake read/write logic for our async handling. Presumably there should be a way to get the async code to follow the normal negotiate read/write path. That would be a bigger refactor, and it's hard to justify that big of a code change to fix this bug. My fix does move us closer to more code sharing between these two paths though.
Testing
Includes a unit test to show that the situation which failed now succeeds. I removed the #[ignore] tag on the integ tests from #5638 which proves that my change fixes the issue.
Related
release summary: The TLS handshake now succeeds when the async cert callback is configured and peers sent multiple TLS handshake messages per record.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.