Fix: ASIO Stream handling of close_notify in handshake#3801
Merged
reneme merged 1 commit intorandombit:masterfrom Dec 6, 2023
Merged
Conversation
344ba4f to
6782373
Compare
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as outdated.
This comment was marked as outdated.
reneme
reviewed
Nov 13, 2023
14 tasks
reneme
added a commit
to Rohde-Schwarz/botan
that referenced
this pull request
Nov 14, 2023
A complete revamp of the state management in the ASIO stream wrapper to properly disentangle TLS state transitions even when receiving coalesced TLS records. Those can result in multiple state changes of Botan's underlying TLS implementation. For extensive technical details and a higher-level problem description, please see randombit#3801. Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com>
a9042a3 to
83184a5
Compare
Collaborator
|
@randombit Unfortunately, this escalated into an essential re-write of the ASIO stream wrapper's low level operations. The rationale and implementation strategy is outlined in detail in the pull request's description. @FAlbertDev and I think, that the operations are now quite a bit clearer as well. Also, since the patch adds plenty of inline documentation for anyone to understand the implementation more easily in the future. This finally fixes the nightly-build that is now broken for more than two weeks. >.< But we're now fairly confident that the ASIO stream is stable with this patch. After all, TLS Anvil throws a lot of corner cases on it. |
reneme
added a commit
to Rohde-Schwarz/botan
that referenced
this pull request
Nov 14, 2023
A complete revamp of the state management in the ASIO stream wrapper to properly disentangle TLS state transitions even when receiving coalesced TLS records. Those can result in multiple state changes of Botan's underlying TLS implementation. For extensive technical details and a higher-level problem description, please see randombit#3801. Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com>
83184a5 to
200e3f3
Compare
reneme
added a commit
to Rohde-Schwarz/botan
that referenced
this pull request
Nov 14, 2023
A complete revamp of the state management in the ASIO stream wrapper to properly disentangle TLS state transitions even when receiving coalesced TLS records. Those can result in multiple state changes of Botan's underlying TLS implementation. For extensive technical details and a higher-level problem description, please see randombit#3801. Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com>
200e3f3 to
4a5d432
Compare
reneme
approved these changes
Nov 14, 2023
FAlbertDev
commented
Nov 15, 2023
Collaborator
Author
FAlbertDev
left a comment
There was a problem hiding this comment.
Good job with the documentation!
A complete revamp of the state management in the ASIO stream wrapper to properly disentangle TLS state transitions even when receiving coalesced TLS records. Those can result in multiple state changes of Botan's underlying TLS implementation. For extensive technical details and a higher-level problem description, please see randombit#3801. Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com>
4a5d432 to
a8e28de
Compare
lieser
approved these changes
Dec 6, 2023
Thiesius
pushed a commit
to Roboauto/botan
that referenced
this pull request
May 20, 2024
Use furo Sphinx theme Add OpenSSL 1.1 to Botan 3.x migration guide FIX: handling of TLS protocol errors in the ASIO wrapper A complete revamp of the state management in the ASIO stream wrapper to properly disentangle TLS state transitions even when receiving coalesced TLS records. Those can result in multiple state changes of Botan's underlying TLS implementation. For extensive technical details and a higher-level problem description, please see randombit#3801. Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com> Add check to SM2_PrivateKey::check_key() that private key x is in ]0, q-1[ ./botan tls_client_hello can deal with TLS 1.3 Add Private_Key::remaining_operations() (randombit#3706) FIX: clang-tidy warnings Extend documentation for encryption and signing parameters note disabled modules in build/build_config.json Dynamically select a disabled module for the CMake test Before we just hard-coded the TPM module. Though, this causes issues when running the test in a build configuration where TPM is actually available. Like in the BSI build configurations. See also: sehlen-bsi/botan-docs#159 Mention CMake configuration in docs ./configure.py --without-cmake-config disables cmake integration Add sig algos and key agreements to bench.py Co-authored-by: René Meusel <github@renemeusel.de> Update Doxygen landing page Adding references to XOF (base class), PK_KEM_En/Decryptor, Kuznyechik, BLAKE2b (as a MAC), KMAC, KEMs (Kyber, RSA), Dilithium, SPHINCS+ and SHAKE (as a XOF) FIX: don't assume that 'tpm' is generally not available This is a follow-up of 971d7d5 Cleanup after merge master
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Pull Request Dependencies
This is a follow-up to #3795 to close #3778.
TL;DR
If multiple TLS records are coalesced into a single byte blob on the wire, Botan's TLS implementation may experience multiple state changes in a single call to
TLS::Channel::received_data(). The ASIO stream must disentangle those state changes for the downstream application to work correctly. We do this by buffering TLS state changes internally and reporting them to the ASIO application step-by-step.Review Strategy Suggestion
Reviewers should re-iterate the described implementation strategy for the synchronous implementations of
handshake()andread_some()first. The asynchronous implementations follow the same principle but are much more involved due to the callback-handle nature of theAsyncOperationidiom.TLS States (in the ASIO stream wrapper)
Below is a sketch of the states a TLS connection will pass through in the eyes of the TLS ASIO stream wrapper. State transitions are signaled to the downstream application by the behavior of the stream's (asynchronouns) operations. Some state changes must be triggered by the downstream application itself. See the table for an overview of the state transitions.
Link to Excalidraw.com: https://excalidraw.com/#json=_CLKTOpdlnUCnVu0I6_mr,oDT9HxzM9l5BAKTt8F1uqw
handshake()orasync_handshake().(async_)handshake()returns without an error code or exception. Downstream application is now expected to callboost::asio::(async_)write(),::(async_)read()orstream.(async_)shutdown()on the stream.stream.(async_)shutdown()or::(async_)read()returned "End of File" (after peer sent a CloseNotify). In the latter case the downstream application is expected to callstream.(async_)shutdown().stream.(async_)shutdown()and it returns without an error code or exception.(async_)handshake()returns with "End of File" (the peer has cancelled the handshake with a CloseNotify)Technical Details of the Problem
The ASIO stream wrapper interacts with the underlying TLS implementation via the commonly known APIs: most notably
TLS::Channel::received_data()for incoming data from the peer,TLS::Channel::send()for outgoing application data. TLS state changes, decrypted application data from the peer and encrypted data meant to be sent to the peer are communicated via overloads of theTLS::Callbacksclass. This happens when the downstream code callsTLS::Channel::received_data()orTLS::Channel::send()(among others).The core of the problem: A single flight of TLS records received from the peer might cause more than one TLS stream state change (as outlined above). In that case, a single call to
::received_data()(inside the ASIO stream) may make the underlying TLS implementation state "skip" intermediate steps on first glance. The ASIO wrapper needs to be prepared for this and manage those intemediate state changes.Below is an example, where the ASIO stream acts as a TLS server and a client finishes the handshake, sends some application data and closes the connection in a single flight of bytes (depicted by the yellow boxes). We assume that a single
::(async_)read_some()on the server's socket pulls the entire flight with three coalesced TLS records into the stream's read buffer. The TLS records are: (1) the client's Finished message, (2) some encrypted application data, and (3) a CloseNotify alert. This is perfectly legal behavior of the client.Link to Excalidraw.com: https://excalidraw.com/#json=oGeAoDoG3RULU52W4fkAi,4Kh9hGhDhO7YmKB8lO3vlw
Note how the ASIO stream wrapper has to disentangle the state changes caused by the single flight of the client's Finished, Application Data and CloseNotify into multiple ASIO function calls and return codes (depicted by the green activation boxes).
The underlying bugfix: Before this rework, the ASIO stream wrapper had processed said coalesced flight without leaving the "Handshake" state. Nevertheless, the
(async_)handshake()came back clean but the "End of File" error (that should have been triggered by the peer's CloseNotify) was dropped. A typical downstream application would then simply call(async_)read_some()on the stream to receive data. But this blocked forever -- as the peer had already declared their end of stream.Implementation Strategy of the State Disentanglement
The TLS implementation provides enough predicate APIs to check for any relevant intermediate states (e.g.
is_handshake_finished()). Additionally, the depicted state machine is a directed acyclic graph; therefore, aTLS::Channelthat returnstrueonis_handshake_finished()will always returntruefor it. Similarly, once aCloseNotify(or any other TLS alert) was seen, there's no going back to a previous state.Given those invariants, the (asynchronous) operations in the ASIO stream wrapper detect intermediate states simply by checking the relevant predicate methods in the correct order. That way, each operation reacts on the intermediate state changes relevant for them, only.
For instance,
(async_)handshake()returns successfully as soon asis_handshake_finished()istrue; even if the implementation has already read and decrypted some application data whose TLS record was coalesced with the peer's Finished message. A consecutive call to(async_)read_some()will find said application data in the output buffer and will push that to the downstream application first; even if the implementation has already received the peer's CloseNotify. New data is read from the peer only once neither pending decrypted application data is available nor a protocol error was detected.The same is true for protocol errors: Stream operations deliberately prioritize their "ultimate goal" over checking for protocol errors. I.e. a handshake operation will return successfully once
is_handshake_complete()is true. Even though some coalesced record following the Finished messages might already have resulted in some fatal protocol error. Under the "directed acyclic graph" assumption, the fatal error must have occurred AFTER the successful handshake and is therefore safe to ignore in the handshake operation. Assuming that the application will invoke additional stream operations after the successful handshake, any "retained" fatal protocol alert will surface in due time.(Maybe) Future Work
We still see two issues with the above-described approach:
Once we receive a CloseNotify from the peer, the TLS implementation will, in turn, place a CloseNotify in the output buffer. Hence, a downstream application won't be able to send any more application data through this stream. This is the specified behavior of TLS 1.2, but TLS 1.3 would allow this. We may be able to fix that limitation in a future pull request.
This feature allows sending and receiving "early data" when resuming a session, even before the handshake is finished. This complicates the above-described state machine. Also, the current API of the stream wrapper doesn't account for this feature. Supporting "early data" in the stream wrapper might proof difficult, therefore.