Skip to content

Fix: ASIO Stream handling of close_notify in handshake#3801

Merged
reneme merged 1 commit intorandombit:masterfrom
Rohde-Schwarz:fix/tls_http_close_notify_in_handshake
Dec 6, 2023
Merged

Fix: ASIO Stream handling of close_notify in handshake#3801
reneme merged 1 commit intorandombit:masterfrom
Rohde-Schwarz:fix/tls_http_close_notify_in_handshake

Conversation

@FAlbertDev
Copy link
Copy Markdown
Collaborator

@FAlbertDev FAlbertDev commented Nov 6, 2023

Pull Request Dependencies

This is a follow-up to #3795 to close #3778.

TL;DR

If multiple TLS records are coalesced into a single byte blob on the wire, Botan's TLS implementation may experience multiple state changes in a single call to TLS::Channel::received_data(). The ASIO stream must disentangle those state changes for the downstream application to work correctly. We do this by buffering TLS state changes internally and reporting them to the ASIO application step-by-step.

Review Strategy Suggestion

Reviewers should re-iterate the described implementation strategy for the synchronous implementations of handshake() and read_some() first. The asynchronous implementations follow the same principle but are much more involved due to the callback-handle nature of the AsyncOperation idiom.

TLS States (in the ASIO stream wrapper)

Below is a sketch of the states a TLS connection will pass through in the eyes of the TLS ASIO stream wrapper. State transitions are signaled to the downstream application by the behavior of the stream's (asynchronouns) operations. Some state changes must be triggered by the downstream application itself. See the table for an overview of the state transitions.

image
Link to Excalidraw.com: https://excalidraw.com/#json=_CLKTOpdlnUCnVu0I6_mr,oDT9HxzM9l5BAKTt8F1uqw

State transition Stream behavior observable by downstream application
Init > Handshake Downstream app calls handshake() or async_handshake().
Handshake > Application Data (async_)handshake() returns without an error code or exception. Downstream application is now expected to call boost::asio::(async_)write(), ::(async_)read() or stream.(async_)shutdown() on the stream.
Application Data > Closing Downstream application called stream.(async_)shutdown() or ::(async_)read() returned "End of File" (after peer sent a CloseNotify). In the latter case the downstream application is expected to call stream.(async_)shutdown().
Closing > Closed Downstream application had called stream.(async_)shutdown() and it returns without an error code or exception.
Handshake > Closing (async_)handshake() returns with "End of File" (the peer has cancelled the handshake with a CloseNotify)
Any > Error Any ASIO function returns with an error code or exception other than "End of File".

Technical Details of the Problem

The ASIO stream wrapper interacts with the underlying TLS implementation via the commonly known APIs: most notably TLS::Channel::received_data() for incoming data from the peer, TLS::Channel::send() for outgoing application data. TLS state changes, decrypted application data from the peer and encrypted data meant to be sent to the peer are communicated via overloads of the TLS::Callbacks class. This happens when the downstream code calls TLS::Channel::received_data() or TLS::Channel::send() (among others).

The core of the problem: A single flight of TLS records received from the peer might cause more than one TLS stream state change (as outlined above). In that case, a single call to ::received_data() (inside the ASIO stream) may make the underlying TLS implementation state "skip" intermediate steps on first glance. The ASIO wrapper needs to be prepared for this and manage those intemediate state changes.

Below is an example, where the ASIO stream acts as a TLS server and a client finishes the handshake, sends some application data and closes the connection in a single flight of bytes (depicted by the yellow boxes). We assume that a single ::(async_)read_some() on the server's socket pulls the entire flight with three coalesced TLS records into the stream's read buffer. The TLS records are: (1) the client's Finished message, (2) some encrypted application data, and (3) a CloseNotify alert. This is perfectly legal behavior of the client.

image
Link to Excalidraw.com: https://excalidraw.com/#json=oGeAoDoG3RULU52W4fkAi,4Kh9hGhDhO7YmKB8lO3vlw

Note how the ASIO stream wrapper has to disentangle the state changes caused by the single flight of the client's Finished, Application Data and CloseNotify into multiple ASIO function calls and return codes (depicted by the green activation boxes).

The underlying bugfix: Before this rework, the ASIO stream wrapper had processed said coalesced flight without leaving the "Handshake" state. Nevertheless, the (async_)handshake() came back clean but the "End of File" error (that should have been triggered by the peer's CloseNotify) was dropped. A typical downstream application would then simply call (async_)read_some() on the stream to receive data. But this blocked forever -- as the peer had already declared their end of stream.

Implementation Strategy of the State Disentanglement

The TLS implementation provides enough predicate APIs to check for any relevant intermediate states (e.g. is_handshake_finished()). Additionally, the depicted state machine is a directed acyclic graph; therefore, a TLS::Channel that returns true on is_handshake_finished() will always return true for it. Similarly, once a CloseNotify (or any other TLS alert) was seen, there's no going back to a previous state.

Given those invariants, the (asynchronous) operations in the ASIO stream wrapper detect intermediate states simply by checking the relevant predicate methods in the correct order. That way, each operation reacts on the intermediate state changes relevant for them, only.

For instance, (async_)handshake() returns successfully as soon as is_handshake_finished() is true; even if the implementation has already read and decrypted some application data whose TLS record was coalesced with the peer's Finished message. A consecutive call to (async_)read_some() will find said application data in the output buffer and will push that to the downstream application first; even if the implementation has already received the peer's CloseNotify. New data is read from the peer only once neither pending decrypted application data is available nor a protocol error was detected.

The same is true for protocol errors: Stream operations deliberately prioritize their "ultimate goal" over checking for protocol errors. I.e. a handshake operation will return successfully once is_handshake_complete() is true. Even though some coalesced record following the Finished messages might already have resulted in some fatal protocol error. Under the "directed acyclic graph" assumption, the fatal error must have occurred AFTER the successful handshake and is therefore safe to ignore in the handshake operation. Assuming that the application will invoke additional stream operations after the successful handshake, any "retained" fatal protocol alert will surface in due time.

(Maybe) Future Work

We still see two issues with the above-described approach:

  1. Currently, the TLS stream does not support one-sided closure (as explicitly allowed in TLS 1.3)
    Once we receive a CloseNotify from the peer, the TLS implementation will, in turn, place a CloseNotify in the output buffer. Hence, a downstream application won't be able to send any more application data through this stream. This is the specified behavior of TLS 1.2, but TLS 1.3 would allow this. We may be able to fix that limitation in a future pull request.
  2. TLS 1.3's "Early Data" will complicate the above-described state machine
    This feature allows sending and receiving "early data" when resuming a session, even before the handshake is finished. This complicates the above-described state machine. Also, the current API of the stream wrapper doesn't account for this feature. Supporting "early data" in the stream wrapper might proof difficult, therefore.

@reneme reneme force-pushed the fix/tls_http_close_notify_in_handshake branch from 344ba4f to 6782373 Compare November 6, 2023 12:02
@reneme reneme marked this pull request as draft November 6, 2023 12:07
@reneme

This comment was marked as resolved.

@coveralls
Copy link
Copy Markdown

coveralls commented Nov 6, 2023

Coverage Status

coverage: 92.081% (+0.01%) from 92.069%
when pulling a8e28de on Rohde-Schwarz:fix/tls_http_close_notify_in_handshake
into 201cfe5 on randombit:master.

@reneme

This comment was marked as outdated.

@reneme reneme mentioned this pull request Nov 14, 2023
14 tasks
reneme added a commit to Rohde-Schwarz/botan that referenced this pull request Nov 14, 2023
A complete revamp of the state management in the ASIO stream wrapper to
properly disentangle TLS state transitions even when receiving coalesced
TLS records. Those can result in multiple state changes of Botan's
underlying TLS implementation.

For extensive technical details and a higher-level problem description,
please see randombit#3801.

Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com>
@reneme reneme force-pushed the fix/tls_http_close_notify_in_handshake branch from a9042a3 to 83184a5 Compare November 14, 2023 15:11
@reneme reneme marked this pull request as ready for review November 14, 2023 15:31
@reneme reneme self-requested a review November 14, 2023 15:32
@reneme
Copy link
Copy Markdown
Collaborator

reneme commented Nov 14, 2023

@randombit Unfortunately, this escalated into an essential re-write of the ASIO stream wrapper's low level operations. The rationale and implementation strategy is outlined in detail in the pull request's description.

@FAlbertDev and I think, that the operations are now quite a bit clearer as well. Also, since the patch adds plenty of inline documentation for anyone to understand the implementation more easily in the future.

This finally fixes the nightly-build that is now broken for more than two weeks. >.< But we're now fairly confident that the ASIO stream is stable with this patch. After all, TLS Anvil throws a lot of corner cases on it.

reneme added a commit to Rohde-Schwarz/botan that referenced this pull request Nov 14, 2023
A complete revamp of the state management in the ASIO stream wrapper to
properly disentangle TLS state transitions even when receiving coalesced
TLS records. Those can result in multiple state changes of Botan's
underlying TLS implementation.

For extensive technical details and a higher-level problem description,
please see randombit#3801.

Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com>
@reneme reneme force-pushed the fix/tls_http_close_notify_in_handshake branch from 83184a5 to 200e3f3 Compare November 14, 2023 15:50
reneme added a commit to Rohde-Schwarz/botan that referenced this pull request Nov 14, 2023
A complete revamp of the state management in the ASIO stream wrapper to
properly disentangle TLS state transitions even when receiving coalesced
TLS records. Those can result in multiple state changes of Botan's
underlying TLS implementation.

For extensive technical details and a higher-level problem description,
please see randombit#3801.

Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com>
@reneme reneme force-pushed the fix/tls_http_close_notify_in_handshake branch from 200e3f3 to 4a5d432 Compare November 14, 2023 16:00
@reneme reneme requested review from lieser and randombit November 14, 2023 16:03
@reneme reneme added the bug label Nov 14, 2023
@reneme reneme added this to the Botan 3.3.0 milestone Nov 14, 2023
Copy link
Copy Markdown
Collaborator Author

@FAlbertDev FAlbertDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job with the documentation!

A complete revamp of the state management in the ASIO stream wrapper to
properly disentangle TLS state transitions even when receiving coalesced
TLS records. Those can result in multiple state changes of Botan's
underlying TLS implementation.

For extensive technical details and a higher-level problem description,
please see randombit#3801.

Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com>
@reneme reneme force-pushed the fix/tls_http_close_notify_in_handshake branch from 4a5d432 to a8e28de Compare November 15, 2023 08:51
@reneme reneme merged commit 420ff86 into randombit:master Dec 6, 2023
@reneme reneme deleted the fix/tls_http_close_notify_in_handshake branch December 6, 2023 12:04
@reneme reneme mentioned this pull request Dec 7, 2023
Thiesius pushed a commit to Roboauto/botan that referenced this pull request May 20, 2024
Use furo Sphinx theme

Add OpenSSL 1.1 to Botan 3.x migration guide

FIX: handling of TLS protocol errors in the ASIO wrapper

A complete revamp of the state management in the ASIO stream wrapper to
properly disentangle TLS state transitions even when receiving coalesced
TLS records. Those can result in multiple state changes of Botan's
underlying TLS implementation.

For extensive technical details and a higher-level problem description,
please see randombit#3801.

Co-Authored-By: René Meusel <rene.meusel@rohde-schwarz.com>

Add check to SM2_PrivateKey::check_key() that private key x is in ]0, q-1[

./botan tls_client_hello can deal with TLS 1.3

Add Private_Key::remaining_operations() (randombit#3706)

FIX: clang-tidy warnings

Extend documentation for encryption and signing parameters

note disabled modules in build/build_config.json

Dynamically select a disabled module for the CMake test

Before we just hard-coded the TPM module. Though, this causes
issues when running the test in a build configuration where
TPM is actually available. Like in the BSI build configurations.

See also: sehlen-bsi/botan-docs#159

Mention CMake configuration in docs

./configure.py --without-cmake-config disables cmake integration

Add sig algos and key agreements to bench.py

Co-authored-by: René Meusel <github@renemeusel.de>

Update Doxygen landing page

Adding references to XOF (base class), PK_KEM_En/Decryptor, Kuznyechik,
BLAKE2b (as a MAC), KMAC, KEMs (Kyber, RSA), Dilithium, SPHINCS+ and
SHAKE (as a XOF)

FIX: don't assume that 'tpm' is generally not available

This is a follow-up of 971d7d5

Cleanup after merge master
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Nightly TLS-Anvil CI failure with new HTTP server

4 participants