
[WIP] SslStream Kernel TLS offloading #125110

Draft
rzikm wants to merge 16 commits into dotnet:main from rzikm:ktls-poc

Conversation

@rzikm
Member

@rzikm rzikm commented Mar 3, 2026

Contributes to #66224.

This experiment special-cases SslStream when used with NetworkStream to make OpenSSL talk directly to the underlying socket. This allows us to enable kTLS on supported platforms, which offloads encryption/decryption of outgoing/incoming data to the in-kernel TLS implementation and, if supported by the NIC, even to specialized networking hardware.

The functionality in the PoC is gated by the DOTNET_SYSTEM_NET_SECURITY_KTLS=1 environment variable.

rzikm and others added 14 commits February 25, 2026 10:48
Add kernel TLS (kTLS) offload support as a PoC, togglable via
DOTNET_SYSTEM_NET_SECURITY_KTLS=1 environment variable.

When SslStream wraps a NetworkStream on Linux, this enables OpenSSL's
kTLS integration by using socket BIOs (SSL_set_fd) instead of memory
BIOs. After the TLS handshake, the kernel handles encryption/decryption
of application data, potentially with hardware offload.

Changes:
- Native PAL: Add SSL_set_fd, kTLS query, and blocking SSL I/O functions
  with internal poll loop for non-blocking socket support
- OpenSSL shim: Add SSL_set_fd, SSL_get_fd, SSL_get_wbio, SSL_get_rbio
  as lightup functions
- Managed interop: P/Invoke declarations, SafeSslHandle.CreateForKtls
- Interop.OpenSsl: AllocateSslHandleForKtls, DoSslHandshakeKtls,
  KtlsRead, KtlsWrite
- SslStream.IO: Detection of NetworkStream + env var, kTLS handshake
  path, kTLS read/write paths using Task.Run with blocking native calls

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace blocking Task.Run + poll() approach with proper async I/O:

- Handshake: non-blocking SSL_do_handshake loop with zero-byte reads
  on InnerStream (via TIOAdapter) for WANT_READ
- Read: non-blocking SSL_read with zero-byte reads for WANT_READ
- Write: non-blocking SSL_write with Task.Yield() for WANT_WRITE

This avoids blocking threadpool threads and integrates with .NET's
epoll-based async socket infrastructure.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
With socket BIOs on non-blocking sockets, OpenSSL may return
SSL_ERROR_SYSCALL with errno=EAGAIN instead of SSL_ERROR_WANT_READ
or SSL_ERROR_WANT_WRITE. This happens because ERR_clear_error() is
called before SSL operations, leaving the error queue empty, so
SSL_get_error() falls through to SSL_ERROR_SYSCALL.

Handle this in the managed kTLS code via IsKtlsWantRead() helper
that checks Marshal.GetLastPInvokeError() for EAGAIN, avoiding
changes to the shared native PAL functions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
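The fallback described in the commit above can be sketched as a small C helper (a hypothetical analog of the managed IsKtlsWantRead; the constants mirror OpenSSL's SSL_get_error result codes, redefined here so the sketch stands alone):

```c
#include <errno.h>

/* Values match OpenSSL's SSL_get_error() result codes. */
#define SSL_ERROR_WANT_READ 2
#define SSL_ERROR_SYSCALL   5

/* Hypothetical analog of the managed IsKtlsWantRead() helper: with a
 * socket BIO on a non-blocking fd, SSL_get_error() can fall through to
 * SSL_ERROR_SYSCALL when the error queue is empty, so an errno of
 * EAGAIN/EWOULDBLOCK captured immediately after the SSL call must also
 * count as "retry when the socket is readable". */
static int is_ktls_want_read(int ssl_error, int saved_errno)
{
    if (ssl_error == SSL_ERROR_WANT_READ)
        return 1;
    if (ssl_error == SSL_ERROR_SYSCALL &&
        (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK))
        return 1;
    return 0; /* e.g. SYSCALL with errno 0 is genuine EOF, not a retry */
}
```

Note that SSL_ERROR_SYSCALL with errno 0 deliberately returns 0, matching the EOF handling the commit describes.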
Debug Console.WriteLine statements between SSL_read/SSL_write P/Invoke
calls and the subsequent errno check were corrupting the saved errno.
Console.WriteLine internally calls Interop.Sys.Write (SetLastError=true),
which overwrites Marshal.GetLastPInvokeError() before IsKtlsWantRead
could read it. This caused genuine EOF (ret=0 + SSL_ERROR_SYSCALL +
errno=0) to be misinterpreted as EAGAIN/WANT_READ, leading to an
infinite wait on a zero-byte read that would never complete.

Fix:
- Remove all debug Console.WriteLine statements from kTLS paths
- Capture errno immediately after SSL P/Invoke calls via out parameters
  in TryKtlsRead/TryKtlsWrite, before any other P/Invoke can run
- Pass captured errno to IsKtlsWantRead instead of reading it later
- Handle ret=0 + SSL_ERROR_SYSCALL + errno=SUCCESS as EOF explicitly

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove unused blocking wrappers (DoSslHandshakeKtls, KtlsRead,
KtlsWrite) from Interop.OpenSsl.cs and their P/Invoke declarations
(SslDoHandshakeBlocking, SslReadBlocking, SslWriteBlocking) from
Interop.Ssl.cs. These were from the initial blocking I/O approach and
are no longer called since switching to the async pattern.

Also remove: empty static constructor, unused 'ns' local variable.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three performance improvements for the kTLS PoC:

1. Fix busy-spin on WANT_READ: Zero-byte reads on Socket complete
   immediately (SocketAsyncContext fast-paths them without a syscall).
   Replace with 1-byte MSG_PEEK recv which properly waits via epoll
   for data availability without consuming it from the kernel buffer.
   Applies to both handshake and read paths.

2. Fix busy-spin on WANT_WRITE: Replace Task.Yield() (immediate
   reschedule = CPU burn) with Task.Delay(1) to give the socket send
   buffer time to drain. WANT_WRITE is rare in practice.

3. Enable SSL_CTX caching and TLS session resume: The kTLS path now
   reuses cached SSL_CTX handles and sets TLS sessions for repeat
   connections, matching the behavior of the normal SslStream path.
   This avoids full TLS handshakes on subsequent connections to the
   same host.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
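The MSG_PEEK readiness trick from point 1 can be demonstrated with plain POSIX sockets (a standalone sketch, not the PR's managed code): peeking waits until data is available but does not consume it, so the SSL_read that follows still sees the full record.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Standalone sketch of the 1-byte MSG_PEEK readiness probe: the peek
 * waits for data to arrive but leaves it in the kernel buffer, so the
 * real read that follows still observes the full payload. */
int demo_peek_does_not_consume(void)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    send(fds[0], "hello", 5, 0);

    char peek;
    ssize_t n = recv(fds[1], &peek, 1, MSG_PEEK); /* readiness wait */
    assert(n == 1 && peek == 'h');

    char buf[8];
    n = recv(fds[1], buf, sizeof(buf), 0); /* peek consumed nothing */

    close(fds[0]);
    close(fds[1]);
    return (int)n; /* all 5 bytes still present */
}
```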
The two functions were nearly identical — the only difference was
SafeSslHandle.Create vs CreateForKtls (memory BIOs vs socket BIO).
Add an optional socketFd parameter (default -1) to AllocateSslHandle
and branch at the Create call. This eliminates ~120 lines of
duplicated SSL configuration code.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
With blocking sockets (the default for new Socket instances),
SSL_do_handshake blocks the thread pool thread during recv/send calls.
In loopback test scenarios where both client and server handshakes
run as tasks on the thread pool, this causes thread pool starvation
and hangs.

Set socket.Blocking = false before the handshake. This is necessary
because SSL_do_handshake bypasses SocketAsyncContext (calling recv/send
directly on the fd), so we can't rely on SocketAsyncContext's lazy
non-blocking initialization. With non-blocking sockets, the handshake
loop properly returns WANT_READ/WANT_WRITE and uses peek-based async
readiness waiting.

All 4957 tests pass with DOTNET_SYSTEM_NET_SECURITY_KTLS=1.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Only set socket.Blocking = false when using the async handshake path
(AsyncReadWriteAdapter). For the sync path, blocking is expected by
the caller. The JIT constant-folds the typeof comparison.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two improvements to the kTLS read path based on benchmark trace analysis:

1. Buffer filling: After a successful SSL_read, continue reading to fill the
   caller's buffer instead of returning immediately. This avoids extra
   peek+SSL_read syscall round trips when multiple TLS records are already
   buffered in the kernel. For a 64KB response (~4 TLS records), this reduces
   from 8 syscalls to ~6 and eliminates 3 async iterations.

2. Connection close handling: Catch SocketException from the MSG_PEEK recv
   used for readiness notification. When the peer closes a kTLS connection,
   recv() may return ECONNRESET instead of clean EOF. The exception is caught
   and the loop continues to SSL_read which determines the actual TLS-level
   status (SSL_ERROR_ZERO_RETURN for clean closure, or a real error).
   This eliminates ~276K exceptions/15s seen in h11-get-close benchmarks.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove PollForSsl, SslDoHandshakeBlocking, SslReadBlocking, and
SslWriteBlocking from pal_ssl.c. These were part of the initial kTLS
implementation but are no longer used since async I/O was implemented
in managed code using MSG_PEEK readiness notification.

Also remove the unused errno.h and poll.h includes, and fix
entrypoints.c ordering to keep kTLS-related entries sorted
alphabetically.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ring

- When kTLS TX is active, bypass SSL_write and use Socket.SendAsync
  directly — the kernel encrypts transparently on send()
- Add 32KB read-ahead buffer to amortize SSL_read (recvmsg) calls,
  reducing epoll_wait syscalls per request
- Track kTLS TX state via _ktlsTx field set after handshake
- Remove unused _ktlsRx field (socket-direct reads don't work due
  to NewSessionTicket records requiring OpenSSL's record layer)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the custom _ktlsReadBuffer with the existing _buffer (SslBuffer)
infrastructure. SSL_read returns already-decrypted plaintext, so we call
Commit + OnDecrypted(0, ret, ret) to treat it as decrypted data in
_buffer, reusing the same CopyDecryptedData consume path.

When the caller's buffer is >= ReadBufferSize, read directly into it
to avoid a copy (zero-copy fast path for large reads).

This eliminates 4 fields (_ktlsReadBuffer, offset, count, buffer size
constant) and significantly improves large-response scenarios:
- H1.1 64KB: -9.2% -> +1.6% (neutral)
- H2 64KB: -34.2% -> -26.0%

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The direct socket read bypass for kTLS RX doesn't work with TLS 1.3
because OpenSSL must process post-handshake messages (NewSessionTicket)
that arrive as non-application-data records on the socket. Bypassing
SSL_read causes these records to be mixed into the data stream.

Instead, keep SSL_read for all kTLS paths (TX and RX). The kernel
decryption benefit of kTLS RX still applies transparently when OpenSSL
calls recv/recvmsg on the kTLS socket.

Added a buffer-fill loop after successful SSL_read: since each SSL_read
returns at most one TLS record (~16KB), we immediately retry to fill
more of the read-ahead buffer or user buffer without re-waiting. This
amortizes the MSG_PEEK wait cost for large responses.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 3, 2026 11:44
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @dotnet/ncl, @bartonjs, @vcsjones
See info in area-owners.md if you want to be subscribed.

@rzikm
Member Author

rzikm commented Mar 3, 2026

copilot-generated report:

Kernel TLS (kTLS) Offloading — Performance Report

Executive Summary

This report presents the performance characteristics of kernel TLS (kTLS) offloading
for .NET's SslStream, benchmarked using the ASP.NET HttpClient benchmark suite on
dedicated performance lab machines. kTLS offloads TLS encryption/decryption from
userspace OpenSSL to the Linux kernel's tls module, eliminating data copies between
kernel and userspace for the cryptographic operations.

Three modes are compared:

  • Standard — default SslStream using OpenSSL with memory BIOs (baseline)
  • DirectOpenSSL — OpenSSL reads/writes directly to the socket (socket BIO) without kTLS offload,
    isolating the overhead of SslStream's memory-BIO abstraction layer
  • kTLS — OpenSSL with kernel TLS offload enabled (socket BIO + kernel crypto)

Key findings:

  • H1.1 small-payload keep-alive (0B): kTLS improves throughput +6–8% on 12- and 28-core machines
    by eliminating kernel↔userspace data copies. DirectOpenSSL is 3–5% slower than Standard,
    confirming that memory-BIO overhead is minimal and kTLS's gain comes from kernel crypto offload.
  • H1.1 large-payload (64KB): at parity on 12-core machines, but −12% on 28-core where
    the extra syscall overhead saturates client CPU at 97%.
  • HTTP/2 large-payload (64KB): −32% to −51% regression due to a Linux kernel limitation
    (one decrypted TLS record per recvmsg() call).
  • H1.1 POST (64KB) and H2 POST (64KB): at parity (within ±2%).
  • Connection-close: −10% to −24% regression from per-connection kTLS setup cost.

Test Configuration

| Parameter | Value |
| --- | --- |
| Runtime | .NET 11.0.0-preview.3 |
| OpenSSL | 3.5.5 |
| kTLS toggle | DOTNET_SYSTEM_NET_SECURITY_KTLS=1 |
| DirectOpenSSL toggle | DOTNET_SYSTEM_NET_SECURITY_DIRECT_OPENSSL=1 |
| kTLS/DirectOpenSSL scope | Client-side only |
| Benchmark | aspnet/benchmarks HttpClient via crank |
| Concurrency | 100 concurrent requests per HttpClient |
| Iterations | 3 per configuration (results averaged) |
| Duration | 15s warmup + 15s measurement per iteration |

Machines

| Label | Profile | Cores | NIC driver |
| --- | --- | --- | --- |
| perf-lin | aspnet-perf-lin | 12 | ixgbe (10Gbps) |
| citrine-lin | aspnet-citrine-lin | 28 | i40e (40Gbps) |
| gold-lin | aspnet-gold-lin | 56 | mlx5_core ConnectX-6 Dx (40Gbps) |

Each profile uses separate client and server machines on a dedicated network.
None of the machines have hardware TLS offload enabled — all kTLS operations
use the kernel's software AES-GCM implementation.


Throughput — perf-lin (12 cores)

| Scenario | Mode | RPS | Δ% | Client CPU | p50 (ms) | p99 (ms) |
| --- | --- | --- | --- | --- | --- | --- |
| H1.1 GET keep-alive, 0B | standard | 272,244 | | 58% | 0.332 | 0.992 |
| H1.1 GET keep-alive, 0B | directopenssl | 262,531 | 🔴 -3.6% | 61% | 0.344 | 0.969 |
| H1.1 GET keep-alive, 0B | ktls | 290,027 | 🟢 +6.5% | 58% | 0.311 | 0.918 |
| H1.1 GET keep-alive, 64KB | standard | 17,782 | | 50% | 5.583 | 12.294 |
| H1.1 GET keep-alive, 64KB | directopenssl | 17,782 | +0.0% | 53% | 5.353 | 13.334 |
| H1.1 GET keep-alive, 64KB | ktls | 17,781 | -0.0% | 65% | 5.446 | 10.752 |
| H1.1 GET conn-close, 0B | standard | 6,992 | | 96% | 13.673 | 28.468 |
| H1.1 GET conn-close, 0B | directopenssl | 6,237 | 🔴 -10.8% | 84% | 15.421 | 32.252 |
| H1.1 GET conn-close, 0B | ktls | 5,300 | 🔴 -24.2% | 89% | 17.989 | 42.166 |
| H2 GET, 0B | standard | 200,142 | | 62% | 0.475 | 1.092 |
| H2 GET, 0B | directopenssl | 203,255 | +1.6% | 62% | 0.464 | 1.120 |
| H2 GET, 0B | ktls | 203,801 | +1.8% | 65% | 0.463 | 1.128 |
| H2 GET, 64KB | standard | 12,478 | | 68% | 7.983 | 10.889 |
| H2 GET, 64KB | directopenssl | 10,164 | 🔴 -18.5% | 64% | 9.763 | 11.511 |
| H2 GET, 64KB | ktls | 6,088 | 🔴 -51.2% | 67% | 7.316 | 9.155 |
| H1.1 POST keep-alive, 64KB | standard | 17,895 | | 26% | 4.240 | 21.329 |
| H1.1 POST keep-alive, 64KB | directopenssl | 17,857 | -0.2% | 25% | 4.313 | 18.903 |
| H1.1 POST keep-alive, 64KB | ktls | 17,862 | -0.2% | 29% | 4.281 | 19.649 |
| H2 POST, 64KB | standard | 13,454 | | 45% | 7.190 | 10.030 |
| H2 POST, 64KB | directopenssl | 13,597 | +1.1% | 45% | 7.083 | 10.081 |
| H2 POST, 64KB | ktls | 13,302 | -1.1% | 52% | 7.217 | 10.043 |

Throughput — citrine-lin (28 cores)

| Scenario | Mode | RPS | Δ% | Client CPU | p50 (ms) | p99 (ms) |
| --- | --- | --- | --- | --- | --- | --- |
| H1.1 GET keep-alive, 0B | standard | 404,681 | | 65% | 0.210 | 1.016 |
| H1.1 GET keep-alive, 0B | directopenssl | 383,407 | 🔴 -5.3% | 68% | 0.221 | 1.026 |
| H1.1 GET keep-alive, 0B | ktls | 436,487 | 🟢 +7.9% | 66% | 0.203 | 0.973 |
| H1.1 GET keep-alive, 64KB | standard | 68,277 | | 78% | 1.325 | 3.691 |
| H1.1 GET keep-alive, 64KB | directopenssl | 68,074 | -0.3% | 86% | 1.323 | 3.650 |
| H1.1 GET keep-alive, 64KB | ktls | 60,249 | 🔴 -11.8% | 97% | 0.760 | 15.834 |
| H1.1 GET conn-close, 0B | standard | 8,763 | | 80% | 10.983 | 19.606 |
| H1.1 GET conn-close, 0B | directopenssl | 8,417 | 🔴 -4.0% | 76% | 11.484 | 20.529 |
| H1.1 GET conn-close, 0B | ktls | 7,868 | 🔴 -10.2% | 94% | 12.073 | 24.246 |
| H2 GET, 0B | standard | 130,820 | | 62% | 0.748 | 1.620 |
| H2 GET, 0B | directopenssl | 125,697 | 🔴 -3.9% | 50% | 0.764 | 1.746 |
| H2 GET, 0B | ktls | 130,798 | -0.0% | 69% | 0.747 | 1.633 |
| H2 GET, 64KB | standard | 7,346 | | 36% | 13.806 | 18.682 |
| H2 GET, 64KB | directopenssl | 6,358 | 🔴 -13.5% | 28% | 15.828 | 20.264 |
| H2 GET, 64KB | ktls | 4,973 | 🔴 -32.3% | 45% | 20.169 | 24.005 |
| H1.1 POST keep-alive, 64KB | standard | 68,185 | | 49% | 1.396 | 3.304 |
| H1.1 POST keep-alive, 64KB | directopenssl | 67,143 | -1.5% | 47% | 1.419 | 3.423 |
| H1.1 POST keep-alive, 64KB | ktls | 67,374 | -1.2% | 49% | 1.414 | 3.396 |
| H2 POST, 64KB | standard | 7,823 | | 26% | 12.567 | 17.970 |
| H2 POST, 64KB | directopenssl | 7,882 | +0.7% | 25% | 12.489 | 17.536 |
| H2 POST, 64KB | ktls | 7,722 | -1.3% | 25% | 12.725 | 17.964 |

Throughput — gold-lin (56 cores)

| Scenario | Mode | RPS | Δ% |
| --- | --- | --- | --- |
| H1.1 GET keep-alive, 0B | standard | 547,489 | |
| H1.1 GET keep-alive, 0B | directopenssl | 505,820 | 🔴 -7.6% |
| H1.1 GET keep-alive, 0B | ktls | 471,695 | 🔴 -13.8% |
| H1.1 GET keep-alive, 64KB | standard | 67,225 | |
| H1.1 GET keep-alive, 64KB | directopenssl | 66,988 | -0.4% |
| H1.1 GET keep-alive, 64KB | ktls | 67,413 | +0.3% |
| H1.1 GET conn-close, 0B | standard | 1,247 ⚠️ | ⚠️ |
| H1.1 GET conn-close, 0B | directopenssl | 1,407 ⚠️ | 🟢 +12.8% ⚠️ |
| H1.1 GET conn-close, 0B | ktls | 17,118 | 🟢 +1272.7% |
| H2 GET, 0B | standard | 66,278 | |
| H2 GET, 0B | directopenssl | 47,545 | 🔴 -28.3% |
| H2 GET, 0B | ktls | 50,682 | 🔴 -23.5% |
| H2 GET, 64KB | standard | 6,952 | |
| H2 GET, 64KB | directopenssl | 5,939 | 🔴 -14.6% |
| H2 GET, 64KB | ktls | 5,777 | 🔴 -16.9% |
| H1.1 POST keep-alive, 64KB | standard | 68,450 | |
| H1.1 POST keep-alive, 64KB | directopenssl | 67,232 | -1.8% |
| H1.1 POST keep-alive, 64KB | ktls | 67,921 | -0.8% |
| H2 POST, 64KB | standard | 7,217 | |
| H2 POST, 64KB | directopenssl | 7,095 | -1.7% |
| H2 POST, 64KB | ktls | 6,217 | 🔴 -13.9% |

⚠️ marks results affected by high exception counts (connection teardown issues
specific to the gold-lin Docker container environment). Latency and CPU data
not available from gold-lin terminal output.


Analysis

Where kTLS helps: H1.1 small-payload keep-alive

kTLS shows a consistent +6–8% throughput improvement for H1.1 GET keep-alive with
0B response bodies on perf-lin and citrine-lin. The gain comes from eliminating the
kernel→userspace→kernel data copy path: with kTLS, the kernel encrypts/decrypts data
in-place during send()/recv(), avoiding the round-trip through OpenSSL's userspace
buffers.

DirectOpenSSL (socket BIO without kTLS) is 3–5% slower than Standard, which
demonstrates that SslStream's memory-BIO abstraction adds negligible overhead. The
kTLS improvement comes entirely from the kernel crypto offload, not from bypassing
the BIO layer.

Where kTLS regresses: H2 GET 64KB (−32% to −51%)

HTTP/2 with large response bodies has a fundamental regression caused by a Linux kernel
limitation: tls_sw_recvmsg() delivers exactly one decrypted TLS record per syscall
and sets the MSG_EOR flag. OpenSSL's ktls_read_record() enforces this constraint.

A 64KB HTTP/2 response is split across ~4 TLS records (each up to 16KB). Without kTLS,
a single recv() can read all 4 encrypted records at once, and OpenSSL decrypts them
in a batch. With kTLS, each record requires a separate recvmsg() syscall, plus the
HTTP/2 framing layer must process each record's data separately.

This is compounded by HTTP/2's multiplexing: flow control windows and stream processing
create additional overhead per read. This is a fundamental Linux kernel limitation
with no userspace workaround.
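The one-record-per-call behavior is visible in the kTLS uapi: each recvmsg() on a kTLS RX socket attaches a SOL_TLS/TLS_GET_RECORD_TYPE control message describing the single record it returned. A sketch of parsing that ancillary data follows (constants are taken from linux/tls.h, with fallback defines so the sketch is self-contained; it is demonstrated against an in-memory buffer, since a live kTLS socket requires the kernel tls module):

```c
#include <string.h>
#include <sys/socket.h>

/* Fallbacks for systems without <linux/tls.h>; values match the uapi. */
#ifndef SOL_TLS
#define SOL_TLS 282
#endif
#ifndef TLS_GET_RECORD_TYPE
#define TLS_GET_RECORD_TYPE 2
#endif
#define TLS_RECORD_TYPE_DATA 23 /* TLS "application_data" content type */

/* Shape of the per-record ancillary data a kTLS RX socket attaches to
 * each recvmsg(): one cmsg carrying the TLS record type of the single
 * record returned. Returns the record type, or -1 if absent. */
static int tls_record_type(struct msghdr *msg)
{
    for (struct cmsghdr *c = CMSG_FIRSTHDR(msg); c != NULL;
         c = CMSG_NXTHDR(msg, c))
    {
        if (c->cmsg_level == SOL_TLS && c->cmsg_type == TLS_GET_RECORD_TYPE)
            return *(unsigned char *)CMSG_DATA(c);
    }
    return -1;
}
```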

Where kTLS regresses: H1.1 GET 64KB on high-core machines (−12%)

On citrine-lin (28 cores), the client CPU reaches 97% with kTLS (vs 78% Standard),
indicating CPU saturation. The root cause is ~3.6× more total I/O syscalls: each TLS
record requires a separate recvmsg(), plus 1-byte MSG_PEEK reads are used for
socket readiness detection on non-blocking kTLS sockets.

On perf-lin (12 cores), this scenario shows no regression because the CPU is not
saturated (client CPU 50–65%). The regression is proportional to available CPU headroom.

Where kTLS regresses: connection-close (−10% to −24%)

Each kTLS connection requires additional setsockopt calls to install TLS TX and RX
crypto parameters into the kernel. With Connection: close, every request creates a
new connection, so this per-connection setup cost cannot be amortized. The overhead
results in ~2.3× more syscalls per request compared to Standard.
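The per-connection setup the analysis refers to is, on Linux, a TCP_ULP plus SOL_TLS setsockopt sequence. A minimal sketch of the TX half (it requires the Linux kTLS uapi header; the key material fields are placeholders, since in the real flow OpenSSL derives them from the completed handshake):

```c
#include <linux/tls.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/* Fallbacks for older libc headers; values match the Linux uapi. */
#ifndef TCP_ULP
#define TCP_ULP 31
#endif
#ifndef SOL_TLS
#define SOL_TLS 282
#endif

/* Sketch of the per-connection setup cost discussed above: attach the
 * "tls" upper-layer protocol, then install TX crypto parameters.
 * key/iv/salt/rec_seq are left zeroed as placeholders here. */
static int enable_ktls_tx(int fd)
{
    if (setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls")) != 0)
        return -1; /* not connected, or kernel tls module unavailable */

    struct tls12_crypto_info_aes_gcm_128 ci;
    memset(&ci, 0, sizeof(ci));
    ci.info.version = TLS_1_2_VERSION;
    ci.info.cipher_type = TLS_CIPHER_AES_GCM_128;
    /* ci.key / ci.iv / ci.salt / ci.rec_seq come from the TLS session */

    return setsockopt(fd, SOL_TLS, TLS_TX, &ci, sizeof(ci)) == 0 ? 0 : -1;
}
```

An equivalent TLS_RX call installs the receive direction; that pair of setsockopt calls is what cannot be amortized under Connection: close.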

POST scenarios: at parity

H1.1 POST 64KB and H2 POST 64KB show no meaningful difference (within ±2%) across
all modes. For upload-heavy workloads, kTLS TX offload handles encryption efficiently
and the per-record kernel limitation doesn't apply to the write path.


Gold-lin (56 cores) — Notes

The gold-lin machine (aspnet-gold-lin) runs benchmark jobs inside Docker containers.
Connection-close scenarios on this machine exhibit high exception counts for Standard
and DirectOpenSSL modes (tens of thousands of exceptions per run), while kTLS is
unaffected. This appears to be a container/network environment issue rather than a
code defect. The connection-close results for gold-lin should be disregarded.

The gold-lin machine has a Mellanox ConnectX-6 Dx NIC (mlx5_core driver) which
supports hardware TLS offload, but this feature is disabled in firmware
(tls-hw-tx-offload: off [fixed]). All kTLS operations use software crypto.


Known Limitations and Future Optimizations

| Limitation | Impact | Possible Mitigation |
| --- | --- | --- |
| Kernel delivers 1 TLS record per recvmsg() | H2 GET 64KB: −32–51% | None (kernel limitation); skip kTLS RX for H2 |
| Extra syscalls saturate CPU at high concurrency | H1.1 GET 64KB: −12% on 28c | Replace MSG_PEEK with epoll_wait |
| Per-connection setsockopt for kTLS setup | conn-close: −10–24% | Skip kTLS for short-lived connections |
| No hardware TLS offload on test machines | Software kTLS only | Enable firmware TLS on ConnectX-6 Dx (gold-lin) |

Contributor

Copilot AI left a comment


Pull request overview

This PR updates the Linux/OpenSSL kTLS proof-of-concept in SslStream to avoid bypassing SSL_read on RX (to correctly handle TLS 1.3 post-handshake records like NewSessionTicket), while still attempting to benefit from kernel kTLS offload when OpenSSL reads from a kTLS-enabled socket.

Changes:

  • Add new native OpenSSL interop exports for socket-BIO + kTLS enablement/status checks.
  • Add a Linux/OpenSSL kTLS handshake/read/write path in SslStream (env-var gated), including a post-SSL_read loop to fill buffers without re-waiting.
  • Add managed P/Invokes and SafeSslHandle support for creating an OpenSSL SSL* bound directly to a socket fd.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/native/libs/System.Security.Cryptography.Native/pal_ssl.h Declares new native exports for setting fd, enabling kTLS, and querying kTLS send/recv.
src/native/libs/System.Security.Cryptography.Native/pal_ssl.c Implements the new native exports using OpenSSL APIs.
src/native/libs/System.Security.Cryptography.Native/opensslshim.h Adds lightup entries for additional SSL symbols needed by the new exports.
src/native/libs/System.Security.Cryptography.Native/entrypoints.c Registers the new exports for managed interop.
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.Ssl.cs Adds P/Invokes and SafeSslHandle support for kTLS socket-BIO creation.
src/libraries/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs Extends SSL handle allocation for optional socket-fd based creation; exposes GetSslError internally.
src/libraries/System.Net.Security/src/System/Net/Security/SslStream.IO.cs Adds the Linux/OpenSSL kTLS handshake and I/O paths, including buffer-fill logic on reads.
src/libraries/System.Net.Security/src/System.Net.Security.csproj Makes System.Console a non-conditional dependency (currently used for debug logging).
Comments suppressed due to low confidence (3)

src/libraries/System.Net.Security/src/System/Net/Security/SslStream.IO.cs:1279

  • Using Task.Delay(1) as a stand-in for socket writability can turn sustained backpressure into a tight wake/sleep loop (high CPU, poor scalability). Instead of polling delays, consider using a proper readiness mechanism (e.g., an async wait on writability if available, or a backoff strategy / Socket.Poll-based wait on a dedicated thread) so this doesn’t busy-wait under load.
                        if (error == Interop.Ssl.SslErrorCode.SSL_ERROR_WANT_WRITE ||
                            IsKtlsWantRead(error, errno)) // SYSCALL+EAGAIN can also mean socket buffer full
                        {
                            // Wait briefly for socket buffer to drain. Socket doesn't expose
                            // an async writability wait, but WANT_WRITE is rare and short-lived.
                            await Task.Delay(1, cancellationToken).ConfigureAwait(false);
                            continue;

src/native/libs/System.Security.Cryptography.Native/pal_ssl.c:1322

  • CryptoNative_SslSetFd doesn’t validate inputs or follow the existing pattern of asserting non-null parameters used by neighboring exports (e.g., CryptoNative_SslStapleOcsp asserts ssl != NULL). Please add consistent asserts (at least assert(ssl != NULL)) for easier debugging and to match established conventions in this file.
int32_t CryptoNative_SslSetFd(SSL* ssl, int32_t fd)
{
    ERR_clear_error();
#ifdef FEATURE_DISTRO_AGNOSTIC_SSL
    if (SSL_set_fd_ptr == NULL)
    {
        return 0;
    }
#endif
    return SSL_set_fd(ssl, fd);
}

src/libraries/System.Net.Security/src/System.Net.Security.csproj:472

  • System.Net.Security.csproj now references System.Console unconditionally. This increases the baseline dependency surface for the library and appears to be solely to support the new Console.WriteLine debug logging. Please remove the Console.WriteLine usage and revert this reference to its prior conditional form (or otherwise avoid taking an unconditional System.Console dependency).
    <ProjectReference Include="$(LibrariesProjectRoot)System.Collections\src\System.Collections.csproj" />
    <ProjectReference Include="$(LibrariesProjectRoot)System.Collections.Concurrent\src\System.Collections.Concurrent.csproj" />
    <ProjectReference Include="$(LibrariesProjectRoot)System.Collections.NonGeneric\src\System.Collections.NonGeneric.csproj" />
    <ProjectReference Include="$(LibrariesProjectRoot)System.Console\src\System.Console.csproj" />

Comment on lines +1013 to +1017
while (true)
{
Memory<byte> readTarget = useReadAhead ? _buffer.AvailableMemory : buffer;
int ret = TryKtlsRead(sslHandle, readTarget, out Interop.Ssl.SslErrorCode error, out Interop.Error errno);


Copilot AI Mar 3, 2026


The kTLS read path calls SSL_read without taking the existing _handshakeLock used by EncryptData/DecryptData to serialize OpenSSL operations. This can allow concurrent SSL_read/SSL_write on the same SSL* from different threads, which isn’t consistent with the rest of SslStream’s synchronization model and can lead to racey failures/corruption. Please wrap SSL_read/SSL_write (and any related OpenSSL state access) in the same lock used elsewhere, or otherwise ensure thread-safety is preserved.

while (totalWritten < buffer.Length)
{
ReadOnlyMemory<byte> remaining = buffer.Slice(totalWritten);
int ret = TryKtlsWrite(sslHandle, remaining, out Interop.Ssl.SslErrorCode error, out Interop.Error errno);

Copilot AI Mar 3, 2026


The kTLS write path calls SSL_write in a loop without taking the existing _handshakeLock used for OpenSSL operations elsewhere. This can run concurrently with SSL_read on another thread and violates the established synchronization pattern in SslStream. Please add appropriate locking or another mechanism to ensure the OpenSSL SSL* is not accessed concurrently.

Suggested change
int ret = TryKtlsWrite(sslHandle, remaining, out Interop.Ssl.SslErrorCode error, out Interop.Error errno);
Interop.Ssl.SslErrorCode error;
Interop.Error errno;
int ret;
lock (_handshakeLock)
{
ret = TryKtlsWrite(sslHandle, remaining, out error, out errno);
}

Comment on lines +1335 to +1348
int32_t CryptoNative_SslGetKtlsSend(SSL* ssl)
{
#ifndef OPENSSL_NO_KTLS
return BIO_get_ktls_send(SSL_get_wbio(ssl));
#else
(void)ssl;
return 0;
#endif
}

int32_t CryptoNative_SslGetKtlsRecv(SSL* ssl)
{
#ifndef OPENSSL_NO_KTLS
return BIO_get_ktls_recv(SSL_get_rbio(ssl));

Copilot AI Mar 3, 2026


These kTLS helpers call BIO_get_ktls_send/recv and SSL_get_wbio/rbio without any distro-agnostic/lightup safety. In FEATURE_DISTRO_AGNOSTIC_SSL scenarios the runtime may load an OpenSSL lib that lacks these symbols (or where SSL_get_wbio/rbio weren’t found), which can lead to missing-symbol load failures or null-function-pointer calls. Please add proper LIGHTUP_FUNCTION entries for BIO_get_ktls_send/recv (and null checks for SSL_get_wbio_ptr/SSL_get_rbio_ptr and the BIO_get_ktls_* ptrs), returning 0 when unavailable.

{
_ktlsTx = Interop.Ssl.SslGetKtlsSend(sslHandle) != 0;
_ktlsRx = Interop.Ssl.SslGetKtlsRecv(sslHandle) != 0;
System.Console.WriteLine($"kTLS handshake completed: kTLS recv: {(_ktlsRx ? 1 : 0)}, kTLS send: {(_ktlsTx ? 1 : 0)}");

Copilot AI Mar 3, 2026


This handshake path logs kTLS status with System.Console.WriteLine. Console output from a networking library is not appropriate and can’t be controlled by consumers. Please remove this and use existing tracing/telemetry (e.g., NetEventSource/Activity/Telemetry) if you need observability.

Suggested change
System.Console.WriteLine($"kTLS handshake completed: kTLS recv: {(_ktlsRx ? 1 : 0)}, kTLS send: {(_ktlsTx ? 1 : 0)}");
if (NetEventSource.Log.IsEnabled())
{
NetEventSource.Info(this, $"kTLS handshake completed: kTLS recv: {(_ktlsRx ? 1 : 0)}, kTLS send: {(_ktlsTx ? 1 : 0)}");
}

rzikm and others added 2 commits March 4, 2026 08:57
kTLS: use Socket.ReceiveAsync for kTLS RX instead of recvmsg

Replace the custom recvmsg+cmsg P/Invoke with plain Socket.ReceiveAsync
for the kTLS RX read path. On a kTLS RX socket, recv() transparently
returns decrypted application data from the kernel. When a non-application
TLS record (NewSessionTicket, KeyUpdate) is at the queue head, recv()
returns EIO — we catch this and use SSL_read to consume the control
record, then retry.

The fill loop uses synchronous socket.Receive() on the non-blocking
socket to gather additional TLS records without waiting, breaking on
WouldBlock.

This removes the need for the native CryptoNative_KtlsRecvMsg function
and all associated recvmsg/cmsg infrastructure.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kTLS: replace Task.Delay(1) with Socket.Poll for WANT_WRITE

Use Socket.Poll(SelectMode.SelectWrite) instead of Task.Delay(1) to
wait for socket writability when SSL_do_handshake or SSL_write returns
SSL_ERROR_WANT_WRITE. Poll uses the poll() syscall which returns as
soon as the socket is writable, avoiding the minimum 1ms scheduling
delay of Task.Delay.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kTLS: add Debug.Fail for unexpected WANT_WRITE paths

WANT_WRITE was never observed across 400K+ requests on both local
and ASP.NET perf lab benchmarks (loopback and real network). Add
Debug.Fail to both handshake and write WANT_WRITE handlers so we
notice immediately if this assumption is ever violated.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kTLS: remove fill loop from KtlsSocketReadAsync

The fill loop added a synchronous recv() call after each async ReceiveAsync.
For small responses this always returned EAGAIN, causing a SocketException
throw/catch per request (visible as 0.13% EH.DispatchEx in CPU profiles).

Unlike SSL_read which returns one TLS record per call (due to recvmsg with
cmsg), plain recv() on a kTLS socket already returns all available decrypted
data in a single call. The fill loop provided no benefit while doubling the
recv syscall count.

CPU profile improvement (H1.1 keepalive 0B):
- recv exclusive: 15.97% → 1.09%
- Active CPU: 18.46% → 2.43% (vs 6.30% for non-kTLS baseline)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kTLS: use non-throwing SAEA-based ReceiveAsync for kTLS RX reads

Replace try/catch SocketException pattern with a cached
KtlsReceiveEventArgs (SAEA + IValueTaskSource<int>) that returns -1
on error instead of throwing. This eliminates the exception
throw/catch overhead on every NewSessionTicket/KeyUpdate record
(typically 1-2 per TLS 1.3 connection).

The KtlsReceiveEventArgs instance is lazily allocated, cached on
SslStream._ktlsRecvArgs, and disposed in CloseInternal.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kTLS: add DirectOpenSSL experiment mode

Add DOTNET_SYSTEM_NET_SECURITY_DIRECT_OPENSSL=1 mode that uses
SSL_set_fd (direct socket BIO) for OpenSSL I/O without enabling
kTLS offload. This allows measuring the overhead of SslStream's
memory-BIO abstraction layer vs direct OpenSSL socket I/O.

In DirectOpenSSL mode:
- Handshake: same as kTLS (SSL_do_handshake on socket fd)
- Reads: SSL_read with MSG_PEEK readability wait (no kTLS RX)
- Writes: SSL_write directly on socket fd (no kTLS TX)
- No kernel crypto offload, no Socket.SendAsync/ReceiveAsync bypass

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kTLS: replace custom SAEA with Socket.ReceiveAsync + try/catch

Remove KtlsReceiveEventArgs (custom SAEA implementing IValueTaskSource<int>)
and use Socket.ReceiveAsync(Memory<byte>) with a filtered catch for the rare
EIO exception from kTLS non-application TLS records (NewSessionTicket).

The custom SAEA caused thread pool lock contention (+0.44pp CPU at 100
connections) due to the public SocketAsyncEventArgs completion path using
different scheduling than the internal AwaitableSocketAsyncEventArgs.

The catch is filtered to SocketError.SocketError (unmapped errno, i.e. EIO)
so genuine socket errors like ConnectionReset propagate normally.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
For kTLS and DirectOpenSSL modes, SSL_shutdown attempted I/O directly
on the non-blocking socket via socket BIO, causing EAGAIN to surface
as SSL_ERROR_SYSCALL and be treated as a fatal error.

Fix: keep quiet shutdown enabled for socket BIO handles so SSL_shutdown
sets internal flags without attempting I/O. Skip GenerateToken in
CreateShutdownToken since there is no output memory BIO to read from.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
