transport: Pool read buffers used by the HTTP/2 framer#9032
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #9032 +/- ##
==========================================
+ Coverage 80.55% 82.25% +1.69%
==========================================
Files 413 413
Lines 33541 42322 +8781
==========================================
+ Hits 27020 34810 +7790
- Misses 4305 6105 +1800
+ Partials 2216 1407 -809
🚀 New features to boost your workflow:
|
24267f8 to
b32d8ba
Compare
|
I'm splitting this PR into smaller PRs. I'll re-open this when the child PRs are merged. |
### Background In #9032, we will transition from standard `net.Conn.Read` methods to `syscall` UNIX APIs to enable non-memory-pinning reads. Due to this change, the Go race detector beings failing on tests that share state between client and server goroutines without standard synchronization primitives (mutexes, channels, etc.). Because these tests rely on the network request itself as a memory barrier, they are logically safe but technically racy from the Go runtime's perspective. The standard `net.Conn` leverages Go's internal network poller, which inadvertently provides the "happens-before" edges the race detector looks for. Dropping down to raw syscalls bypasses this, causing the detector to flag the accesses. (Minimal repro: https://go.dev/play/p/yvEtBmLTOJ2) ### Solution Introduce explicit synchronization to the affected tests. Tests now properly coordinate shared state access between clients and servers without relying on socket I/O timing. RELEASE NOTES: N/A
…ll connection handling (#9035) This PR eliminates per-call heap allocations in `nonBlockingReader.ReadOnReady()`. Previously, calling [`RawConn.Read`](https://pkg.go.dev/syscall#RawConn.Read) with an inline closure caused captured variables (and the closure itself) to escape to the heap. To resolve this, we moved the closure's required state and return values into fields on the `nonBlockingReader` struct. The state is set before execution, and the results are read afterward. Because the closure now only relies on the receiver's fields, we instantiate it exactly once during `nonBlockingReader` construction and reuse it, completely avoiding allocations on the hot path. Additionally, this change updates the types for which non-blocking reads are performed to the following: * Unwrapped types created by `net.Dial`. This avoid reading encrypted data from [credentials.syscallConns](https://github.com/grpc/grpc-go/blob/74b3acd1a801570e1cefb28cf61620a4ef7c8ee2/internal/credentials/syscallconn.go#L37-L42). * Types that already implement the `ReadyReader` interface themselves to support encrypted connections. These changes are a prerequisite for #9032. ## Benchmarks #9032 uses `nonBlockingReader` instead of standard `net.Conn.Read` and confirms zero increase in heap allocations for streaming RPCs. RELEASE NOTES: N/A --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
This PR introduces a buffered `io.Reader` that automatically releases its read buffer when empty. To optimize memory usage, the reader defers buffer reallocation until data is available in the underlying `readyreader.Reader` (achieved by calling `ReadOnReady`). In a follow-up PR (#9032), the HTTP/2 framer will be updated to utilize this new buffered reader whenever the underlying reader implements the `readyreader.Reader` interface. The implementation and associated tests are based on the [standard library's](https://cs.opensource.google/go/go/+/refs/tags/go1.26.2:src/bufio/bufio.go;l=35). RELEASE NOTES: N/A --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
c16297c to
8b8d538
Compare
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements read buffer pooling for the HTTP framer and ALTS record protocol to reduce memory usage, particularly when subchannels are idle. It introduces a new environment variable GRPC_GO_EXPERIMENTAL_HTTP_FRAMER_READ_BUFFER_POOLING to control this feature. Key changes include updating the ALTS conn to support ReadOnReady, switching to syscall in readyreader to avoid race detector issues, and refactoring buffer pool management in http_util. Feedback was provided regarding the ALTS record protocol to ensure the returned pooled buffer's slice header is correctly updated to match the decrypted data length.
| if !envconfig.EnableHTTPFramerReadBufferPooling { | ||
| return bufio.NewReaderSize(r, bufSize) | ||
| } | ||
| if rr := readyreader.NewNonBlocking(r); rr != nil { | ||
| readPool := getIOBufferPool(bufSize) | ||
| return readyreader.NewBuffered(rr, bufSize, readPool) | ||
| } | ||
| return bufio.NewReaderSize(r, bufSize) |
There was a problem hiding this comment.
Nit: Can we simplify this as:
- If env var is enabled **and** `r` supports non-blocking reads, create a `readyreader.NewBuffered` and return
- Fall though and create a regular bufio.Reader using `bufio.NewReaderSize`
There was a problem hiding this comment.
To check if r supports non-blocking reads, we have to call NewNonBlocking. When using feature flags, we should generally avoid invoking the protected code at all to prevent unexpected side effects (e.g., panics). Since Go evaluates if initialization statements before the condition, we cannot safely combine the assignment and the flag check on a single line. We would have to nest the conditionals to ensure the flag is evaluated first:
if envconfig.EnableHTTPFramerReadBufferPooling {
if rr := readyreader.NewNonBlocking(r); rr != nil {
readPool := getIOBufferPool(bufSize)
return readyreader.NewBuffered(rr, bufSize, readPool)
}
}To avoid this nesting and keep the code flat, I opted to use an early return pattern instead. I'm fine the nesting style also.
There was a problem hiding this comment.
Nesting is what I meant, but I just wrote it as a single conditional in the pseudo code.
The only reason I ask for the nesting is because currently we have two code paths that do the same return bufio.NewReaderSize(r, bufSize). With the nesting, there will just be one of them. But it's not a big deal. Will leave it to you.
There was a problem hiding this comment.
Changed to nested style.
Original PRs: #9055, #9032 RELEASE NOTES: * transport: Pool HTTP/2 framer read buffers to reduce idle memory consumption. Currently limited to Linux for ALTS and non-encrypted transports (TCP, Unix). To disable, set `GRPC_GO_EXPERIMENTAL_HTTP_FRAMER_READ_BUFFER_POOLING=false` and report any issues. --------- Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Problem
The HTTP/2 framer in gRPC uses a
bufio.Readerwith a 32KB buffer by default. When there are a large number of transports, these buffers consume significant memory, even when the transport is idle.Solution
#8964 added a
ReadyReaderinterface that allows non-memory-pinning reads. This PR replaces the standardbufio.Readerwith a customio.Readerimplementation that uses pooled buffers and releases the buffer once all data is consumed.To defer the re-allocation of the read buffer, the reader calls
ReadOnReadyon the underlyingio.Reader. For this to work, the underlyingio.Readermust implement either theReadyReaderinterface orsyscall.RawConn. If neither condition is met, the framer gracefully falls back to using the regularbufio.Reader.Additional Changes:
ReadyReaderinterface.mem.BufferPoolinterface, allowing the pools to be shared across both read and write operations.syscall.Readinstead ofunix.Readto avoid triggering the race detector, see comment for details.Benchmarks
In a real-world benchmark, where a GCS directpath client downloads a file in a loop, the average "in use" memory falls from 28.3MB to 21.3MB (-24%).
Local Benchmarks show no significant difference
RELEASE NOTES:
GRPC_GO_EXPERIMENTAL_HTTP_FRAMER_READ_BUFFER_POOLING=falseand report any issues.