rpc: compression with libdeflate #20665
Pull request overview
This PR updates the HTTP RPC compression middleware to use a dual-path gzip strategy: fast one-shot gzip for non-streaming JSON-RPC responses via go-libdeflate, while preserving incremental gzip streaming for streamable RPC methods by switching to stdlib compress/gzip when flushing is detected.
Changes:
- Injects an http.Flusher hook into the per-request context to allow streamable RPC methods to trigger "streaming mode".
- Updates streamable RPC method execution to activate streaming compression before emitting any response bytes.
- Replaces the previous single-path gzip middleware with a buffered (libdeflate) vs streaming (stdlib gzip) implementation using multiple sync.Pools.
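The context hook described in the first two bullets could look roughly like the sketch below. The names `flusherKey`, `withFlusher`, `activateStreaming`, and `countingFlusher` are hypothetical stand-ins chosen for illustration; they are not the PR's actual identifiers or signatures.

```go
package main

import (
	"context"
	"net/http"
)

// flusherKey is a hypothetical private context key (an assumption for this sketch).
type flusherKey struct{}

// withFlusher returns a child context that carries the given http.Flusher.
func withFlusher(ctx context.Context, f http.Flusher) context.Context {
	return context.WithValue(ctx, flusherKey{}, f)
}

// activateStreaming calls Flush on the context-provided hook, if any,
// and reports whether a hook was present.
func activateStreaming(ctx context.Context) bool {
	f, ok := ctx.Value(flusherKey{}).(http.Flusher)
	if !ok {
		return false // no hook injected: nothing is flushed
	}
	f.Flush()
	return true
}

// countingFlusher is a test double that counts Flush calls.
type countingFlusher struct{ calls int }

func (c *countingFlusher) Flush() { c.calls++ }

// demoFlushHook exercises both the "hook present" and "hook absent" cases.
func demoFlushHook() (withHook, withoutHook bool, calls int) {
	cf := &countingFlusher{}
	withHook = activateStreaming(withFlusher(context.Background(), cf))
	withoutHook = activateStreaming(context.Background())
	return withHook, withoutHook, cf.calls
}
```

In this sketch a handler for a streamable method would call `activateStreaming(ctx)` before writing its first byte, and a request without the hook is left untouched.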
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| rpc/http.go | Adds a request context key to carry a flush hook derived from the HTTP response writer. |
| rpc/handler.go | Streamable RPC methods call the context-provided flush hook before writing the JSON-RPC envelope. |
| node/rpcstack.go | Implements a buffered/streaming gzip middleware, introducing libdeflate one-shot compression and a streaming fallback. |
| go.mod | Adds the github.com/erigontech/go-libdeflate dependency. |
| go.sum | Records checksums for the new dependency. |
@lupin012 The pool creates writers with gzip.NewWriter which uses What if reduce our default compression level of
@AskAlexSharov OK, I have changed the compression level of the stdlib (streaming) path from 6 to 1 (BestSpeed), trading compression ratio for speed.
6732a52 to 3b08fc4
…treaming)
- gzipResponseWriter now buffers non-streaming responses and compresses them in one shot with libdeflate for maximum throughput
- adds Flush() method to switch to stdlib gzip streaming mode for methods that produce large/trace responses incrementally
- pools buf, dst slice, compressor and gzip.Writer to avoid per-request allocations
- passes http.Flusher via context (httpFlusherContextKey) so runMethod can activate streaming compression before writing begins

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove local replace directive now that the module is published. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Return dst slice to pool only after w.Write() completes, not before. The previous order caused gzip corruption under high QPS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
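The ordering issue in this commit can be illustrated with a small sketch. If a pooled slice is returned with Put before Write has finished, a concurrent request can Get the same backing array and overwrite it while it is still being read. The names `dstPool`, `writePooled`, and `demoWritePooled` are hypothetical, and a plain copy stands in for compression.

```go
package main

import (
	"bytes"
	"io"
	"sync"
)

// dstPool holds reusable output slices (hypothetical stand-in for the PR's dst pool).
var dstPool = sync.Pool{New: func() any { return make([]byte, 0, 4096) }}

// writePooled fills a pooled slice and writes it to w.
// The slice goes back to the pool only after w.Write has returned.
func writePooled(w io.Writer, payload []byte) error {
	dst := dstPool.Get().([]byte)[:0]
	// The deferred Put runs on function return, i.e. after Write is done with dst.
	// The closure reads dst at that time, so a slice regrown by append is the one returned.
	defer func() { dstPool.Put(dst[:0]) }()

	dst = append(dst, payload...) // placeholder for the real compression step
	_, err := w.Write(dst)
	return err
}

// demoWritePooled writes one payload through the pooled path.
func demoWritePooled() (string, error) {
	var out bytes.Buffer
	err := writePooled(&out, []byte("payload"))
	return out.String(), err
}
```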
- Flush() now flushes gzw + underlying http.Flusher on every call (not just on first activation), so streaming RPC methods deliver output incrementally instead of buffering until Close.
- gzCompressorPool.New no longer panics on libdeflate init failure: logs once via sync.Once and returns nil; handler falls back to stdlib gzip.
- On libdeflate compress error, fall back to stdlib gzip instead of returning http.Error (which would overwrite the JSON-RPC payload).
- httpFlusherContextKey is now injected only by the gzip middleware via WithGzipStreamingHook, not from any generic http.Flusher, preventing premature HTTP header commit (e.g. 200 before 503) when gzip is off.
- gzBufPool and gzDstPool only retain buffers <= gzPoolBufCap (1 MiB) to bound steady-state RSS after large responses.
- stdlib gzip pool uses BestSpeed (level 1) to prioritise latency.
- Extract writeStdlibGzip helper to eliminate duplicated fallback logic.
- Add unit tests covering: non-streaming, streaming, status propagation, large body pool-cap path, Flush activation, and pool threshold.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
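The pool retention cap mentioned above (only buffers up to 1 MiB go back to the pool) might be sketched as follows. `poolBufCap`, `bufPool`, `getBuf`, and `putBuf` are illustrative names that mirror the commit message; the actual code in node/rpcstack.go may differ.

```go
package main

import (
	"bytes"
	"sync"
)

// poolBufCap mirrors the 1 MiB threshold named in the commit message.
const poolBufCap = 1 << 20

// bufPool hands out reusable buffers.
var bufPool = sync.Pool{New: func() any { return new(bytes.Buffer) }}

// getBuf fetches a buffer and resets it, so callers never see stale data.
func getBuf() *bytes.Buffer {
	b := bufPool.Get().(*bytes.Buffer)
	b.Reset()
	return b
}

// putBuf returns b to the pool unless it grew past poolBufCap.
// Oversized buffers are dropped so the garbage collector can reclaim them.
// It reports whether the buffer was retained.
func putBuf(b *bytes.Buffer) bool {
	if b.Cap() > poolBufCap {
		return false
	}
	bufPool.Put(b)
	return true
}

// demoPoolCap shows that a small buffer is retained and an oversized one is not.
func demoPoolCap() (smallKept, largeKept bool) {
	small := getBuf()
	small.WriteString("ok")

	large := getBuf()
	large.Grow(poolBufCap + 1) // simulate a very large response

	return putBuf(small), putBuf(large)
}
```

Without such a cap, a single very large trace response would leave a large buffer in the pool indefinitely, which is the steady-state RSS growth the commit aims to bound.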
Added gzip optimizations for non-streaming RPC responses: skip compression for small payloads (< 1 KB), where the CPU overhead of setting up the compressor outweighs the benefit and — for very small responses — the compressed output can end up larger than the input due to gzip framing overhead; also set Content-Length from the known compressed size when using libdeflate, avoiding unnecessary Transfer-Encoding: chunked. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
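A minimal sketch of the small-payload bypass and the explicit Content-Length follows. It uses stdlib compress/gzip as a stand-in for libdeflate, and `minGzipSize`, `writeMaybeGzipped`, and `demoThreshold` are hypothetical names; only the 1 KB threshold comes from the commit message.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"net/http"
	"net/http/httptest"
	"strconv"
)

// minGzipSize is the assumed threshold below which compression is skipped.
const minGzipSize = 1024

// writeMaybeGzipped sends body uncompressed when it is small, otherwise
// compresses it in one shot and sets Content-Length from the compressed size.
func writeMaybeGzipped(w http.ResponseWriter, body []byte) error {
	if len(body) < minGzipSize {
		w.Header().Set("Content-Length", strconv.Itoa(len(body)))
		_, err := w.Write(body)
		return err
	}
	var out bytes.Buffer
	gzw := gzip.NewWriter(&out) // stdlib stand-in for the libdeflate one-shot call
	if _, err := gzw.Write(body); err != nil {
		return err
	}
	if err := gzw.Close(); err != nil {
		return err
	}
	w.Header().Set("Content-Encoding", "gzip")
	// The full compressed size is known here, so chunked encoding is unnecessary.
	w.Header().Set("Content-Length", strconv.Itoa(out.Len()))
	_, err := w.Write(out.Bytes())
	return err
}

// demoThreshold runs the helper against a recorder with an n-byte body.
func demoThreshold(n int) (encoding string, bodyLen int, lengthMatches bool) {
	rec := httptest.NewRecorder()
	if err := writeMaybeGzipped(rec, bytes.Repeat([]byte("a"), n)); err != nil {
		return "", 0, false
	}
	cl := rec.Header().Get("Content-Length")
	return rec.Header().Get("Content-Encoding"), rec.Body.Len(), cl == strconv.Itoa(rec.Body.Len())
}
```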
- Use libdeflate for one-shot (non-streaming) compression; fall back to stdlib gzip for streaming responses
- Add libdeflateDisabled atomic.Bool to short-circuit pool.Get after first init failure (sync.Pool discards nil, avoiding repeated NewCompressor calls)
- Extract sendGzipResponse and compressLibdeflate sub-functions so all pool returns use defer
- Add getBuf() helper to encapsulate Get+Reset and prevent missed resets
- Add gzDstGrow with append-style 2x capacity growth to amortize reallocs
- Store []byte directly in gzDstPool (was *[]byte)
- Use two defers (LIFO) in sendGzipResponse streaming path so Close runs before Put

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
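Two of the points above lend themselves to a short illustration: doubling growth for the destination slice, and relying on LIFO defer order so that Close runs before Put. The functions `growDst` and `deferOrder` are hypothetical sketches, not the PR's gzDstGrow or sendGzipResponse.

```go
package main

// growDst returns an empty slice with capacity of at least need.
// When the current capacity is too small it doubles it (or jumps straight
// to need if doubling is not enough), so repeated growth is amortized.
func growDst(dst []byte, need int) []byte {
	if cap(dst) >= need {
		return dst[:0] // reuse the existing backing array
	}
	newCap := cap(dst) * 2
	if newCap < need {
		newCap = need
	}
	return make([]byte, 0, newCap)
}

// deferOrder records the order in which two deferred calls run.
// Deferred calls run last-in, first-out, so the one registered second
// ("close") runs before the one registered first ("put").
func deferOrder() []string {
	var order []string
	func() {
		defer func() { order = append(order, "put") }()   // registered first, runs last
		defer func() { order = append(order, "close") }() // registered last, runs first
	}()
	return order
}
```

Registering the pool Put first and the writer Close second means the gzip writer has finished emitting its trailer before it is handed back for reuse.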
eb664e6 to 992ac2b
close #17112
a) Non-streaming responses (standard JSON-RPC calls such as eth_getBlockByNumber):
b) Streaming responses (e.g. debug_traceTransaction, trace_filter) are detected via http.Flusher: when the RPC handler calls Flush() before writing, the middleware switches to stdlib compress/gzip in streaming mode, compressing trace data incrementally without buffering the full response.
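The two paths (a) and (b) can be sketched as a writer that buffers until Flush() is first called and streams through gzip afterwards. This is a simplified illustration under stated assumptions: stdlib compress/gzip stands in for libdeflate on the buffered path, and `dualWriter`, `startGzip`, and `demoDualWriter` are hypothetical names rather than the PR's gzipResponseWriter.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"io"
)

// dualWriter buffers the body until Flush is called, then streams via gzip.
type dualWriter struct {
	out io.Writer    // destination for compressed bytes
	buf bytes.Buffer // body held while in buffered mode
	gzw *gzip.Writer // non-nil once compression output has started
}

// Write routes data to the gzip stream if active, else to the buffer.
func (w *dualWriter) Write(p []byte) (int, error) {
	if w.gzw != nil {
		return w.gzw.Write(p)
	}
	return w.buf.Write(p)
}

// startGzip creates the gzip writer and drains any buffered bytes into it.
func (w *dualWriter) startGzip() error {
	gzw, err := gzip.NewWriterLevel(w.out, gzip.BestSpeed)
	if err != nil {
		return err
	}
	if _, err := gzw.Write(w.buf.Bytes()); err != nil {
		return err
	}
	w.buf.Reset()
	w.gzw = gzw
	return nil
}

// Flush activates streaming mode on first use and flushes on every call.
func (w *dualWriter) Flush() error {
	if w.gzw == nil {
		if err := w.startGzip(); err != nil {
			return err
		}
	}
	return w.gzw.Flush()
}

// Close compresses the buffered body in one shot if Flush was never called,
// otherwise it just terminates the running gzip stream.
func (w *dualWriter) Close() error {
	if w.gzw == nil {
		if err := w.startGzip(); err != nil {
			return err
		}
	}
	return w.gzw.Close()
}

// demoDualWriter writes one body through either path and decompresses it.
// It also reports how many bytes reached the destination before Close.
func demoDualWriter(stream bool) (string, int, error) {
	var out bytes.Buffer
	w := &dualWriter{out: &out}
	if stream {
		if err := w.Flush(); err != nil { // handler signals streaming before writing
			return "", 0, err
		}
	}
	if _, err := w.Write([]byte(`{"jsonrpc":"2.0"}`)); err != nil {
		return "", 0, err
	}
	if stream {
		if err := w.Flush(); err != nil {
			return "", 0, err
		}
	}
	sentBeforeClose := out.Len()
	if err := w.Close(); err != nil {
		return "", 0, err
	}
	zr, err := gzip.NewReader(&out)
	if err != nil {
		return "", 0, err
	}
	body, err := io.ReadAll(zr)
	return string(body), sentBeforeClose, err
}
```

In buffered mode nothing reaches the destination until Close, while in streaming mode compressed bytes are delivered after each Flush.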
Changes:
🚀 Performance Benchmarks: Gzip Optimization
1. Isolated Compression Benchmarks (libdeflate vs stdlib)
Benchmark (Gzip isolation) Latency Throughput Mem Alloc
Note: We observed a ~1.75x speedup on a single thread. Under high concurrency, the advantage is even greater due to reduced CPU overhead.
2. eth_getBlockByNumber with txs (Old SW vs Main SW)
Old SW - Results with instability and errors:
Main SW - Stable results with 100% success rate:
3. trace_block
4. Executes all RPC using http, http-compressed and websockets
./run_all.sh -T http,http_comp,websocket
Run tests in parallel on localhost:8545/localhost:8551
Result directory: /home/simon/silkworm/tests/rpc-tests3/integration/results
Time: 2026-04-19 08:58:05.257624
Total round_trip time: 3:40:01.359044
Total marshalling time: 0:00:00.063308
Total unmarshalling time: 0:01:40.509521
No of json Diffs: 0
Test time-elapsed: 0:08:37.467460
Available tests: 1436
Available tested api: 112
Number of loop: 1
Number of executed tests: 4152
Number of NOT executed tests: 156
Number of success tests: 4152
Number of failed tests: 0