execution/types, engineapi, cl: avoid RLP re-encode of payload txs #21546
Merged
…ad txs
Engine newPayload receives transactions as already-RLP-encoded bytes from
the CL. The current entry path:
1. Decode the raw RLP bytes into types.Transaction values.
2. Construct a *types.Block from those decoded transactions.
3. Call HandleNewPayload -> InsertBlocksAndWaitWithAccessLists.
4. Inside, blocksToRaw calls Block.RawBody(), which loops over
b.transactions and calls rlp.EncodeToBytes(txn) for every tx.
Step 4 re-encodes bytes we already had at step 1. For a ~200-tx mainnet
block at ~5–15 µs per RLP encode that's ~1–3 ms of pure waste per block,
plus the alloc churn (one []byte per tx via rlp's bytes.Buffer pool).
Add Block.rawTransactions [][]byte as a caller-supplied cache, plus a
companion constructor NewBlockFromStorageWithRawTxs. When the cache is
populated (and length-matches the decoded txs), Block.RawBody() returns
the cached bytes directly. engineapi.newPayload wires the raw bytes
from req.Transactions through to the new constructor — both slices
reference the same underlying buffers, no copy.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
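A minimal sketch of the cache-or-fallback shape this commit message describes, under stated assumptions: `txn`, `block`, `rawBodyTxs`, and `encodeCalls` are hypothetical stand-ins, not the Erigon types, and `encode` only imitates a per-transaction `rlp.EncodeToBytes` call so the skipped work can be counted.

```go
package main

import "fmt"

// txn is a hypothetical stand-in for a decoded transaction.
type txn struct{ payload []byte }

// encodeCalls counts fallback encodes so the effect of the cache is visible.
var encodeCalls int

// encode imitates the per-transaction encode call (assumption: it simply
// copies the payload; the real encoder walks the transaction fields).
func (t txn) encode() []byte {
	encodeCalls++
	out := make([]byte, len(t.payload))
	copy(out, t.payload)
	return out
}

// block is a simplified stand-in for a block carrying an optional cache.
type block struct {
	transactions    []txn
	rawTransactions [][]byte // caller-supplied cache; nil means no cache
}

// rawBodyTxs returns the cached slices when the cache is populated and
// length-matches the decoded transactions; otherwise it re-encodes each one.
func (b *block) rawBodyTxs() [][]byte {
	if b.rawTransactions != nil && len(b.rawTransactions) == len(b.transactions) {
		return b.rawTransactions // same underlying buffers, no copy
	}
	out := make([][]byte, len(b.transactions))
	for i, t := range b.transactions {
		out[i] = t.encode()
	}
	return out
}

func main() {
	wire := [][]byte{{0xc1, 0x01}, {0xc1, 0x02}}
	txs := []txn{{payload: wire[0]}, {payload: wire[1]}}

	cached := &block{transactions: txs, rawTransactions: wire}
	cached.rawBodyTxs()
	fmt.Println("encodes with cache:", encodeCalls) // 0

	plain := &block{transactions: txs}
	plain.rawBodyTxs()
	fmt.Println("encodes without cache:", encodeCalls) // 2
}
```

The length check means a mismatched cache falls back to encoding rather than returning a body with the wrong number of transactions.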
The rawTransactions cache added on this branch holds each tx's canonical EIP-2718 binary encoding (the engine_newPayload wire form). RawBody() returned those bytes verbatim, but rlp.EncodeToBytes (the path it replaced) wraps a typed tx in an outer RLP string, so for typed txs the cached output was shorter than before by that string prefix.

That under-counts the block size in RawBlock.ValidateMaxRlpSize, so a block over the EIP-7934 RLP block-size limit validated as VALID instead of being rejected (the EEST enginex RLP_BLOCK_LIMIT_EXCEEDED failures). It also diverged the bytes persisted to kv.EthTx (copied verbatim into snapshots) from every other ingestion path, which all store the wrapped form.

Reconstruct the rlp.EncodeToBytes form from the cached bytes instead: a cheap RLP-string re-wrap for typed txs, unchanged for legacy txs. The encode-skip optimization is preserved (no per-tx field walk) while the output is byte-identical to the non-cached path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
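The re-wrap can be illustrated with a short sketch. This is not the code from the PR: `wrapRLPString` and `blockBodyForm` are hypothetical names, and the logic assumes only the standard RLP string rules plus the EIP-2718 convention that a typed transaction starts with a type byte in `0x00..0x7f` while a legacy transaction starts with an RLP list prefix (`>= 0xc0`).

```go
package main

import (
	"errors"
	"fmt"
)

// wrapRLPString encodes b as an RLP string (hypothetical helper).
func wrapRLPString(b []byte) []byte {
	if len(b) == 1 && b[0] < 0x80 {
		return b // a single byte below 0x80 is its own RLP encoding
	}
	if len(b) <= 55 {
		out := make([]byte, 0, 1+len(b))
		out = append(out, 0x80+byte(len(b)))
		return append(out, b...)
	}
	// Long string: 0xb7 + length-of-length, then the big-endian length.
	var lenBytes []byte
	for n := len(b); n > 0; n >>= 8 {
		lenBytes = append([]byte{byte(n)}, lenBytes...)
	}
	out := make([]byte, 0, 1+len(lenBytes)+len(b))
	out = append(out, 0xb7+byte(len(lenBytes)))
	out = append(out, lenBytes...)
	return append(out, b...)
}

// blockBodyForm turns a binary (EIP-2718) tx encoding into the form used
// inside a block body: legacy txs unchanged, typed txs wrapped as a string.
func blockBodyForm(binary []byte) ([]byte, error) {
	if len(binary) == 0 {
		return nil, errors.New("empty transaction")
	}
	switch {
	case binary[0] >= 0xc0: // legacy: already an RLP list
		return binary, nil
	case binary[0] <= 0x7f: // typed: type byte followed by payload
		return wrapRLPString(binary), nil
	default:
		return nil, fmt.Errorf("unexpected first byte 0x%02x", binary[0])
	}
}

func main() {
	typed := []byte{0x02, 0xc1, 0x01} // type byte 0x02 plus a toy payload
	legacy := []byte{0xc1, 0x01}      // toy RLP list
	for _, b := range [][]byte{typed, legacy} {
		out, err := blockBodyForm(b)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%x -> %x\n", b, out) // 02c101 -> 8302c101, then c101 -> c101
	}
}
```

The wrapped typed transaction is one or more bytes longer than its binary form, which is the size difference the commit message says was missed by the block-size check.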
…nary", not "raw"

"raw" was ambiguous here: Block.RawBody() returns the RLP-string-wrapped form, while the cache added on this branch holds the unwrapped form. Rename the unwrapped-form identifiers to "binary", matching the existing BinaryTransactions type:

  rawTransactions -> binaryTransactions
  NewBlockFromStorageWithRawTxs -> NewBlockFromStorageWithBinaryTxs
  rlpFromCanonicalTxn -> rlpFromBinaryTxn

No behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Use the existing BinaryTransactions type for the cache field and the NewBlockFromStorageWithBinaryTxs parameter instead of bare [][]byte, so the type itself documents the encoding. [][]byte stays assignable to it, so the engine_server.go caller is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
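The assignability point follows from Go's rule that a value of an unnamed type can be assigned to a named type with the same underlying type. A small sketch, assuming `BinaryTransactions` is declared over `[][]byte` (the declaration below is an assumption) and using a hypothetical `countBinaryTxs` in place of the real constructor:

```go
package main

import "fmt"

// BinaryTransactions is assumed here to be a named type over [][]byte.
type BinaryTransactions [][]byte

// countBinaryTxs is a hypothetical function taking the named type.
func countBinaryTxs(txs BinaryTransactions) int {
	return len(txs)
}

func main() {
	// The caller holds a bare [][]byte, as an unchanged call site would.
	raw := [][]byte{{0x02, 0xc0}, {0xc0}}

	// No explicit conversion is needed: [][]byte is an unnamed type whose
	// underlying type matches BinaryTransactions.
	fmt.Println(countBinaryTxs(raw)) // 2
}
```

The assignment copies only the slice header, so the named value still refers to the caller's buffers.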
Pull request overview
Avoids redundant RLP re-encoding of transactions in engine_newPayload by threading the original raw tx byte slices from the CL through a new optional cache on *types.Block, which RawBody() consults before falling back to per-tx rlp.EncodeToBytes. Yields ~7% off median engine_newPayloadV4 wall time on mainnet tip.
Changes:
- Adds `binaryTransactions BinaryTransactions` field on `Block` plus `NewBlockFromStorageWithBinaryTxs` constructor; `RawBody()` uses cheap `rlpFromBinaryTxn` rewrap when the cache is populated.
- Wires the engine API's `newPayload` to construct the `Block` with the raw `txs` slice as the cache.
- Adds a unit test verifying `RawBody()` from the cache matches the encoded reference across legacy, AccessList, DynamicFee, Blob, and SetCode tx types.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| execution/types/block.go | New binaryTransactions cache, constructor, and RawBody() fast path via rlpFromBinaryTxn. |
| execution/types/block_test.go | Test ensures cached RawBody() matches the reference encoding for all tx types. |
| execution/engineapi/engine_server.go | Builds the block with the raw CL tx bytes attached as the cache. |
…t paths

The RawBody() encode-skip optimization previously reached only the engine API path (engine_server.newPayload), so external CLs and the EnableEngineAPI embedded Caplin benefited, but the default embedded Caplin did not: ExecutionClientDirect.NewPayload and the historical block_collector both build the block with plain NewBlockFromStorage and then re-encode every tx in RawBody() during insertion. Both already hold the binary tx encodings (body.Transactions, just decoded), so pass them to NewBlockFromStorageWithBinaryTxs.

Behavior-neutral: RawBody() output is byte-identical (covered by TestBlockRawBodyFromBinaryTxsMatchesEncoded).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
taratorio approved these changes Jun 3, 2026
domiwei approved these changes Jun 3, 2026
sudeepdino008 added a commit that referenced this pull request Jun 3, 2026
- Merge `origin/main` (up to #21546) into the `performance` branch.
- Conflicts resolved by taking main's finalized form where the perf branch was behind (`ExistenceFilterVersion`→1 per #21164, `mdbx-go`→v0.40.1, `merge.go` `findMergeRangeInFiles` refactor, dropped `erigon-snapshot` module dep, fusefilter deferred-close refactor, `Versions.MustSupport`, atomic per-key prune throttle).
- Adopted main's collation-at-tip design (`CollateAndPrune` in the FCU path, #21398/#21415) and removed the perf branch's older `frozenBlocks`-gating (`SetFrozenBlocksProvider`/`MaxCollatableTxNum`, `db/services/snapshot_progress.go`, its gating tests, and callers).
- Verified: `make erigon integration` build, `make lint` (clean), `make test-short` (green).
Summary
When a block enters the execution layer (via `engine_newPayload` from an external CL, or via Caplin's internal-CL insert paths), each transaction arrives as its binary (canonical EIP-2718) encoding. We decode those to build a `*types.Block`, then a few stack frames later `Block.RawBody()` re-encodes every transaction with `rlp.EncodeToBytes` to form the body for insertion. That is redundant work, since the bytes were already in hand at entry.

This PR caches the binary encodings on the `*types.Block` and reuses them:

- `binaryTransactions BinaryTransactions` field on `Block` (`nil` = no cache; existing call sites unchanged).
- `NewBlockFromStorageWithBinaryTxs`, used at the `engine_newPayload` call site and Caplin's `ExecutionClientDirect` + historical `block_collector`. The cache aliases the caller's existing buffers, so there is no copy.
- `Block.RawBody()` rebuilds each transaction's canonical block-body RLP from the cached bytes: typed txs are re-wrapped in their EIP-2718 RLP-string envelope, legacy txs are unchanged. Byte-identical to the `rlp.EncodeToBytes` output it replaces, but skips the per-tx struct walk; covered by `TestBlockRawBodyFromBinaryTxsMatchesEncoded`.

Performance
Matched-pair A/B between this PR's head (`28e99358`) and main pinned at `a5a464c8132c`, fed the same mainnet payloads via engine-mux. Each EL pinned to 4 P-cores / 8 hyperthreads on i9-13900KS, n=201 both-VALID `engine_newPayloadV4` matched pairs from a steady-state tip window:

[Results table for `a5a464c8` (main) vs `28e99358` (PR) lost in extraction; the surviving fragment reports main/pr at p50 = 1.010×.]

Memory (`VmRSS`): main 40.3 GB vs PR 41.3 GB, +1 GB on the PR side, within this rig's run-to-run RAM variance.

Per-tx encode cost is small (~5–15 µs) but mainnet blocks carry ~200 txs, so the cumulative skip is consistent block over block.