rpc, exec: wire receipts.Generator to shared derive package, add RCacheV2 fast path#20485
Conversation
| if len(receipt.Logs) > 0 { | ||
| receipt.FirstLogIndexWithinBlock = uint32(receipt.Logs[0].Index) | ||
| } else if i > 0 { | ||
| receipt.FirstLogIndexWithinBlock = receipts[i-1].FirstLogIndexWithinBlock + uint32(len(receipts[i-1].Logs)) |
There was a problem hiding this comment.
i guess this logic needs to be in receipts.DeriveFields method
| select { | ||
| case <-ctx.Done(): | ||
| evm.Cancel() | ||
| case <-txDone: |
There was a problem hiding this comment.
intent of txDone channel is unclear
yperbasis
left a comment
There was a problem hiding this comment.
Issues
- Off-by-one in RCacheV2 txNum computation (Medium — cosmetic, not a bug)
In DerivePriorReceipts:
// txNum for user tx i is blockStartTxNum + 1 (system tx) + i
txNum := blockStartTxNum + 1 + uint64(i)
The comment claims blockStartTxNum is the system tx's txNum and adds +1 to skip past it. But tracing the callers, blockStartTxNum = TxNum - TxIndex. From exec3_serial.go:145-170, tasks are assigned with txIndex
starting at -1:
system start tx: txNum = B, txIndex = -1
user tx 0: txNum = B+1, txIndex = 0
user tx i: txNum = B+1+i, txIndex = i
system end tx: txNum = B+N+1, txIndex = N
So TxNum - TxIndex = (B+1+i) - i = B+1 — that's already user tx 0's txNum, not the system tx's. The +1 in the formula double-counts the system tx offset, making the lookup at B+2+i instead of B+1+i.
This is not a functional bug because HistorySeek uses "at or before" semantics: seeking at B+2+i still finds the receipt written at B+1+i. But the comment and intent are misleading. The correct formula should
be:
txNum := blockStartTxNum + uint64(i)
- Missing EVM cancellation watcher in fast path (Medium)
The original RPC GetReceipts had a goroutine per transaction that called evm.Cancel() on context cancellation, enabling mid-opcode abort:
go func() {
select {
case <-ctx.Done():
evm.Cancel()
case <-txDone:
}
}()
The new fast path via DeriveBlockReceipts → DeriveForRange only checks ctx.Done() between transactions. A single gas-heavy transaction (e.g., a large contract deployment) won't be interrupted until it
completes, even if the RPC timeout fires. For blocks with expensive transactions, the fast path is less responsive to timeouts than the old code.
The slow path (pre-Byzantium) retains the cancel watcher, so this only affects post-Byzantium blocks through the RPC layer. Consider adding the cancel-watcher pattern into DeriveForRange, or documenting that
RPC callers accept this trade-off.
- execution/receipts now depends on db/rawdb (Design observation)
PR #20467's execution/receipts/derive.go was pure execution logic (no DB dependencies). This PR adds imports for db/kv and db/rawdb to support the RCacheV2 lookup. This means the shared package is no longer
purely about replay — it also knows about the persistent cache layer. An alternative would be to keep cache lookup in callers and pass cached receipts into DerivePriorReceipts via a callback/option. Not
blocking, but worth considering for maintainability.
When execution resumes mid-block from a snapshot boundary, Finalize needs the complete receipt set for requests hash computation (deposit extraction from logs). Previously this was attempted with inline transaction replay that had multiple bugs (wrong task, unsafe type assertion, missing Release). Fix: extract receipt derivation into execution/receipts/ — a shared package that both the execution pipeline and RPC layer can use. The new receipts.DerivePriorReceipts() replays transactions 0..startIdx-1 using a HistoryReaderV3 and returns their receipts. Both serial and parallel executors now call DerivePriorReceipts when startTxIndex > 0, prepending the result before passing to Finalize. Fixes #20452
…heV2 fast path RPC receipts Generator (GetReceipts): post-Byzantium blocks now use execution/receipts.DeriveBlockReceipts on cache miss instead of inline replay. Pre-Byzantium path (needs commitment history for post-state root) stays inline. This unifies the replay logic with the execution pipeline. execution/receipts.DerivePriorReceipts: try RCacheV2 (persistent receipt cache) first before falling back to transaction replay. If all prior receipts are already cached from a previous execution run, no replay is needed at all.
…ument txDone channel Move the BlockHash and FirstLogIndexWithinBlock fixup logic from the RPC generator into receipts.DeriveFields in the shared derive package, so both the fast path (post-Byzantium) and slow path (pre-Byzantium) use it. Add a comment explaining the txDone channel's role as a cancellation bridge between context cancellation and EVM abort. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Address review feedback from @AskAlexSharov: - DeriveForRange now sets BlockHash and FirstLogIndexWithinBlock on each receipt, so callers don't need inline fixup code - Remove duplicate fixup from RPC Generator's fast path - Add comment explaining the txDone cancel-watcher goroutine pattern in the pre-Byzantium slow path
…watcher, document txDone Three review comments addressed: 1. Off-by-one in RCacheV2 txNum (yperbasis): blockStartTxNum is already the first user tx's txNum (B+1), not the system tx (B). The +1 in `blockStartTxNum + 1 + i` double-counted the system tx offset. Fixed to `blockStartTxNum + i`. Not a functional bug (HistorySeek uses at-or-before semantics) but the comment and intent were wrong. 2. Missing EVM cancel watcher in fast path (yperbasis): DeriveForRange now spawns a per-tx goroutine that calls evm.Cancel() on context cancellation, matching the pattern in the slow (pre-Byzantium) path. Without this, a gas-heavy transaction would run to completion even after the RPC timeout fires. 3. txDone channel intent (AskAlexSharov): added comment explaining that txDone signals the cancel-watcher goroutine to exit after normal tx completion, preventing goroutine leaks. Note: yperbasis observation #3 (execution/receipts depends on db/rawdb) is acknowledged as a design trade-off. The RCacheV2 fast path in DerivePriorReceipts adds the dependency. An alternative (callback-based cache injection) is cleaner but more complex. Keeping as-is for v1.
985dd8d to
dd09776
Compare
MemoryMutation.HistorySeek and HistoryRange were left as
`panic("not supported")` placeholders with the correct delegation
commented out. Every sibling temporal-read method on MemoryMutation
already delegates to the backing tx — GetLatest, GetAsOf, RangeAsOf,
IndexRange, HistoryStartFrom. The two outliers just hadn't been
wired when nothing called them.
A caller appeared in PR #20485 (RCacheV2 fast path in
DerivePriorReceipts): when a serial-executor block is reached via
ExecModule.updateForkChoice, reads flow through the block overlay
(MemoryMutation). ReadReceiptCacheV2 issues a HistorySeek on the
RCacheDomain to read a receipt for a txNum strictly older than the
block being processed. That panics the executor at
execution/stagedsync/exec3_serial.go:402 via DerivePriorReceipts.
Delegating to the backing tx is correct, not a workaround:
- The overlay never writes to history / domain tables
(grep -rnE "indexKeys|historyKey|InvertedIndexWriter|DomainBufferedWriter"
in db/kv/membatchwithdb/ is empty). The overlay's scope is
block-level metadata: headers, TDs, bodies, canonical hashes,
BAL bytes (see execution/execmodule/inserters.go).
- History writes take the DomainBufferedWriter / SharedDomains path,
which is a separate layer. Any HistorySeek routed through the
overlay is asking for data that lives exclusively in committed
DB state or frozen snapshot files — the base tx is authoritative.
- The commented-out line in the original code already contained the
correct implementation.
Repro: wipe chaindata/ while retaining snapshots/ (post-Fusaka
mainnet) and restart. The first forkchoice-triggered execution
resumes mid-block (startTxIndex > 0), hits
DerivePriorReceipts -> ReadReceiptCacheV2 -> HistorySeek -> panic.
After this change the overlay forwards both calls to m.db and the
sync makes progress.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tx (#20624) ## Summary - Fix a `panic(\"not supported\")` that aborts post-Fusaka mainnet sync resumed from snapshots-only (chaindata wiped, snapshots kept). - `MemoryMutation.HistorySeek` and `HistoryRange` were placeholders; the correct delegation was already commented out. This PR just flips them to delegate to the backing tx, matching every sibling temporal-read method. ## Repro On a post-Fusaka mainnet datadir, keep `snapshots/`, wipe `chaindata/`, and start `erigon`. The first forkchoice-triggered execution resumes mid-block (`startTxIndex > 0`) and takes the partial-block receipt reconstruction path added in #20452. That calls `receipts.DerivePriorReceipts`, which on top of the RCacheV2 fast path (#20485) issues `tx.HistorySeek(kv.RCacheDomain, …)`. When the containing executor runs through `ExecModule.updateForkChoice`, the tx is the block overlay (`MemoryMutation`), and `HistorySeek` panics: ``` panic: not supported goroutine … [running]: …/db/kv/membatchwithdb.(*MemoryMutation).HistorySeek .../db/kv/membatchwithdb/memory_mutation.go:872 …/db/rawdb.ReadReceiptCacheV2 .../db/rawdb/accessors_chain.go:1276 …/execution/receipts.DerivePriorReceipts .../execution/receipts/derive.go:193 …/execution/stagedsync.(*serialExecutor).executeBlock.func2 .../execution/stagedsync/exec3_serial.go:402 … …/execution/execmodule.(*ExecModule).updateForkChoice .../execution/execmodule/forkchoice.go:473 ``` ## Why delegation is correct (not a workaround) - Every other temporal-read method on `MemoryMutation` already delegates to the backing tx: `GetLatest`, `GetAsOf`, `RangeAsOf`, `IndexRange`, `HistoryStartFrom`. The two outliers just hadn't been wired. - The block overlay never writes to history / domain tables — its scope is block-level metadata (headers, TDs, bodies, canonical hashes, BAL bytes, see `execution/execmodule/inserters.go`). History writes flow through `DomainBufferedWriter` / `SharedDomains`, not through the overlay. `grep -rnE \"indexKeys|historyKey|InvertedIndexWriter|DomainBufferedWriter\" db/kv/membatchwithdb/` is empty. - Any `HistorySeek` routed through the overlay is therefore asking for data that lives exclusively in committed DB state or frozen snapshot files. The base tx is authoritative; the overlay has nothing to add. - The original author even left the correct call commented out on the same line. ## Test plan - [x] `make lint` — 0 issues - [x] `make erigon` — builds clean - [x] `go test -short ./db/kv/membatchwithdb/...` — passes - [ ] Manual: post-Fusaka mainnet, wipe `chaindata/` / keep `snapshots/`, restart — sync now progresses past the first forkchoice-triggered execution instead of panicking (next hit in our testing was the unrelated stale-read bug tracked in follow-up). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com>
Summary
Follow-up to #20467. Wires the RPC receipts Generator to the shared
execution/receipts/package and adds an RCacheV2 fast path in the execution pipeline.Changes
rpc/jsonrpc/receipts/receipts_generator.goGetReceiptspost-Byzantium path: replaced inline replay loop withexecution/receipts.DeriveBlockReceipts()callexecution/receipts/derive.goDerivePriorReceipts: tries RCacheV2 (persistent receipt cache) first before falling back to transaction replayFinal structure
Test plan
make lintpassesmake erigon integrationbuildsDepends on: #20467 (partial block receipt fix)