Implement debug_executionWitness#20205
Conversation
Replace string(common.FromHex(addr.Hex()[2:] + key.Hex()[2:])) with string(append(addr.Bytes(), key.Bytes()...)) in touchAllKeys and buildExpectedPostState. The hex roundtrip adds unnecessary allocations and is fragile with zero-padded addresses. Co-authored-by: shuo <shuo@erigon.dev>
There was a problem hiding this comment.
Pull request overview
Implements Erigon’s debug_executionWitness JSON-RPC endpoint to produce execution witnesses suitable for zk prover pipelines, leveraging commitment history to support historical blocks.
Changes:
- Add
ExecutionWitnessRPC implementation that executes a block on historical state, records accesses, and builds a witness trie + auxiliary data (codes/keys/headers). - Extend commitment/witness infrastructure to support collapse tracing and witnessing intermediate (hashed) key paths.
- Add trie RLP encode/decode helpers and expand tests/fixtures to cover new witness edge cases.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| rpc/jsonrpc/debug_execution_witness.go | New ExecutionWitness endpoint implementation + stateless verification path |
| rpc/jsonrpc/debug_api.go | Exposes ExecutionWitness on the debug API interface |
| rpc/jsonrpc/debug_api_test.go | Adds ExecutionWitness tests; adjusts existing debug API tests for updated chain |
| rpc/jsonrpc/eth_block_test.go | Updates expected block hash due to changed test chain |
| execution/commitment/hex_patricia_hashed.go | Adds collapse tracer plumbing + fixes/extends witness generation for new edge cases |
| execution/commitment/hex_patricia_hashed_test.go | Adds tests for newly handled witness edge cases and hashed-key touching |
| execution/commitment/commitmentdb/commitment_context.go | Exposes TouchHashedKey, collapse tracer, and custom history reader hooks via commitment context |
| execution/commitment/commitment.go | Adds Updates.TouchHashedKey to witness intermediate trie paths |
| execution/commitment/trie/trie.go | Adds Trie.RLPEncode and RLPDecode helpers for witness serialization |
| execution/commitment/trie/encoding_test.go | Adds encode/decode roundtrip test including accounts + storage subtries |
| execution/commitment/keys_nibbles.go | Adds NibblesToString helper for debugging/logging |
| cmd/rpcdaemon/rpcdaemontest/test_util.go | Extends generated test chain to exercise storage-delete/collapse scenarios |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| // latest block is 11, nil means single-block query | ||
| n2 = rpc.BlockNumber(12) | ||
| _, err = api.GetModifiedAccountsByNumber(m.Ctx, rpc.BlockNumber(11), &n2) |
There was a problem hiding this comment.
In this test, the result of GetModifiedAccountsByNumber is discarded (_, err = ...) but the subsequent require.Len(t, result, 3) still asserts on the previous result value. Assign the returned slice to result (or assert on the returned value) so the test actually validates the 11..12 range call.
| _, err = api.GetModifiedAccountsByNumber(m.Ctx, rpc.BlockNumber(11), &n2) | |
| result, err = api.GetModifiedAccountsByNumber(m.Ctx, rpc.BlockNumber(11), &n2) |
| // Add account keys | ||
| for addr := range allAddresses { | ||
| result.Keys = append(result.Keys, addr.Bytes()) | ||
| } | ||
|
|
||
| // Add storage keys | ||
| for addr, keys := range allStorageKeys { | ||
| for key := range keys { | ||
| // Storage keys are represented as composite keys (address + key) | ||
| compositeKey := append(addr.Bytes(), key.Bytes()...) | ||
| result.Keys = append(result.Keys, compositeKey) | ||
| } |
There was a problem hiding this comment.
result.Keys is built by iterating over maps (allAddresses, allStorageKeys), which produces nondeterministic ordering across runs. For an RPC response this can cause flaky tests/clients and makes witnesses hard to diff/debug. Consider sorting addresses/keys (e.g., lexicographically by bytes) before appending to result.Keys.
| // result.Codes: pre-state bytecodes the stateless verifier needs to execute calls | ||
| codesSeen := make(map[common.Hash]struct{}) | ||
| for _, code := range preStateCode { | ||
| if len(code) > 0 { | ||
| h := crypto.Keccak256Hash(code) | ||
| if _, dup := codesSeen[h]; !dup { | ||
| result.Codes = append(result.Codes, code) | ||
| codesSeen[h] = struct{}{} | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
result.Codes is populated by ranging over preStateCode (a map), so the order of returned bytecodes is nondeterministic even though duplicates are deduped. For stable RPC output, collect into a slice and sort by code hash (or code bytes) before assigning to result.Codes.
| resetToParentState := func() (txNum uint64, blockNum uint64, err error) { | ||
| sdCtx.SetHistoryStateReader(tx, firstTxNumInBlock) | ||
| return domains.SeekCommitment(context.Background(), tx) | ||
| } | ||
|
|
||
| // === STEP 1: Collapse Detection via ComputeCommitment === | ||
| // Detect trie node collapses by running the full commitment calculation for this block. | ||
| // When a FullNode is reduced to a single child (e.g., due to storage deletes), | ||
| // the remaining child's data must be included in the witness for correct | ||
| // state root computation during stateless execution. | ||
| // | ||
| // We only record sibling paths (without building any witness) in this first step, because the grid | ||
| // is mutated during ComputeCommitment and would produce incorrect root hashes. | ||
| var collapseSiblingPaths [][]byte | ||
|
|
||
| // Set up split reader: branch data from parent state, plain state from end of block | ||
| // need withHistory=false to have branch updates written using PutBranch() | ||
| splitStateReader := commitmentdb.NewSplitHistoryReader(tx, firstTxNumInBlock, endTxNum, false /* withHistory */) | ||
| sdCtx.SetCustomHistoryStateReader(splitStateReader) | ||
| if _, _, err := domains.SeekCommitment(context.Background(), tx); err != nil { | ||
| return nil, fmt.Errorf("failed to re-seek commitment for collapse detection: %w", err) | ||
| } |
There was a problem hiding this comment.
These commitment seeks use context.Background() instead of the RPC request ctx, which prevents request cancellation/timeouts from stopping potentially heavy work (commitment seek + collapse detection). Pass ctx through to SeekCommitment (and any other long-running operations) so the RPC call is cancellable.
| if err != nil || blockHeader == nil { | ||
| continue | ||
| } | ||
|
|
||
| headerRLP, err := rlp.EncodeToBytes(blockHeader) | ||
| if err != nil { | ||
| continue |
There was a problem hiding this comment.
Header collection silently ignores missing headers / encoding errors (continue on error or nil header). Since these headers directly affect BLOCKHASH results, returning an incomplete Headers list can yield an invalid witness with a hard-to-diagnose root mismatch. Prefer returning an explicit error when a requested header can't be loaded/encoded.
| if err != nil || blockHeader == nil { | |
| continue | |
| } | |
| headerRLP, err := rlp.EncodeToBytes(blockHeader) | |
| if err != nil { | |
| continue | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to load header for accessed block number %d: %w", bn, err) | |
| } | |
| if blockHeader == nil { | |
| return nil, fmt.Errorf("missing header for accessed block number %d", bn) | |
| } | |
| headerRLP, err := rlp.EncodeToBytes(blockHeader) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to encode header for accessed block number %d: %w", bn, err) |
| func (t *Updates) TouchHashedKey(hashedKey []byte) { | ||
| switch t.mode { | ||
| case ModeDirect: | ||
| dedupKey := string(hashedKey) | ||
| if _, ok := t.keys[dedupKey]; !ok { | ||
| var err error | ||
| if !t.sortPerNibble { | ||
| err = t.etl.Collect(hashedKey, []byte{}) | ||
| } else { | ||
| err = t.nibbles[hashedKey[0]].Collect(hashedKey, []byte{}) | ||
| } | ||
| if err != nil { |
There was a problem hiding this comment.
Updates.TouchHashedKey can panic on empty input: in ModeDirect+sortPerNibble it indexes hashedKey[0] without checking length. Since this is now part of the public commitment API (TouchHashedKey), add a defensive len(hashedKey)==0 guard (and return early or log an error).
| dedupKey := string(hashedKey) | ||
| if _, ok := t.keys[dedupKey]; !ok { | ||
| var err error | ||
| if !t.sortPerNibble { | ||
| err = t.etl.Collect(hashedKey, []byte{}) | ||
| } else { | ||
| err = t.nibbles[hashedKey[0]].Collect(hashedKey, []byte{}) | ||
| } | ||
| if err != nil { | ||
| log.Warn("failed to collect hashed key", "hashedKey", fmt.Sprintf("%x", hashedKey), "err", err) |
There was a problem hiding this comment.
In ModeDirect, TouchHashedKey passes the caller-provided hashedKey slice directly to ETL (Collect) and also uses string(hashedKey) for dedup. If the caller reuses/mutates the slice after calling (possible for scratch buffers), this can corrupt the collected keys. Copy hashedKey before collecting (similar to ModeUpdate which uses common.Copy).
| dedupKey := string(hashedKey) | |
| if _, ok := t.keys[dedupKey]; !ok { | |
| var err error | |
| if !t.sortPerNibble { | |
| err = t.etl.Collect(hashedKey, []byte{}) | |
| } else { | |
| err = t.nibbles[hashedKey[0]].Collect(hashedKey, []byte{}) | |
| } | |
| if err != nil { | |
| log.Warn("failed to collect hashed key", "hashedKey", fmt.Sprintf("%x", hashedKey), "err", err) | |
| hashedKeyCopy := common.Copy(hashedKey) | |
| dedupKey := string(hashedKeyCopy) | |
| if _, ok := t.keys[dedupKey]; !ok { | |
| var err error | |
| if !t.sortPerNibble { | |
| err = t.etl.Collect(hashedKeyCopy, []byte{}) | |
| } else { | |
| err = t.nibbles[hashedKeyCopy[0]].Collect(hashedKeyCopy, []byte{}) | |
| } | |
| if err != nil { | |
| log.Warn("failed to collect hashed key", "hashedKey", fmt.Sprintf("%x", hashedKeyCopy), "err", err) |
| fullPath := make([]byte, 64) | ||
| copy(fullPath[:32], tokenContract2AddrHash) | ||
| copy(fullPath[32:], hashedBalanceKey) | ||
|
|
||
| sameStoragePrefixAddresses = findAddressesWithMatchingStorageKeyPrefix(balanceStorageKeyPath, 1, 1, 1) | ||
| sameStorageKeyPath := computeMappingStorageKey(sameStoragePrefixAddresses[0], 1) | ||
| hashedSiblingKey := crypto.Keccak256(sameStorageKeyPath[:]) | ||
|
|
||
| fullPathSibling := make([]byte, 64) | ||
| copy(fullPathSibling[:32], tokenContract2AddrHash) | ||
| copy(fullPathSibling[32:], hashedSiblingKey) | ||
|
|
There was a problem hiding this comment.
This block declares fullPath / fullPathSibling but never uses them, which will fail Go compilation (unused local variables). Remove these variables or use them (e.g., in an assertion/log) if they are meant to validate trie paths.
| fullPath := make([]byte, 64) | |
| copy(fullPath[:32], tokenContract2AddrHash) | |
| copy(fullPath[32:], hashedBalanceKey) | |
| sameStoragePrefixAddresses = findAddressesWithMatchingStorageKeyPrefix(balanceStorageKeyPath, 1, 1, 1) | |
| sameStorageKeyPath := computeMappingStorageKey(sameStoragePrefixAddresses[0], 1) | |
| hashedSiblingKey := crypto.Keccak256(sameStorageKeyPath[:]) | |
| fullPathSibling := make([]byte, 64) | |
| copy(fullPathSibling[:32], tokenContract2AddrHash) | |
| copy(fullPathSibling[32:], hashedSiblingKey) | |
| sameStoragePrefixAddresses = findAddressesWithMatchingStorageKeyPrefix(balanceStorageKeyPath, 1, 1, 1) | |
| sameStorageKeyPath := computeMappingStorageKey(sameStoragePrefixAddresses[0], 1) | |
| hashedSiblingKey := crypto.Keccak256(sameStorageKeyPath[:]) |
| // Enable historical commitment to allow witness generation for historical blocks | ||
| statecfg.EnableHistoricalCommitment() |
There was a problem hiding this comment.
statecfg.EnableHistoricalCommitment() mutates the global statecfg.Schema for the entire test process. Since it isn't restored, it can leak into other tests (especially under parallel execution) and change their assumptions. Consider scoping this via a helper that saves/restores the previous schema or enabling it once in shared test setup.
| // Enable historical commitment to allow witness generation for historical blocks | |
| statecfg.EnableHistoricalCommitment() | |
| // Enable historical commitment to allow witness generation for historical blocks | |
| previousSchema := statecfg.Schema | |
| statecfg.EnableHistoricalCommitment() | |
| t.Cleanup(func() { | |
| statecfg.Schema = previousSchema | |
| }) |
| // Verify the execution witness result by re-executing the block statelessly | ||
| chainCfg, err := api.chainConfig(ctx, tx) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("failed to get chain config: %w", err) | ||
| } | ||
|
|
||
| newStateRoot, _, err := execBlockStatelessly(result, block, chainCfg, fullEngine) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("[debug_executionWitness] stateless block execution failed: %w", err) | ||
| } |
There was a problem hiding this comment.
ExecutionWitness always re-executes the entire block statelessly to self-verify the witness (execBlockStatelessly). This roughly doubles the execution cost/latency of the RPC call and may be prohibitive for prover workloads that call it frequently. Consider making verification optional (build tag, debug flag, or env/config toggle) or sampling it in tests instead of always doing it in production.
address CodeReview comments in #20205
Implements the
debug_executionWitnessRPC endpoint used by zk provers.Closes #18290