StateCache LRU + Mode rework (PR #2 of the perf stack) by mh0lt · Pull Request #21386 · erigontech/erigon

mh0lt · 2026-05-24T11:20:46Z

This PR ships the execution/cache LRU/Mode rework + the StateCache population commits as a follow-on to PR #21380 (State Cache Consolidation). The LRU/Mode rework was always meant to ship separately so the policy change can be reviewed independently of #21380's BranchCache work.

Important

Stacks on #21380. Base is mh/perf-caches-pr, NOT main. Merge order: #21380 → this PR.

Important

Do not merge until CI is green on both parallel and serial.

Scope — 11 commits cherry-picked from `mh/all-stack`

sha (rebased)	source	subject
`cb4443bf51`	`fba4ce8999`	`execution/cache, db/state/execctx`: SD-transparent ethHash bypass for CodeDomain
`d75ec41fcd`	`7d0998d0db`	`execution/cache, db/state, execution/state`: codeSizeCache for EXTCODESIZE / EXTCODEHASH
`77cf879d9a`	`cbe9044e52`	`execution/exec, execution/execmodule`: BlockReadAheader populates cache.StateCache
`67297a5dfe`	`f2d4c3df74`	`execution/state, execution/cache`: stateObject.code populate + addrToHash LRU
`cca736e34d`	`7c3e054063`	`execution/cache, db/state/execctx`: addr → codeHash LRU above SD
`2a21a81608`	`c8f10544c0`	`execution/exec`: cachePopulatingGetter caches negative results
`2eea7d2c61`	`d01a345062`	`execution/cache`: surface fill-and-freeze cliff via inserts/dropped counters
`576c5ade3e`	`8052c84831`	`execution/cache`: replace GenericCache map with sharded LRU + Mode
`8e239f3518`	`6b785d4360`	`execution/cache`: STATE_CACHE_MODE env override at NewStateCache time
`ad9f74c897`	`c55128565a`	`execution/cache`: correct the LFU rationale in Mode docstring
`266e2979bd`	`f80655f6d2`	`execution/cache`: reduce default cache caps to 100 MB each (bench knob)

One commit deferred

The 12th commit on the original handoff list — 66bcc44702 (BAL-driven BlockStateCache prewarm) — has been dropped from this PR because it depends on the execution/balcache package, which is introduced by PR-A (eth/71 BAL wire protocol) off main. It will be reintroduced as a small follow-up PR once both this PR and PR-A have merged.

🤖 Generated with Claude Code

…CodeDomain

Adds a third map (`ethHashToCode`) to CodeCache, keyed by the 32-byte Ethereum codeHash (keccak256). New methods `GetByEthHash` and `PutWithEthHash` expose direct L2b access without going through the addr→maphash→code two-level path. The byte storage duplicates L2 in the worst case (2x code-bytes memory at the cap); accepted for the per-key fast path on many-addrs-one-code workloads.

`SharedDomains.GetLatest(CodeDomain, ...)` consults L2b transparently: when the addr-keyed cache misses, resolve the codeHash from the AccountsDomain (typically warm because the EVM just loaded the account), probe `stateCache.GetCodeByHash` before falling through to the file accessor stack. On miss, fill both L1 and L2b via PutCodeWithHash. The fast path is unchanged.

Workload shape this targets: many addresses sharing one codeHash (proxies, factory-deployed clones, ERC-20 holders, OpenZeppelin templates). Today's addr-keyed cache misses on every fresh address even when the bytecode is already cached. With this change a single L2b entry serves N addresses after the first population.

Microbench results:
- BenchmarkCodeCache_GetByEthHash_Hit: 17.01 ns/op
- BenchmarkCodeCache_GetByEthHash_Miss: 15.45 ns/op
- BenchmarkCodeCache_Get_AddrLevel_Hit: 32.44 ns/op (existing)
- BenchmarkCodeCache_GetByEthHash_ManyAddrs: 17.02 ns/op

L2b hit is ~2x faster than the existing two-level addr path (one map probe vs two), and enables hits on workloads where L1 would miss.

Cross-client research at agentspecs/cross-client-state-access-2026-05-14.md notes geth's separate codeSizeCache as the further (geth-proven) win for EXTCODESIZE/EXTCODEHASH and addrToHash LRU as a one-line behaviour fix; both queued as follow-up surgical commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
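The lookup order described in this commit can be sketched roughly as follows. This is a simplified illustration, not the PR's `CodeCache`: the `codeCache` type, the `getCode` helper and the `resolveHash` callback (standing in for the AccountsDomain lookup) are hypothetical names, and plain Go maps replace the real backing stores.

```go
package main

import "fmt"

// Address and Hash are simplified stand-ins for the 20-byte address and the
// 32-byte keccak256 codeHash.
type Address [20]byte
type Hash [32]byte

// codeCache is a hypothetical two-layer cache used only for illustration.
type codeCache struct {
	addrToHash    map[Address]Hash // addr-keyed layer
	ethHashToCode map[Hash][]byte  // L2b: one entry per distinct bytecode
}

func newCodeCache() *codeCache {
	return &codeCache{
		addrToHash:    make(map[Address]Hash),
		ethHashToCode: make(map[Hash][]byte),
	}
}

// PutWithEthHash records the bytecode under both the address and its codeHash.
func (c *codeCache) PutWithEthHash(addr Address, h Hash, code []byte) {
	c.addrToHash[addr] = h
	c.ethHashToCode[h] = code
}

// GetByEthHash is the single-probe lookup keyed by codeHash.
func (c *codeCache) GetByEthHash(h Hash) ([]byte, bool) {
	code, ok := c.ethHashToCode[h]
	return code, ok
}

// getCode tries the addr-keyed layer first. On a miss it resolves the codeHash
// through resolveHash and probes the hash-keyed layer.
func (c *codeCache) getCode(addr Address, resolveHash func(Address) (Hash, bool)) ([]byte, bool) {
	if h, ok := c.addrToHash[addr]; ok {
		if code, ok := c.GetByEthHash(h); ok {
			return code, true
		}
	}
	h, ok := resolveHash(addr)
	if !ok {
		return nil, false
	}
	return c.GetByEthHash(h)
}

func main() {
	c := newCodeCache()
	proxyHash := Hash{0xaa}
	c.PutWithEthHash(Address{1}, proxyHash, []byte{0x60, 0x80})

	// Address{2} was never cached, but its account record carries the same codeHash.
	resolve := func(Address) (Hash, bool) { return proxyHash, true }
	code, ok := c.getCode(Address{2}, resolve)
	fmt.Println(len(code), ok)
}
```

The point of the sketch is the second probe: a fresh address misses the addr-keyed layer but still finds bytecode that another address already loaded.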

…SIZE / EXTCODEHASH

Adds a third caching layer to CodeCache (alongside L1 addr→maphash and L2b ethHash→bytes): codeSizeByEthHash maps the 32-byte Ethereum codeHash to its byte length. Tiny per-entry footprint (32B key + 8B value vs 5-10 KB for full bytes) so the same memory budget gives ~1000x the hit surface. Capped at 1M entries (geth core/state/database_code.go uses the same size).

EXTCODESIZE / EXTCODEHASH callers (historically the slowest opcodes on the lab dashboard's bench) answer from a single map probe without paying the file accessor stack cost of the full bytes. Geth-proven; cross-client writeup at agentspecs/cross-client-state-access-2026-05-14.md notes this as the largest single available win for the synthetic bench.

Wiring:
- CodeCache.GetCodeSizeByEthHash / PutCodeSizeByEthHash: direct accessors.
- PutWithEthHash now populates the size layer alongside L2b, so every bytes-load creates a future fast-path entry "for free".
- StateCache wrappers GetCodeSizeByHash / PutCodeSizeByHash.
- SharedDomains.GetCodeSize(tx, addr): the SD-transparent fast path. Resolve codeHash via the AccountsDomain cache chain, probe the size cache, then L2b, then file-read+populate. Returns (0, false, nil) for EOAs and no-code accounts without paying any file read.
- temporalGetter.GetCodeSize so callers reach it via the existing getter.
- ReaderV3.ReadAccountCodeSize type-asserts on a codeSizeGetter interface and routes through the fast path when the underlying getter supports it; falls back to GetLatest+len otherwise. No kv.TemporalGetter interface change.

Limitation: capacity is no-op-when-full, not LRU. A separate surgical commit will swap to real LRU eviction; mirrors the addrToHash fix queued from the same cross-client writeup.

Tests: 3 new (PopulatedAlongsideBytes, DirectPutAndGet, EmptyHashOrNegativeIsNoOp). All existing CodeCache tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
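A minimal sketch of the probe order described for the size fast path (size cache, then L2b, then file read plus populate). The `sizeLayers` type, the `readFile` callback and the use of the zero hash to mark EOA or no-code accounts are assumptions made for illustration; the real `SharedDomains.GetCodeSize` signature also takes a tx and returns an error.

```go
package main

import "fmt"

// Hash stands in for the 32-byte Ethereum codeHash.
type Hash [32]byte

// sizeLayers is a hypothetical stand-in for the size layer and the L2b layer.
type sizeLayers struct {
	sizeByHash map[Hash]int
	codeByHash map[Hash][]byte
	fileReads  int // counts simulated file-accessor reads
}

func newSizeLayers() *sizeLayers {
	return &sizeLayers{
		sizeByHash: make(map[Hash]int),
		codeByHash: make(map[Hash][]byte),
	}
}

// codeSize probes the size cache, then the bytes cache, and only then pays for
// a file read, populating both layers on the way out.
// Assumption for this sketch: the zero Hash marks an EOA or no-code account.
func (s *sizeLayers) codeSize(h Hash, readFile func(Hash) []byte) (int, bool) {
	if h == (Hash{}) {
		return 0, false
	}
	if n, ok := s.sizeByHash[h]; ok {
		return n, true
	}
	if code, ok := s.codeByHash[h]; ok {
		s.sizeByHash[h] = len(code)
		return len(code), true
	}
	s.fileReads++
	code := readFile(h)
	s.codeByHash[h] = code
	s.sizeByHash[h] = len(code)
	return len(code), true
}

func main() {
	s := newSizeLayers()
	read := func(Hash) []byte { return []byte{0x60, 0x00, 0x00} }

	first, _ := s.codeSize(Hash{1}, read)
	second, _ := s.codeSize(Hash{1}, read)
	fmt.Println(first, second, s.fileReads)
}
```

Only the first lookup for a given codeHash reaches `readFile`; later lookups are answered from the small size map.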

…e.StateCache

The BlockReadAheader has always prefetched BAL-listed (and access-list) addresses' account/code/storage via a fresh ReaderV3 on a separate RoTx. Its prefetches warmed OS page cache + RoTx cursors, disconnected from the process-global cache.StateCache that SharedDomains.GetLatest probes on the EVM hot path. The two layers were two separate caches; nothing the prefetcher loaded ever reached the EVM's lookup path.

Reth's structural advantage on EXTCODESIZE-loop benches is that its prewarm writes to the same hashmap the EVM reads from (crates/engine/execution-cache/src/cached_state.rs:663). When EVM enters, every BAL-listed addr's first read is a 20 ns cache probe: no file accessor stack, no decompression CPU. PR #21128 swapped this from mini-moka to a lock-free fixed-cache for a measured +10.8 % mgas/s.

This commit closes the equivalent gap on Erigon: a thin cache-populating TemporalGetter wrapper writes successful reads through to cache.StateCache as a side effect. ReaderV3 is unchanged; the wrapper sits in front. When the prefetcher already has the codeHash from a preceding account read, the next CodeDomain read routes through StateCache.PutCodeWithHash so the L2b (ethHash → bytes) + size-cache layers fill alongside the bare addr-keyed L1.

Wiring:
- BlockReadAheader.SetStateCache(*cache.StateCache) setter.
- ExecModule construction calls readAheader.SetStateCache(domainCache), so the same StateCache the FCU/canonical path wires onto SD is the one the prefetcher warms.
- cachePopulatingGetter wraps the worker's ttx; both BAL-warming and tx-warming paths gain the same treatment.

Fgprof on the EXTCODESIZE-EXISTING_CONTRACT-30M bench had shown 95 % of EVM wall-clock in seg.Getter.nextPos (Huffman decompression of code values). With this commit, every BAL-listed addr's lookup should hit the cache and skip the file accessor stack entirely, eliminating the dominant cost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ash LRU

Two surgical commits bundled (both touch the code-read hot path):

1. IntraBlockState.GetCodeSize now loads the full bytes via stateReader.ReadAccountCode on first touch and populates stateObject.code, so subsequent same-addr EXTCODESIZE / EXTCODEHASH / CALL within the tx are in-struct slice-len calls (~50 ns), not full reader round-trips. Mirrors geth's pattern at core/state/state_object.go ~Code(): pay one read per addr per tx, free for the rest.
2. CodeCache.addrToHash switched from a no-op-when-full maphash.Map[versionedAddressID] to an LRU lru.Cache[[20]byte, versionedAddressID] (hashicorp/golang-lru/v2, already imported elsewhere). Cap derived from the existing byte budget at ~28 bytes/entry (~580 k entries for the 16 MB default). Fresh-address workloads (mainnet sees thousands of new addrs per block) now warm up the addr layer over time instead of silently dropping new entries forever; matches geth's lru.Cache at core/state/database_code.go.

The hashToCode layer is unchanged (content-addressed bytes, immutable, byte-capped with new-entry no-op when full; the same semantic as before since code bytes by codeHash never change).

Bench on the EXTCODESIZE-EXISTING_CONTRACT-30M family: 62.34 mgas/s (was 61.50). The marginal gain is small on this bench because BAL prefetch already populates the cache layers; neither lever fires heavily. The expected wins are on non-BAL workloads where EXTCODESIZE-loop patterns repeat within a tx (#1) and fresh-address-churn mainnet blocks fill the addr layer (#2).

Updated TestCodeCache_AddrCapacityLimit to assert LRU eviction (was asserting no-op-when-full); the prior behaviour was the bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Nethermind-style addr → 32-byte codeHash LRU sitting above SharedDomains.codeHashForAddr. When the EVM-known codeHash for an address has already been resolved once, subsequent lookups skip the entire AccountsDomain chain (sd.mem → sd.parent.mem → sd.stateCache → tx.GetLatest) and the account-RLP decode.

Wiring:
- CodeCache adds addrToEthHash *lru.Cache[[20]byte, [32]byte] sized to the existing addrCapacityB budget; methods GetAddrCodeHash / PutAddrCodeHash / DeleteAddrCodeHash.
- StateCache wrappers route to the CodeCache instance.
- SD.codeHashForAddr probes the LRU first; on miss falls through to the existing chain and populates on the way out (including the zero-hash sentinel for missing-or-EOA accounts, so repeat lookups return immediately).
- Invalidation: SD.DomainPut for AccountsDomain drops the entry (CREATE / CREATE2-replace path); SD.DomainDel for AccountsDomain also drops the entry (SELFDESTRUCT); StateCache.RevertWithDiffset drops on unwind.

Helps non-BAL workloads where codeHashForAddr is currently the cold account-domain probe. On the EXISTING_CONTRACT bench (BAL prefetch already populates everything), this is within noise; the lever is for mainnet workloads where many addresses miss the BAL-prefetch window but the cache is warm from prior lookups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The cache-populating wrapper on the read-ahead worker's TemporalTx previously gated cache writes on `len(v) > 0`. That dropped negative results (missing accounts, empty storage slots, no-code probes) on the floor. Repeated probes of the same missing address re-paid the file accessor stack walk every time, instead of hitting a cached negative entry.

Mirrors the revm pattern that drives reth's 1700-3400 mgas/s on account_access NON_EXISTING / EXISTING_EOA variants: revm represents a missing address as a real CacheAccount{ account: None, status: LoadedNotExisting } and reth's ExecutionCache.account_cache uses FixedCache<Address, Option<Account>> where None is a first-class cacheable value. Bottom of the reth path is: BAL prewarm calls basic_account once → returns None → cache hit forever for that addr.

The cycle-2 sweep on account_access[EXTCODESIZE/NON_EXISTING/30M] showed 3.65 → 494 mgas/s without this fix; with the fix the same bench reports 508 mgas/s (within run-to-run noise but trending right). Most of the win was already captured by the readAhead-populates-cache.StateCache wiring (commit cbe9044) and the balcache port (d41e2e8); those raised the cache hit rate on populated entries enough that the EVM rarely fell through to the file accessor on this bench. The fix is mechanically correct regardless and should matter more on workloads with mixed populated / negative probes across blocks.

See agentspecs/reth-missing-eoa-fastpath-2026-05-15.md for the detailed mechanism analysis and the three concrete copy-able patterns from reth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
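The effect of dropping the `len(v) > 0` gate can be illustrated with a small write-through wrapper. This is a sketch under simplifying assumptions: `populatingGetter`, its `cacheNegatives` switch and the string-keyed map are hypothetical and only model the behaviour described for `cachePopulatingGetter`.

```go
package main

import "fmt"

// getter models a read that walks the (expensive) file accessor stack.
type getter func(key string) ([]byte, error)

// populatingGetter is a hypothetical write-through wrapper. With
// cacheNegatives=false it reproduces the old len(v) > 0 gate.
type populatingGetter struct {
	inner          getter
	cache          map[string][]byte
	cacheNegatives bool
	innerCalls     int
}

func (g *populatingGetter) Get(key string) ([]byte, error) {
	// A present key with an empty value is a cached negative result.
	if v, ok := g.cache[key]; ok {
		return v, nil
	}
	g.innerCalls++
	v, err := g.inner(key)
	if err != nil {
		return nil, err
	}
	if len(v) > 0 || g.cacheNegatives {
		g.cache[key] = v
	}
	return v, nil
}

func main() {
	missing := func(string) ([]byte, error) { return nil, nil }

	for _, negatives := range []bool{false, true} {
		g := &populatingGetter{inner: missing, cache: map[string][]byte{}, cacheNegatives: negatives}
		for i := 0; i < 3; i++ {
			if _, err := g.Get("no-such-account"); err != nil {
				panic(err)
			}
		}
		fmt.Printf("cacheNegatives=%v innerCalls=%d\n", negatives, g.innerCalls)
	}
}
```

With the gate in place every probe of the missing key reaches the inner getter; with negatives cached only the first one does.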

…unters

GenericCache.Put has no eviction policy. When the byte budget is reached, new keys are silently dropped until Clear/ClearWithHash/ValidateAndPrepare-mismatch resets the cache. On a long-running node this manifests as a monotonic miss-rate climb that's hard to attribute without instrumentation.

Add two counters next to hits/misses:
- inserts: new keys accepted
- dropped: new keys rejected at the budget check (the existing branch at the new-key cap; not a behaviour change)

PrintStatsAndReset logs both. Sets up the diagnostic baseline before the eviction-policy swap in the follow-up commits on this branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the maphash.Map[T] backing store in GenericCache with freelru.ShardedLRU[uint64, entry[T]] (same lib as db/state/cache.go; already in go.mod).

Adds a Mode constructor flag:
- ModeEvictLRU (default): per-shard LRU evicts the oldest entry on insert when its slot cap is reached. OnEvict drops bytes from currentSize.
- ModeNoOp: preserves the historical fill-and-freeze behaviour (silently drop new keys at the byte cap; counted via dropped). Kept as the diagnostic baseline so the regression bench can compare A/B.

Per-shard eviction is a known trade-off of freelru.ShardedLRU: RemoveOldest is shard-local, not globally LRU. Matches the trade-off db/state/cache.go / execution/cache/code_cache.go / execution/balcache/balcache.go already accept. LFU (W-TinyLFU, the policy reth uses) is scan-resistant by design and would slot in behind the same Mode wrapper as a follow-up; the seam is documented at policy.go.

Key shape: pre-hash via common/maphash.Hash (Go's randomized stdlib hasher, already used by the previous maphash.Map) into uint64; entry stores the full key for collision check. Same pattern as db/state/cache.go.

Byte-budget translation: per-domain avg-entry constants in state_cache.go (avgAccountEntryBytes / avgStorageEntryBytes / avgCommitmentEntryBytes); account / storage are near-fixed sizes so the translation is reliable. capacityBytes becomes a sizing hint plus telemetry (SizeBytes / PrintStatsAndReset). Code domain is unchanged; CodeCache wraps its own LRUs.

Adds metrics: inserts, evictions, dropped, all exposed in PrintStatsAndReset alongside the existing hits / misses / hit_rate. Mode is also logged.

Touches one external call site: execution/vm/contract.go's jumpDestCache now constructs with ModeEvictLRU.

Tests: TestDomainCache_PutCapacityLimit renamed to ..._NoOpMode and asserts the fill-and-freeze contract under explicit ModeNoOp. New TestDomainCache_PutEvictsWhenFull_EvictMode asserts eviction under ModeEvictLRU using a small entry-count cap (the byte→entry translation is approximate; the test uses the entry-count knob via the in-package newGenericCacheEntries constructor to make the assertion deterministic).

Pre-existing lint issues on mh/sd-code-cache (intra_block_state.go nilness, preload_parallel.go prealloc) are surfaced by lint non-determinism but are out of this commit's scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
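The difference between the two modes can be shown with a single-shard, entry-count-capped cache built on the standard library's `container/list`. This is only a sketch: it does not use freelru, it ignores byte budgets and sharding, and the `cache` type and `newCache` constructor are hypothetical names. Only the Mode names and the inserts / evictions / dropped counters come from the commit message.

```go
package main

import (
	"container/list"
	"fmt"
)

type Mode int

const (
	ModeEvictLRU Mode = iota // evict the least recently used entry when full
	ModeNoOp                 // fill-and-freeze: drop new keys when full
)

type item struct {
	key string
	val []byte
}

// cache is a hypothetical single-shard LRU with an entry-count cap.
type cache struct {
	mode     Mode
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element

	inserts, evictions, dropped int
}

func newCache(mode Mode, capacity int) *cache {
	if capacity < 1 {
		capacity = 1
	}
	return &cache{mode: mode, capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

func (c *cache) Get(key string) ([]byte, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*item).val, true
}

func (c *cache) Put(key string, val []byte) {
	if el, ok := c.items[key]; ok { // update in place, no new slot needed
		el.Value.(*item).val = val
		c.order.MoveToFront(el)
		return
	}
	if c.order.Len() >= c.capacity {
		if c.mode == ModeNoOp {
			c.dropped++
			return
		}
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*item).key)
		c.evictions++
	}
	c.items[key] = c.order.PushFront(&item{key: key, val: val})
	c.inserts++
}

func main() {
	for _, mode := range []Mode{ModeEvictLRU, ModeNoOp} {
		c := newCache(mode, 2)
		for _, k := range []string{"a", "b", "c"} {
			c.Put(k, []byte(k))
		}
		_, hasC := c.Get("c")
		fmt.Println(mode, hasC, c.inserts, c.evictions, c.dropped)
	}
}
```

Under ModeEvictLRU the newest key displaces the oldest; under ModeNoOp the newest key is rejected and shows up only in the dropped counter.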

Single env knob read once at NewStateCache. Default ModeEvictLRU, recognised override "noop" (for the regression-bench baseline so ModeEvictLRU and ModeNoOp can be compared on the same binary). Unrecognised values fall back to evict with a warn log. ModeNoOp engagement is logged at info level because the fill-and-freeze behaviour is a deliberate diagnostic state, not a production setting. Pattern matches db/state/cache.go's D_LRU_ENABLED / D_LRU knobs (dbg.EnvString from common/dbg). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
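A rough sketch of the override parsing described above. It reads the variable with `os.Getenv` rather than `dbg.EnvString`, and it assumes exact, case-sensitive matching with only the empty string and "noop" treated as recognised; the commit message does not say whether other spellings are accepted, so `parseMode` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
)

type Mode int

const (
	ModeEvictLRU Mode = iota
	ModeNoOp
)

func (m Mode) String() string {
	if m == ModeNoOp {
		return "noop"
	}
	return "evict"
}

// parseMode maps the raw env value to a Mode. The second result reports
// whether the value was recognised; unrecognised values fall back to evict.
// Assumption: matching is exact and case-sensitive.
func parseMode(raw string) (Mode, bool) {
	switch raw {
	case "":
		return ModeEvictLRU, true
	case "noop":
		return ModeNoOp, true
	default:
		return ModeEvictLRU, false
	}
}

func main() {
	raw := os.Getenv("STATE_CACHE_MODE")
	mode, recognised := parseMode(raw)
	if !recognised {
		fmt.Fprintf(os.Stderr, "warn: unrecognised STATE_CACHE_MODE=%q, using %s\n", raw, mode)
	}
	if mode == ModeNoOp {
		fmt.Println("info: state cache running in fill-and-freeze (noop) mode")
	}
	fmt.Println("mode:", mode)
}
```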

The previous comment asserted "reth uses W-TinyLFU for state caches", which is wrong on the execution hot path. Reth's cross-block state cache is `fixed-cache` (PR #21128, v1.11.0): a lock-free direct-mapped / set-associative array with collision-evict semantics. No LRU list, no LFU sketch. Their published wins (~25% newPayload p50 / +33% gas/s) came from *removing* LRU/LFU bookkeeping, not adding LFU. Where reth uses real LRU/LFU it's deliberate and not the execution cache (schnellru::LruMap for networking; moka in precompile_cache.rs explicitly configured with eviction_policy(EvictionPolicy::lru())).

The docstring now reflects two follow-up policies, both real:
- ModeEvictFixedCache (reth's actual choice, more interesting structural option than LFU)
- ModeEvictLFU (W-TinyLFU; helps mainnet steady-state, not the cycle-2 bloat fixtures which are pure cold scans)

Decision criterion (per agentspecs/lfu-vs-lru-state-cache-decision-2026-05-15.md): ship ModeEvictLFU only if a 24h mainnet replay shows current sharded-LRU hit-rate < 90 % on Account/Storage. Otter is the only credible Go W-TinyLFU library; ristretto has documented correctness bugs and is disqualified for an EL hot path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Investigation knob, NOT a permanent default. Account / Storage / Code each capped at 100 MB so the bench measures layer contributions instead of being dominated by preallocated cache memory pressure (1 GB / 1 GB / 512 MB defaults push sys past the GC/page-cache pressure band on this hardware/workload mix). Permanent defaults stay at 1 GB / 1 GB / 512 MB; this commit will be reverted or dynamically gated by relative-to-available sizing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

This PR ships the parallel-exec correctness fixes from `mh/parallel-exec-fixes` onto the perf stack, packaged as a focused PR on top of [#21386 (StateCache LRU)](#21386) which itself stacks on [#21380 (State Cache Consolidation)](#21380).

> [!IMPORTANT]
> **Stacks on #21386 → #21380.** Base is `mh/perf-statecache-lru-pr`, NOT `main`. Merge order: #21380 → #21386 → this PR.

> [!IMPORTANT]
> **Do not merge until CI is green on both parallel and serial.** Same gating rule as #21380 / #21386.

## Scope: 13 commits from `mh/parallel-exec-fixes`

Brought in via a merge commit so the bisection trail is preserved.

| sha | what it fixes |
|---|---|
| `25053e38e9` | parallel SD-of-pre-existing-contract (the 197-line foundational fix) |
| `2e2bf3ccc0` | clean exit when single-block batch already covered maxBlockNum |
| `6e451f5ed2` | don't emit StoragePath=0 writes from IBS.Selfdestruct |
| `616a4fa0a8` | clear calc Deleted on a non-SD account write even when zero |
| `d99f2f704d` | gate known parallel-exec failures behind EXEC3_PARALLEL (#21136) |
| `34e83e82b7` | install per-block changeset accumulator before any of the block's writes |
| `b340d7e592` | drop stale sd.mem 'Trim old version entries' comment |
| `629cc23566` | O(1) CollectorWrites fee-balance update, drop dead VersionedWrites.SetBalance |
| `a0ecfc7e12` | first-match-wins in CollectorWrites BalancePath index |
| `445f97e446` | emit EIP-7708 Burn log under parallel-exec when coinbase self-destructs |
| `5e1f5fa901` | mirror ReadAccountData SD-revival check into versionedRead |
| `a5dc83f509` | drop two stale EXEC3_PARALLEL t.Skips |
| `8af901104f` | drop TestReceiptHashFromRPC unit-suite RPC integration test |

## Merge conflicts resolved

3 files, 8 regions, all resolved by keeping HEAD's typed-readset / per-path revival shape and confirming HEAD already absorbs each fix's intent. See the merge commit message (`cfc4ec1418`) for the per-region rationale.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Mark Holt <erigon@dev-bm-e3-ethmainnet-n4.erigon.io>

…statecache-lru-pr

…govet) stateObject and s are both verified non-nil earlier in their respective scopes; the secondary checks at lines 749 and 783 are redundant. govet nilness check fails on these.

codeHashForAddr resolves an account's codeHash from the AccountsDomain so the CodeDomain ethHash bypass can serve shared bytecode without an addr-keyed file read. decodeAccountCodeHash decoded the account value with acc.DecodeForStorage, but AccountsDomain values are SerialiseV3-encoded. DecodeForStorage is the legacy MDBX bitmask format with an incompatible binary layout; applied to V3 bytes it silently misparses and leaves CodeHash empty. As a result codeHashForAddr returned nil for every account and the ethHash bypass never engaged for any contract — every CodeDomain read that missed the addr cache fell through to a file read. This is a decoding-correctness bug: the wrong decoder is applied to the stored encoding. Use accounts.DeserialiseV3, the matching decoder for AccountsDomain values. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

sudeepdino008 · 2026-06-05T01:08:12Z

Deterministic gas-used mismatch at mainnet chain tip with state cache on (parallel exec)

Hit a reproducible invalid-block failure while running this branch (fd74979033) on a mainnet --prune.mode=minimal node at chain tip.

Symptom

Parallel catch-up execution (initialCycle=true, FCU path) fails the gas-used check, with execution computing less gas than the header:

invalid block, block=25246401, gas used by execution: 36458218, in header: 36468327   (Δ = −10,109)
  hash=fa7171beb846755c1f3f00d367bc3f6ea76db30513b2a6f4dd2b70f31ab90d88
invalid block, block=25246414, gas used by execution: 18322595, in header: 18366878   (Δ = −44,283)
  hash=a5c088ba8ec28f5007b15567e152e4a7216221b203942a7e2eb4f793f651c4a9

Fully deterministic: the FCU loop retried block 25246401 111 times and 25246414 114 times, producing identical wrong gas values on every retry. Failure stack: exec3_parallel.go:281/656 → exec3.go:264 → stage_execute.go:395 → forkchoice.go:507.

Environment

branch head fd74979033 plus one local commit adding read-only Prometheus counters on the read path (present in both the failing and the passing runs below, so not a factor)
mainnet, minimal datadir freshly synced same day, at tip
EXEC3_PARALLEL=true, 10 workers, --exec.state-cache=true (defaults otherwise)

Control experiment

Restarting the same binary on the same datadir with --exec.state-cache=false executed straight past both blocks on the first attempt and reached tip. For reference, a pre-this-branch build (main from ~Jun 3) ran all day at tip with the state cache enabled and parallel exec without a single invalid block.

So: state cache ON + parallel ⇒ deterministic wrong execution; state cache OFF ⇒ correct.

Notes / hypothesis

Gas is under-counted while the block otherwise fails at the gas check (not a state-root mismatch first), which points at a stale value flowing into a gas-sensitive path — SSTORE current/original-value pricing, or the new CodeDomain L2b hash-bypass / GetCodeSize fast path feeding stale code bytes/size. Both failing blocks were executed inside a multi-block catch-up batch, not single-block tip-following, if that helps narrow the window.

Happy to provide the datadir state, full logs, or run candidate fixes against the same node — currently digging into per-tx gas divergence on an unwound copy.

sudeepdino008 · 2026-06-05T03:41:16Z

Root cause of the gas-mismatch / wrong-trie-root failures: L2b bypass breaks `DomainPut`'s prevVal read on EIP-7702 delegations

Follow-up to my previous comment — fully bisected and root-caused with an offline repro (integration stage_exec with the state cache attached + unwind; fails deterministically at the same block with identical wrong root across 1/10/20 workers, passes with cache off).

Bisection

toggle	result
full state cache	❌ wrong root (also seen as gas-used mismatch at tip)
disable accounts / storage caches	❌ still fails
disable code cache	✅ passes
disable `GetCodeSize` fast path only	❌ still fails
disable the CodeDomain L2b bypass only (`SharedDomains.GetLatest`)	✅ passes

Smoking gun

An assert inside the L2b bypass comparing against the authoritative read fired:

L2b divergence: addr=042201a835f9ab04bb098dee1756bb8a26a2e068
resolvedCodeHash=8b38194773e4314f48b6d8e1c5aef93b68fedc5a427c5e90df8b3f5f68873542
cachedLen=23 dbLen=0
cached=ef010027dbd0e71b85700e29994d6d3a51f2e32442aa61...   ← EIP-7702 delegation designator
stack: ... domain_shared.go (L2b in GetLatest) ← DomainPut prevVal read ← apply loop

Mechanism (EIP-7702 lost-write)

1. Authority X delegates to delegate D earlier in the batch → L2b caches keccak(0xef0100‖D) → designator bytes. That mapping is immutable and shared by every authority delegating to D.
2. Authority Y delegates to the same D later in the batch. Apply order: Y's account record is written first (codeHash = designator hash), then DomainPut(CodeDomain, Y, designator).
3. DomainPut reads prevVal via sd.GetLatest(CodeDomain, Y): mem misses (code not yet written) → L2b bypass resolves the codeHash from Y's freshly written account record (mem hit in codeHashForAddr) → GetCodeByHash hits via X's entry → returns the new designator as the "previous" code. Authoritative prev is empty.
4. The no-change short-circuit (bytes.Equal(prevVal, v)) silently drops the CodeDomain write. Y's designator never lands.
5. Result: wrong trie root, or, when a later tx calls through Y's delegation, wrong execution and the gas-used mismatches reported above.

This explains the intermittency (requires ≥2 authorities delegating to the same delegate within one exec batch — common at tip, not universal) and the full determinism once a qualifying window exists.

Fix directions

The L2b shortcut is only safe for pure reads of committed state; on the DomainPut/DomainDel prevVal path it can observe the same tx's in-flight account write and time-travel the answer. Options:

1. prevVal reads in DomainPut/DomainDel use an internal getLatest variant that skips the L2b bypass (smallest, targeted);
2. codeHashForAddr refuses to resolve from sd.mem/parent-mem (only stateCache/db layers), which keeps the bypass for hot committed accounts and loses it for in-flight ones;
3. writers pass prevVal explicitly for CodeDomain puts where known.

Option 1 seems strictly correct: the prevVal contract is "value before this write", which the hash-resolved shortcut cannot guarantee once the account record has already advanced.

Repro recipe (offline, ~3 min): patch integration stage_exec to attach cache.NewDefaultStateCache() to the SD, unwind ~90 blocks on a recent mainnet datadir, re-exec. Happy to share the exact patch/assert or test a candidate fix against the captured window.

…nchCache + stateCache

The optimization stack and the cache branch had grown two divergent FlushWithCallback methods: one single-domain for the BranchCache (commitment), one all-domains for the StateCache flush-only fix. Merge them into one.

FlushWithCallback is now all-domains: it invokes cb for every (domain, key, latest-value, step) across all domains (sd.domains + sd.storage), then drains the mem-batch, with callback, MDBX flush and drain in one latestStateLock window. SharedDomains.Flush passes a single callback that routes CommitmentDomain → BranchCache and Accounts/Storage/Code → StateCache.

The per-write stateCache.Put/Delete in domainPut/DomainDel are removed: a write is in-flight, fork-specific state living in sd.mem; mirroring it into the process-wide cache let a sibling fork's re-execution read another fork's uncommitted bytes. The cache is now refreshed only on flush, so it mirrors committed, fork-agnostic state. drainLocked empties sd.mem as part of the flush so a child SD chained as parent reads through to the refreshed cache / DB instead of stale bytes.

This folds the cache fix (was 527cb23077 on the cache branch) into the optimization stack so the local group-test exercises the final code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
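The flush-only refresh described in this commit can be sketched as follows. The `toyDomains` type, its string keys and the plain maps standing in for MDBX and the process-wide cache are all hypothetical; only the ordering (callback, flush and drain under one lock, with no cache write on Put) reflects the text.

```go
package main

import (
	"fmt"
	"sync"
)

// toyDomains is a toy stand-in for SharedDomains with a single domain.
type toyDomains struct {
	mu  sync.Mutex
	mem map[string][]byte // in-flight, fork-specific writes
	db  map[string][]byte // committed state
}

func newToyDomains() *toyDomains {
	return &toyDomains{mem: map[string][]byte{}, db: map[string][]byte{}}
}

// Put only touches the in-flight batch; it does not mirror into any shared cache.
func (d *toyDomains) Put(key string, val []byte) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.mem[key] = val
}

// FlushWithCallback reports every pending entry to cb, writes it to db and
// drains the in-flight batch, all while holding the same lock.
func (d *toyDomains) FlushWithCallback(cb func(key string, val []byte)) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for k, v := range d.mem {
		cb(k, v)
		d.db[k] = v
	}
	d.mem = map[string][]byte{}
}

func main() {
	sharedCache := map[string][]byte{} // process-wide cache, refreshed on flush only
	d := newToyDomains()

	d.Put("acct", []byte{1})
	fmt.Println("cached before flush:", len(sharedCache))

	d.FlushWithCallback(func(k string, v []byte) { sharedCache[k] = v })
	fmt.Println("cached after flush:", len(sharedCache), "pending:", len(d.mem))
}
```

Because the shared cache is only written from the flush callback, a sibling fork reading it never observes another fork's uncommitted write.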

mh0lt · 2026-06-05T13:57:27Z

Pushed 355e25f8a0 — execution/cache, stagedsync: don't advance state-cache blockHash during fork-validation.

ci-gate does not auto-run on this PR (it triggers only for base main/release/**/performance, and this PR targets mh/perf-caches-pr), so I manually dispatched the full CI Gate suite (lint, tests, race-tests, eest-spec-tests, hive, hive-eest, caplin, kurtosis, bench, repro) on this branch to get consensus coverage for the change — the state cache feeds execution reads, so EEST/hive matter here.

Dispatched run: https://github.com/erigontech/erigon/actions/runs/27019158872

Add a Collector, owned by the Aggregator (process lifetime), that aggregates KV-read metrics across every read path. Producers fill their own *DomainMetrics lock-free and hand ownership over a buffered channel tagged with a Source (exec/commitment/warmup/rpc/engine); a single collector goroutine folds them into map[Source]*DomainMetrics with no lock or atomics on the aggregate. The goroutine also self-publishes source-labelled Prometheus gauges (kv_read_count / kv_read_duration_ns, labels {source,domain,op}) on a ticker, process-level and independent of whether a block is executing.

Wiring:
- Aggregator owns the Collector: Start in newAggregator (on a.wg), Stop+drain in Close before wg.Wait. Exposed to SharedDomains via the duck-typed kvmetrics.MetricsCollectorProvider on *AggregatorRoTx (same pattern as BranchCacheProvider), so the leaf kvmetrics package stays cycle-free.
- SharedDomains.MergeMetrics(source, wm) now hands a finished worker's metrics to BOTH sinks: the per-batch sd.metrics (under one lock, for the existing log line) and the collector (lock-free, for Prometheus). Ownership of wm transfers to the collector, so producers allocate a fresh instance afterwards.
- Producers tagged: exec workers (SourceExec, per task), the commitment fold (SourceCommitment), trie warmup and concurrent mount (SourceWarmup).
- AsGetterCollected(tx, source) gives concurrent short-lived callers (RPC, engine) a per-getter instance + flush closure; gated on KVReadLevelledMetrics.

The new gauges are additive: the existing exec-scoped mxExec* gauges and the per-batch log line are unchanged. The memBatch put-path (CachePut*) is left on the existing shared aggregate deliberately: those counters are load-bearing for SizeEstimate's flush accounting, so moving them belongs in a separate change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add a request-scoped read accumulator to SharedDomains (StartRequestMetrics(source) / flush at Close) so a single-goroutine RPC handler that reads through the plain AsGetter is metered without a shared accumulator or a per-getter flush. getLatestMetered folds nil-metrics reads into it; this short-circuits for exec workers (which pass their own per-worker instance), so there is no cross-goroutine access to the request accumulator. Wire eth_simulation's SimulateV1 to tag its reads as SourceRPC. Engine block execution is already metered as SourceExec (it runs through the exec workers); SourceEngine and the per-read-getter paths (exec_module CacheView, vm/runtime, which build a getter per read) are left for a follow-up that needs a view-level accumulator. Also make Collector.Snapshot drain the buffered samples first so a snapshot reflects everything sent so far, and add a -race collector test covering concurrent Send + Snapshot + Close-drain. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…roducers

The exec hot path must never block on metrics and must never lose counts. A buffered channel alone satisfies neither at the boundary: a full buffer blocks a plain send, and a non-blocking send drops. Resolve both with retain-on-full.

- Collector.TrySend is non-blocking and returns whether the sample was queued. Exec workers keep a retained accumulator (collectorAcc): each task folds its reads in and TrySends; on a full buffer the send is skipped and the worker keeps adding to the same accumulator, retrying next task. A single blocking flush at worker exit drains the remainder (off the hot path, lossless). So execution never waits on metrics and no count is dropped.
- Collector.Send is the blocking variant for low-frequency boundary producers (commitment fold, warmup teardown, an RPC request closing) where a rare brief wait is fine and losing the sample is not.
- SharedDomains.LogMergeMetrics folds a task into the per-batch sd.metrics log aggregate only; the exec path uses it each task and feeds the collector separately via the retained accumulator, so the two sinks (which reset on different conditions) never double-count.
- Dropped the drop counter and its gauge, since nothing is dropped now.

Validated: build, make lint (0 issues), go test -race ./db/state/kvmetrics (incl. a test proving TrySend never blocks on a full buffer), and hive cancun with KV_READ_METRICS=true at the pinned ref = 226/0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
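The retain-on-full pattern can be sketched with a buffered channel and a `select` with a `default` branch. This is a simplified model: samples are plain read counts instead of `*DomainMetrics`, and the `collector` and `worker` types with their `finishTask` / `exit` methods are hypothetical names; only `TrySend` and `Send` mirror names from the commit message.

```go
package main

import "fmt"

// collector models the buffered hand-off channel; samples are read counts.
type collector struct {
	ch chan int
}

// TrySend never blocks: it reports whether the sample was queued.
func (c *collector) TrySend(n int) bool {
	select {
	case c.ch <- n:
		return true
	default:
		return false
	}
}

// Send blocks until the sample is queued; meant for low-frequency producers.
func (c *collector) Send(n int) {
	c.ch <- n
}

// worker keeps a retained accumulator so a full buffer loses nothing.
type worker struct {
	acc int
	c   *collector
}

// finishTask folds the task's reads in and attempts a non-blocking hand-off.
// On a full buffer the counts stay in acc and are retried on the next task.
func (w *worker) finishTask(reads int) {
	w.acc += reads
	if w.c.TrySend(w.acc) {
		w.acc = 0
	}
}

// exit performs the single blocking flush of whatever is still retained.
func (w *worker) exit() {
	if w.acc > 0 {
		w.c.Send(w.acc)
		w.acc = 0
	}
}

func main() {
	c := &collector{ch: make(chan int, 1)}
	w := &worker{c: c}

	w.finishTask(3) // queued
	w.finishTask(4) // buffer full, retained
	w.finishTask(5) // still full, retained total grows
	fmt.Println("retained:", w.acc)

	total := <-c.ch // the collector drains one sample
	w.exit()        // blocking flush now succeeds
	total += <-c.ch
	fmt.Println("total reads seen by collector:", total)
}
```

The hot path only ever calls `finishTask`, which cannot block, and the final `exit` flush accounts for every count that was retained along the way.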

sudeepdino008 · 2026-06-08T09:13:50Z

#21675 -- found an issue while running/testing this branch

Ripples the perf-caches-pr main merge (+ #21380 review fixes) up the stack. Resolved 5 conflicts by keeping statecache-lru's newer model where it has evolved past perf-caches-pr:

- BranchCache: kept the txNum/epoch unwind model (Unwind + epoch invalidation) over perf-caches-pr's step/txN/UnwindTo model; ported the PutIfClean peek fix (avoid write-path miss-accounting). Removed the now-dead BranchCache.UnwindTo and converted the residual 5-arg PinEntry call (preload_parallel.go).
- domain_shared.go: kept the lock-free wm metrics path + epoch-stamped branch Put + ClearBranchCache/DetachBranchCache; added the statecfg import for PickTrieVariant.
- generic_cache.go: kept the freelru LRU.
- exec3_parallel.go: kept AsGetterNoMetrics.
- Restored statecache-lru's branch_cache_test.go / trunk_pin_test.go (they test the epoch model; the auto-merge had pulled perf-caches-pr's step/txN tests).

Validated: go build ./...; commitment + execmodule reorg/fork gate + stateCache tests; make lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Ripples the statecache-lru main merge up. Resolved 4 conflicts by combining the metrics-collector additions with the rippled cache/commitment changes:

- domain_shared.go: keep the kvmetrics MetricsCollectorProvider lookup + statecache-lru's new TrieConfig commitment-context ctor (cfg, branchCache).
- aggregator.go: keep the kvmetrics collector init + statecache-lru's oldestVisible; the MeteredGetLatestWithTxN/getLatestWithTxN methods now take *kvmetrics.DomainMetrics (the metrics relocation changeset -> kvmetrics).
- commitment_context.go: Metrics() returns *kvmetrics.DomainMetrics.
- eth_simulation.go: keep StartRequestMetrics(SourceRPC); the defer toggle is now sharedDomains.SetDeferCommitmentUpdates(false) (renamed in the refactor).

Validated: go build ./...; commitment + kvmetrics + execmodule reorg tests; make lint clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…" nickname

"ethHash" collided with Ethash (the proof-of-work algorithm) and obscured that it is the keccak *code* hash. Rename the codeHash-keyed code-cache layer and its API so the name says what it is:

  ethHashToCode -> codeHashToCode
  GetByEthHash -> GetByCodeHash
  PutWithEthHash -> PutWithCodeHash
  codeSizeByEthHash -> codeSizeByCodeHash
  GetCodeSizeByEthHash -> GetCodeSizeByCodeHash
  ethHashCodeSize/Hits/Misses -> codeHash...
  local codeEthHash -> codeHash

The "L2b" tier nickname in comments/labels becomes codeHashToCode (it is a content-addressed codeHash→code map, not a cache depth-level). Values and behaviour are unchanged; this is a mechanical rename. Test file renamed to code_cache_codehash_test.go.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…or (#21663)

Stacked on #21386 (the state-cache PR); this is the metrics-only change.

## What

Replaces the per-task lock-merge of KV-read metrics with a **process-level, channel-fed collector** owned by the Aggregator, so every read path contributes (not just block execution) and the metrics carry a `source` label.

## Why

`changeset.DomainMetrics` was execution-bound: only exec workers and the commitment fold (during exec) produced metrics, folding into a SharedDomains-scoped aggregate under a per-task lock. RPC/engine `AsGetter` reads collected nothing, so "KV read IO" really meant "IO during block execution." This makes IO observable process-wide, lock-free on the aggregate, and broken out by subsystem.

## Design

- **New leaf package `db/state/kvmetrics`**: relocates `DomainIOMetrics`/`DomainMetrics` + ctx helpers out of `db/state/changeset` (they never belonged there) and adds the `Source` enum, the `Collector`, and a shared `LogMetrics(level, source, detail)` formatter. Imports only `kv` + stdlib + the metrics façade → no import cycle.
- **Collector** owned by the Aggregator (process lifetime): `Start` on `a.wg`, `Stop`+drain in `Close`. A single goroutine folds `{source, metrics}` samples into `map[Source]*DomainMetrics` with **no lock/atomics on the aggregate**, and self-publishes `source`-labelled Prometheus gauges (`kv_read_count` / `kv_read_duration_ns`, labels `{source,domain,op}`) on a ticker, additive to the existing `mxExec*` gauges. Reached from SharedDomains via the duck-typed `MetricsCollectorProvider` on `*AggregatorRoTx` (same pattern as `BranchCacheProvider`).
- **Never block, never drop.** Exec workers retain an accumulator and hand it off with a non-blocking `TrySend`; on a full buffer they keep adding and retry next task, with one blocking flush at worker exit. Boundary producers (commitment fold, warmup teardown, RPC request close) use the blocking `Send` (off the hot path, lossless).
- **Sources**: exec, commitment, warmup, and RPC (`eth_simulation`'s `SimulateV1`, via a request-scoped accumulator flushed at Close). Engine block execution is already covered as `exec` (it runs through the exec workers).
- The per-batch **log line** (`sd.metrics`) is kept unchanged via `LogMergeMetrics`.

## Deliberately scoped out (follow-ups)

- **memBatch put-path (`CachePut*`)** stays on the existing shared aggregate: those counters are load-bearing for `SizeEstimate`'s flush accounting, so moving them belongs in a separate change.
- **`SourceEngine` and per-read-getter paths** (`exec_module` `CacheView`, `vm/runtime`) build a getter per read and need a view-level accumulator to meter; left for a follow-up.

## Verification

- `make erigon` / `go build ./...` (the import-cycle gate), `make lint` 0 issues.
- `go test -race ./db/state/kvmetrics`, incl. a test proving `TrySend` never blocks on a full buffer, concurrent Send+Snapshot+Close-drain, and correct grouped folding.
- **hive `ethereum/engine` cancun with `KV_READ_METRICS=true` at the CI-pinned hive ref = 226/0.**

🤖 Generated with [Claude Code](https://claude.com/claude-code)
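To illustrate the "single goroutine folds samples, no lock on the aggregate" part of the design, here is a small sketch. All names (`foldCollector`, `readSample`, the two `Source` constants) are hypothetical, the aggregate is reduced to one counter per source, and the Prometheus publishing and ticker are left out.

```go
package main

import "fmt"

// Source labels the subsystem that produced a sample (hypothetical subset).
type Source string

const (
	SourceExec Source = "exec"
	SourceRPC  Source = "rpc"
)

type readSample struct {
	source Source
	reads  uint64
}

// foldCollector owns the aggregate map. Only the fold goroutine touches
// totals while it runs, so the map needs no mutex or atomics.
type foldCollector struct {
	in     chan readSample
	done   chan struct{}
	totals map[Source]uint64
}

func newFoldCollector(buf int) *foldCollector {
	return &foldCollector{
		in:     make(chan readSample, buf),
		done:   make(chan struct{}),
		totals: make(map[Source]uint64),
	}
}

// Start launches the single folding goroutine.
func (c *foldCollector) Start() {
	go func() {
		defer close(c.done)
		for s := range c.in {
			c.totals[s.source] += s.reads
		}
	}()
}

// Send is the blocking producer call.
func (c *foldCollector) Send(s readSample) { c.in <- s }

// Stop closes the input, waits for the goroutine to drain what is buffered,
// and only then hands the aggregate back to the caller.
func (c *foldCollector) Stop() map[Source]uint64 {
	close(c.in)
	<-c.done
	return c.totals
}

func main() {
	c := newFoldCollector(4)
	c.Start()
	c.Send(readSample{SourceExec, 10})
	c.Send(readSample{SourceRPC, 2})
	c.Send(readSample{SourceExec, 5})
	totals := c.Stop()
	fmt.Println(totals[SourceExec], totals[SourceRPC]) // 15 2
}
```

Ownership of the map transfers through the `done` channel, which is what makes reading `totals` after `Stop` safe without a lock in this sketch.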

…bsystem Remove the contract trunk-pin preload and the adaptive pin controller from the consolidation PR so the BranchCache core (root slot + LRU tail) and the #21138 fix can land independently. The subsystem carries a consensus blocker (immortal txN=0 pins survive UnwindTo yet cache mutable MDBX state -> wrong root on reorg) plus most of the review majors, and needs its own benchmarks and an opt-in flag. Deleted: adaptive_pin.go, preload.go, preload_parallel.go, preload_ranges.go, trunk_pin_metrics.go and their tests. Removed the BranchCache pinned tier (PinEntry/PinnedCount/PinnedStats/TryClaimPreload/MissCallback/onMiss and the ContractHashFromPrefix helper) and the SharedDomains controller wiring (adaptivePinController, triggerTrunkPreload, EnableParaTrieDB preload+Bind). Kept the root slot, LRU tail, and EnableParaTrieDB core. The subsystem is re-added on mh/branch-cache-trunk-pin for re-implementation with watermark txN tagging, same-prefix tail eviction, and an opt-in flag, to land after the state-cache PR (#21386). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Contract storage-trunk pinning + the adaptive pin controller, extracted from the consolidation PR (#21380) to be landed on its own after the state-cache PR (#21386). This branch re-adds the subsystem unchanged on top of the clean BranchCache core as a single additive commit, as the base for re-implementation.

KNOWN BLOCKER to fix before this lands (do NOT ship as-is):

- PinEntry tags pins with txN=0, which UnwindTo treats as immortal, yet the pinned bytes come from mutable MDBX commitment state. A reorg below a pinned block can't evict the pin -> stale branch bytes -> wrong root.

Fix:

- tag pins with the conservative watermark (step+1)*stepSize-1 so UnwindTo evicts a pin exactly when its source data is unwound (file-sourced pins get an ancient watermark and stay effectively immortal -> no perf loss);
- PinEntry must evict any pre-existing same-prefix LRU-tail entry;
- add a pin-then-unwind regression test;
- gate behind an opt-in ENABLE_ADAPTIVE_PIN.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
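The proposed watermark is plain arithmetic and can be checked with a few lines. The sketch below assumes a step covers the txNum range [step*stepSize, (step+1)*stepSize-1]; `pinWatermark`, `evictedByUnwind`, the step size of 1000 and the strict "above the unwind point" comparison are all illustrative assumptions, not taken from the branch.

```go
package main

import "fmt"

// pinWatermark returns (step+1)*stepSize-1, i.e. the last txNum of the step
// under the assumed step layout.
func pinWatermark(step, stepSize uint64) uint64 {
	return (step+1)*stepSize - 1
}

// evictedByUnwind reports whether a pin with this watermark would be dropped
// by an unwind to unwindTo. The strict comparison is an assumption.
func evictedByUnwind(watermark, unwindTo uint64) bool {
	return watermark > unwindTo
}

func main() {
	const stepSize = 1000 // hypothetical step size, for readability only
	wm := pinWatermark(2, stepSize)
	// An unwind into step 2 evicts the pin; an unwind above step 2 keeps it.
	fmt.Println(wm, evictedByUnwind(wm, 2500), evictedByUnwind(wm, 3500)) // 2999 true false
}
```

Because the watermark is the upper bound of the step, it can only over-evict (a pin whose data was written early in the step still goes when the step is partially unwound), which is the conservative direction.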

…st; make state cache an SD detail

The Nethermind-style addr→codeHash LRU is invalidated only at flush, so consulting it before sd.mem returned a codeHash stale relative to an in-batch account write (a 7702 set/clear, a selfdestruct): a non-empty codeHash beside an empty mem-routed code read, which surfaces on re-exec as EIP-3607 "sender not an eoa" (codeHash-no-code). Route codeHashForAddr through sd.mem / parent.mem first; the LRU becomes a committed-state layer that may only answer once mem has missed.

Make the state cache an SD implementation detail so no caller can consult it out of precedence: drop GetStateCache(), fold StateCache.Unwind into sd.Unwind (alongside the BranchCache unwind), and route PrintStatsAndReset through a new nil-safe sd.PrintCacheStats().

Adds TestCodeHashForAddr_InBatchAccountWinsOverStaleLRU (an in-batch account write must override a stale LRU entry; proven to fail on the prior LRU-first precedence).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
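The lookup precedence this commit establishes (in-batch memory first, committed-state LRU second, backing store last) can be sketched as below. The `resolver` type, its three maps and the string-typed `hash` are hypothetical simplifications; only the ordering is taken from the commit message.

```go
package main

import "fmt"

// hash is a readable stand-in for a 32-byte code hash; "" means "no code".
type hash string

// resolver models the three layers consulted for an address's codeHash.
type resolver struct {
	mem       map[string]hash // in-batch account writes (stand-in for sd.mem)
	lru       map[string]hash // committed-state addr -> codeHash layer
	committed map[string]hash // stand-in for the backing store
}

// codeHashForAddr consults mem first, so an in-batch write always wins over
// an LRU entry that has not been refreshed yet.
func (r *resolver) codeHashForAddr(addr string) (hash, bool) {
	if h, ok := r.mem[addr]; ok {
		return h, true
	}
	if h, ok := r.lru[addr]; ok {
		return h, true
	}
	h, ok := r.committed[addr]
	if ok {
		r.lru[addr] = h // populate only from committed state
	}
	return h, ok
}

func main() {
	r := &resolver{
		mem:       map[string]hash{"0xabc": ""},            // delegation cleared in this batch
		lru:       map[string]hash{"0xabc": "old-designator"}, // stale until the next flush
		committed: map[string]hash{"0xdef": "contract"},
	}
	h, _ := r.codeHashForAddr("0xabc")
	fmt.Printf("%q\n", h) // "" because the in-batch clear overrides the stale LRU entry
}
```

With the order reversed (LRU before mem), the same call would return `old-designator`, which is the stale-codeHash case the commit describes.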

…revent poisoning The content-addressed code cache (codeHash->code) was populated using a separately-read account codeHash as the key. Under parallel/speculative exec that hash can be skewed or cross-account, so a (codeHash, code) pair that doesn't satisfy keccak(code)==codeHash could enter the shared map and corrupt every account sharing that codeHash — surfacing as a wrong-forwarder gas divergence, and (once the bad bytes are persisted) codeHash-no-code / "sender not an eoa" on cold re-read. Key every content-cache entry by the code's OWN hash, keccak(code), at the single SharedDomains getter populate path and the flush callback, and bring the read-ahead prefetcher onto the same model (dropping the skewable codeHash hint). Speculative code stays in the version map and never enters the global cache; the cache is populated only on a real read through the shared domain, so a skewed account read can no longer produce a mismatched entry. Validated by a fresh mainnet resync across the former-corruption range (25.27M-25.29M): the previously-deterministic gas divergence / sender-not-an-eoa no longer reproduces, forward or on cold restart re-exec. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ters The sstoreInsert/Update/Delete/Noop and hasStorageMiss package-global atomics were incremented on the commitment/exec hot paths but their getters (SstoreClassificationCounts, HasStorageMissCount) have no consumer anywhere in the stack (#21380, #21386, the perfviz view) — write-only prototype perf-debug scaffolding. The canonical metrics framework (kvmetrics, #21663) is in main and covers KV reads, not these. Remove the counters, their Record* funcs/getters, and the call sites; this also drops execution/state's only import of execution/commitment. If SSTORE classification is wanted in production later, express it via the kvmetrics collector rather than bespoke globals. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…WriteSet normalizeWriteSet recovered an account's CodeHashPath from the versionMap (via the CodeHashPath case and the fill-missing-fields loop) but had no equivalent recovery for CodePath: the CodePath case kept the write only at the validated incarnation, and the fill loop never emitted code. A tx whose validated writeset lacked a fresh CodePath — e.g. an EIP-7702 delegating tx that re-executes, where SetCode short-circuits because so.Code() already returns the designator written by the prior incarnation (bytes.Equal(prevcode, code)) — therefore persisted a non-empty codeHash with no code bytes. A later block then read empty code for the delegated account, and the EIP-3607 sender check wrongly rejected the 7702 sender ("sender not an eoa"). Recover the code this tx wrote from the versionMap (incarnation-agnostic, scoped to this tx so a merely-touched contract's prior-tx code is not re-emitted) whenever an account has a non-empty codeHash but no code in the normalized output, mirroring CodeHashPath. Code can no longer be lost while its hash survives. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The earlier recovery only re-emitted code found in THIS tx's versionMap (rr.Version().TxIndex == txIndex). On the real failure path that guard misses: a re-executing 7702 delegation whose code equals the already-committed designator makes IBS.SetCode short-circuit (bytes.Equal), so the validated incarnation writes no CodePath and the prior incarnation's versionMap entry is invalidated on re-exec, leaving the versionMap with nothing for this tx. The fill-missing loop still fills CodeHashPath from committed state, so the account persists a codeHash with no code; a later 7702 sender then reads empty code and is wrongly rejected "sender not an eoa" (observed re-executing mainnet blocks 25277235 / 25279079 / 25280960).

Recover the designator from the versionMap, else fall back to the post-state via stateReader.ReadAccountCode (mirroring how CodeHashPath is recovered). Gate emission on types.ParseDelegation so only 7702 designators are re-emitted, never ordinary unchanged contract code for a touched contract (no write amplification, no callee-code misattribution).

This prevents the drop during forward execution. It cannot repair state already collated into immutable snapshots with codeHash-but-no-code; that needs a snapshot unwind (separate, in development).

Adds TestNormalizeWriteSet_CodePathRecoveredFromStateReader for the short-circuit/stateReader path; the existing versionMap-path test stays.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…perf-statecache-lru-pr

Reconcile #21386 onto the cleaned commitment-cache model from #21380.

Resolution:

- branch_cache.go (+test), temporal_mem_batch.go, kv_interface.go -> take #21380's cleaned model: reduced BranchCache API, txN-watermark UnwindTo, FlushOption pattern, flush-callback-after-MDBX-write.
- preload*.go, trunk_pin_test.go -> deleted (trunk-pin extracted to mh/branch-cache-trunk-pin).
- domain_shared.go -> combine: keep #21386's StateCache refresh, keccak code-cache fix, codeHash->code read bypass, per-worker kvmetrics (wm) and the collector/reqMetrics fields; adopt #21380's cleaned FlushOption multi-domain callback (cb-after-MDBX-write), MeteredGetterWithTxN watermark + txN=0 skip; drop the extracted adaptive-pin controller + PublishMetrics.

Notable behavioural deltas (flagged for review):

- BranchCache unwind: #21386's epoch/unwindFloor model -> #21380's txN-watermark UnwindTo (same effect: evict entries above the unwind point).
- StateCache flush entries now stamped with sd.txNum (batch high-water) as the unwind watermark: the cleaned WithFlushCallback exposes step, not per-key txN; sd.txNum is a safe conservative upper bound.
- temporal_mem_batch.go DomainMetrics refs retargeted changeset -> kvmetrics.

Carries: keccak codeHash fix, #21706 CodePath recovery, stateCache.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…h callback

Make the commitment BranchCache honor the same (txNum, epoch) unwind model as the state cache GenericCache, and carry per-key txNum (not step) into flush-time cache population. Addresses #21752.

BranchCache:

- Add epoch + unwindFloor; stamp entries with the write txN and the epoch they were written in. Get drops a superseded-epoch entry whose txN is at/above the floor lazily on read (>= floor matches GenericCache). Unwind bumps the epoch and lowers the floor: O(1), no tail scan. Replaces the O(n) UnwindTo iterate-and-evict. Frozen (txN 0) and current-epoch entries always survive.

Flush callback (kv tidy):

- WithFlushCallback / FlushConfig.DomainCallbacks now deliver the value's per-key txNum, not just the step. temporal_mem_batch passes latest.txNum.
- SharedDomains.Flush stamps branch and state cache entries with that per-key txNum, so unwind invalidation is tx-precise (an unwind to a txNum inside the latest step drops exactly the entries above it, not the whole step).

Tests: BranchCache unwind tests rewritten for the epoch model (lazy drop, floor boundary, current-epoch survival, frozen survival).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
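A minimal sketch of the (txNum, epoch) model described above, assuming a plain map in place of the LRU and no concurrency control. `epochCache` and its methods are hypothetical names, and taking the minimum when lowering the floor is an assumption about what "lowers the floor" means.

```go
package main

import (
	"fmt"
	"math"
)

// entry carries the write txN and the epoch it was written in.
type entry struct {
	val   []byte
	txN   uint64
	epoch uint64
}

// epochCache is a single-goroutine illustration; the real caches are LRUs.
type epochCache struct {
	m     map[string]entry
	epoch uint64
	floor uint64 // lowest txN any unwind has reached so far
}

func newEpochCache() *epochCache {
	return &epochCache{m: make(map[string]entry), floor: math.MaxUint64}
}

// Put stamps the entry with its write txN and the current epoch.
func (c *epochCache) Put(k string, v []byte, txN uint64) {
	c.m[k] = entry{val: v, txN: txN, epoch: c.epoch}
}

// Get lazily drops an entry from a superseded epoch whose txN is at or above
// the floor. Frozen entries (txN 0) and current-epoch entries always survive.
func (c *epochCache) Get(k string) ([]byte, bool) {
	e, ok := c.m[k]
	if !ok {
		return nil, false
	}
	if e.txN != 0 && e.epoch < c.epoch && e.txN >= c.floor {
		delete(c.m, k)
		return nil, false
	}
	return e.val, true
}

// Unwind is O(1): bump the epoch and lower the floor, with no scan.
func (c *epochCache) Unwind(toTxN uint64) {
	c.epoch++
	if toTxN < c.floor {
		c.floor = toTxN
	}
}

func main() {
	c := newEpochCache()
	c.Put("below", []byte{1}, 50)
	c.Put("above", []byte{2}, 150)
	c.Put("frozen", []byte{3}, 0)
	c.Unwind(100)
	c.Put("fresh", []byte{4}, 120) // written in the new epoch

	for _, k := range []string{"below", "above", "frozen", "fresh"} {
		_, ok := c.Get(k)
		fmt.Println(k, ok)
	}
}
```

The cost of the unwind moves from the unwind call to later reads: a stale entry is only discovered and dropped when something asks for it.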

A contract's code value is invariant for a given hash, but its EXISTENCE is not: code deployed on a fork that is later unwound must no longer be discoverable, including by codeHash, so it can't be served as live state on the surviving fork. The code cache therefore stops treating its content layers as immutable and honors the same (txNum, epoch) model as the account/storage/branch caches (#21752).

- Every layer (addr→code, addr→codeHash, codeHash→code, maphash→code, size) carries a (txNum, epoch) stamp. Get drops a superseded-epoch entry whose txNum is at/above the unwind floor lazily on read (decrementing the byte counters).
- Unwind bumps the epoch and lowers the floor (O(1), no scan), replacing the wholesale addr-layer Purge (which also nuked the whole warm working set every unwind). Re-deploying the same code on the live fork revives a stranded entry.
- Thread the value's write txNum (per-key on flush, step-derived on read, a conservative sd.txNum upper bound on derived populates) through PutWithCodeHash / PutAddrCodeHash / PutCodeSizeByCodeHash and the StateCache wrappers and call sites.
- Clear now hard-resets every layer (was: kept content as immutable).

Tradeoff: unwinding one deployment can drop code shared with a still-live one, which is then re-fetched (a multiplicity cost); accepted to keep stale code out of the cache.

Tests cover undiscoverable-after-unwind across all layers, below-floor survival, and re-deploy revival.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Byte-match #21380's WithFlushCallback txNum doc so re-syncing this stacked branch onto #21380 attributes the kv plumbing cleanly to #21380 (#21752). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…statecache-lru-pr # Conflicts: # db/state/execctx/domain_shared.go

Port the resident fixed-array trunk onto the #21380+#21386 cache stack: an accountTrunk holding account-trie branches at nibble depths 1-4 in dense arrays (d1[16]/d2[256]/d3[4096]/d4[65536]) indexed directly by the compact-hex prefix, plus a per-contract storageTrunk (depths 0-3 + deep overflow) in a pinned map keyed by account hash. Each slot is an atomic.Pointer, so trunk reads/writes take no mutex and don't serialize through a shared lock the way the LRU tail and a storage trunk's deep overflow map do. Trunk and storage-trunk entries flow through the shared lookup/store/Invalidate walk, so Get's (txN, epoch) staleness check covers them unchanged. Adds PinEntry/PinnedCount and a SetMissCallback seam for the residency/adaptive layer (added separately). BRANCH_CACHE_TRUNK_DISABLE routes depths 1-4 back to the tail for A/B.

…sd.storage The TemporalMemBatch flush-callback loop iterated sd.domains[domain] for every domain, but StorageDomain values live in the separate sd.storage btree, not sd.domains[StorageDomain] (see getLatest/DomainPut). So the StorageDomain flush callback never fired: the stateCache's storage entries were only ever read-populated and never flush-updated. Once a slot was cached, a later write to it was invisible and the cache served the stale value on hit — surfacing under parallel exec as a swap reading a stale reserve, reverting, and producing a gas mismatch. Iterate sd.storage for StorageDomain so its flush callback fires and the cache is refreshed/invalidated on every committed slot change. TestFlush_UpdatesStorageStateCache is a deterministic regression: it writes a slot, flushes, overwrites it, flushes again, and asserts the cache reflects the second write. It fails without this change and passes with it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

… both caches)

Pulls forward the minimal StateCache consistency guarantee from #21386 so this PR is logically consistent across both aggregator-lifetime caches.

The StateCache write path (domainPut/DomainDel) now INVALIDATES the cache entry instead of storing the written value. The value already lives in sd.mem (the write path's local copy), so storing it in the cache both double-stored it and placed an uncommitted value into a long-lived cache, which a failed commit would leave ahead of MDBX (the same poisoning class fixed for the BranchCache). The cache now holds only committed state; reads repopulate from committed files.

Consequently the ClearWithHash-on-invalid-block call is removed: an invalid block (and fork validation, which never commits) only invalidates entries, never stores wrong ones, so there is nothing to clear.

#21386 will, on rebase, add warm post-commit repopulation back under its txNum/epoch model and remove the now-redundant block-hash machinery (ValidateAndPrepare/ClearWithHash). Reorg invalidation (RevertWithDiffset) and the read-through populate are unchanged here.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…onto #21380)

Brings #21380's review-fix commit (f4c82e8: trimmed comments, dropped debug logs) onto #21386, plus the net-zero invalidate-on-write commit + its revert.

Conflict resolutions:

- db/state/execctx/domain_shared.go: reconciled the two commit-gating models into one. flushMem/Flush stay plain (cold-but-correct). Commit now stashes the flush-callback tuples for ALL caches (CommitmentDomain->BranchCache, Accounts/Storage/Code->StateCache) during flush and applies them only after tx.Commit() succeeds, extending #21380's by-construction commit-gating to the StateCache so no cache can be advanced past durable MDBX on a failed commit. Restored ProbeReadLayers (dropped by auto-merge).
- db/state/changeset/state_changeset.go: kept #21386's relocation of the metrics types to db/state/kvmetrics; dropped the stale duplicate DomainIOMetrics block.
- execution/commitment/branch_cache_test.go: kept #21386's Unwind-API test; dropped #21380's UnwindTo-API tests (that API was renamed/superseded in #21386).
- execution/commitment/commitmentdb/commitment_context.go: kept #21386's sd interface additions (ProbeReadLayers, Metrics, kvmetrics import).
- db/state/execctx/flush_storage_cache_test.go: rewrote Flush->Commit since the caches are now commit-gated (Flush no longer warms them).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

mh0lt requested review from AskAlexSharov, sudeepdino008 and yperbasis as code owners May 24, 2026 11:20

mh0lt mentioned this pull request May 24, 2026

Parallel-exec correctness fixes (PR #3 of the perf stack) #21387

Merged

Mark Holt and others added 11 commits May 25, 2026 07:28

mh0lt force-pushed the mh/perf-statecache-lru-pr branch from 266e297 to 4a512ce Compare May 25, 2026 07:29

mh0lt and others added 3 commits May 25, 2026 15:09

Merge remote-tracking branch 'origin/mh/perf-caches-pr' into mh/perf-statecache-lru-pr

b7d89ad

execution/state: drop tautological nil checks in code-loading paths (govet)

9b15362

stateObject and s are both verified non-nil earlier in their respective scopes; the secondary checks at lines 749 and 783 are redundant. govet nilness check fails on these.

This was referenced May 26, 2026

FCU semaphore decouple + foreground-priority bg-commit worker (PR #4 of the perf stack) #21414

Open

BAL-driven parallel commitment (PR #5 of the perf stack) #21416

Open

mh0lt mentioned this pull request Jun 4, 2026

db/state/execctx: fix AccountsDomain codeHash decode (DeserialiseV3) — correctness #21623

Closed

sudeepdino008 mentioned this pull request Jun 5, 2026

db/state: persist domain file cache across rotxs, survive merges #21627

Draft

mh0lt force-pushed the mh/perf-statecache-lru-pr branch from 5053ad8 to ba6c67a Compare June 5, 2026 09:51

mh0lt force-pushed the mh/perf-statecache-lru-pr branch from 355e25f to cf05caf Compare June 5, 2026 17:10

mh0lt and others added 3 commits June 7, 2026 10:24

mh0lt mentioned this pull request Jun 7, 2026

db/state/kvmetrics: process-level channel-fed KV-read metrics collector #21663

Merged

sudeepdino008 mentioned this pull request Jun 8, 2026

State-cache L2b code-by-hash bypass is unsound under parallel (OCC) execution #21675

Closed

mh0lt and others added 3 commits June 8, 2026 10:56

sudeepdino008 mentioned this pull request Jun 10, 2026

[wip/DO NOT MERGE ] db/state: lock-free DomainMetrics + kv_get (atomics) #21722

Draft

yperbasis added this to the 3.6.0 milestone Jun 10, 2026

mh0lt mentioned this pull request Jun 10, 2026

State Cache Consolidation (PR #1 of the perf stack) #21380

Open

mh0lt and others added 2 commits June 10, 2026 23:30

mh0lt and others added 3 commits June 11, 2026 08:24

mh0lt mentioned this pull request Jun 11, 2026

kv/state caches: tidy FlushOption + make cache consistency under unwind (txNum, epoch)-based without full iteration #21752

Open

mh0lt and others added 4 commits June 11, 2026 09:44

kv: align FlushConfig comment with #21380

e43f031

Byte-match #21380's WithFlushCallback txNum doc so re-syncing this stacked branch onto #21380 attributes the kv plumbing cleanly to #21380 (#21752). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Merge remote-tracking branch 'origin/mh/perf-caches-pr' into mh/perf-statecache-lru-pr

31117b1

# Conflicts:
#	db/state/execctx/domain_shared.go

mh0lt wants to merge 44 commits into mh/perf-caches-pr from mh/perf-statecache-lru-pr


mh0lt commented May 24, 2026

Scope — 11 commits cherry-picked from mh/all-stack

One commit deferred

sudeepdino008 commented Jun 5, 2026

Deterministic gas-used mismatch at mainnet chain tip with state cache on (parallel exec)

Symptom

Environment

Control experiment

Notes / hypothesis

sudeepdino008 commented Jun 5, 2026

Root cause of the gas-mismatch / wrong-trie-root failures: L2b bypass breaks DomainPut's prevVal read on EIP-7702 delegations

Bisection

Smoking gun

Mechanism (EIP-7702 lost-write)

Fix directions

mh0lt commented Jun 5, 2026

sudeepdino008 commented Jun 8, 2026

3 participants
