db/state: fix collation/pruning race and unwind visibility in BuildFilesInBackground by yperbasis · Pull Request #20445 · erigontech/erigon

yperbasis · 2026-04-09T13:43:12Z

Summary

Fixes two related bugs in the domain state layer that cause gas mismatches during execution (#20169).

Bug 1: Collation/pruning race

BuildFilesInBackground collates domain files by reading values from the DB via per-worker read-transactions opened at collation time. Between the moment a step is deemed ready and the collation reads, execution can commit new batches that overwrite step S values with step S+1 data. The collated file for step S then contains wrong values. After pruning removes the DB entries, GetLatest returns the stale file values, causing SSTORE gas mispricing (20,000 vs 2,900 = 17,100 gas per affected slot).

Fix: Restructure buildFiles into two phases:

Phase 1 (sequential, single read-tx): Collate all domains and indices using one MDBX read-transaction. All collations see the exact same DB snapshot — zero race window.
Phase 2 (parallel, no DB access): Build files from collations. This is the expensive part and remains fully parallel.

Additionally, add a committed txNum guard: don't collate step S until ComputeCommitment has confirmed all data through the end of step S is flushed.

Bug 2: Unwind entry visibility after filing

After a reorg, the unwind restores domain entries tagged with a step derived from the unwind-target txNum. If BuildFilesInBackground has filed that step, getLatestFromDb discards the restored entry (step covered by files) and falls through to getLatestFromFiles, returning the stale end-of-step value instead of the changeset-restored value.

The fix must use the current (atomically published) file boundary from Aggregator.EndTxNumMinimax(), not the tx-scoped DomainRoTx.FirstStepNotInFiles() — because BuildFilesInBackground may have filed new steps since the tx was opened, and getLatestFromDb will see those via the current view.

Fix: Pass Aggregator.EndTxNumMinimax() into the unwind and tag restored entries with max(naturalStep, currentFilesEndStep).

Changes

db/state/aggregator.go: single-tx collation + parallel file building; committed txNum guard; stepFullyCommitted helper; pass current file boundary to unwind
db/state/domain.go: bump unwind step tag past current filed range
db/state/aggregator_test.go: TestAggregator_CommittedTxNumGuard
db/state/domain_test.go: TestDomain_CollationIsolatedFromLaterSteps, TestDomain_UnwindRestoredEntryVisibility
execution/commitment/commitmentdb/commitment_context.go: export DecodeTxBlockNums; fix minUnwindale typo; short-value length guard

Test plan

make lint passes
TestDomain_CollationIsolatedFromLaterSteps — step S collation ignores step S+1 data
TestDomain_UnwindRestoredEntryVisibility — unwind-restored entries survive getLatestFromDb after filing (fails without fix)
TestAggregator_CommittedTxNumGuard — guard blocks premature file building
Integration: sync mainnet to chain tip without gas mismatches, including after reorgs

🤖 Generated with Claude Code

When a storage slot is zeroed (SSTORE→0), the DB stores an empty-value entry. If snapshot files later advance past that entry's step, getLatestFromDb discarded it (returned found=false) because of the lastTxNumOfStep < files.EndTxNum() guard. The caller then fell through to getLatestFromFiles, which scans files newest-to-oldest. Since snapshot files represent deletions as absent keys (no tombstone), the scan reached an older file containing the pre-deletion value and returned stale data. This caused EIP-2200 SSTORE gas mispricing: GetCommittedState returned a non-zero "original" value for a deleted slot, taking the SSTORE_RESET path (2,900 gas) instead of SSTORE_SET (20,000 gas), producing gasUsed deficits that fail block validation. Fix: treat DB deletion entries (len(v)==0) as authoritative regardless of step age. The DB returns the most-recent entry per key (inverted-step ordering), so an empty value means the key was deleted and no newer write exists. Fixes #20169 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Return v ([]byte{}) instead of nil to preserve the empty-vs-nil distinction. DeleteWithPrev and PutWithPrev(key, []byte{}, ...) are indistinguishable at the DB level — both store empty values. Returning nil broke TestDomain_GetAfterAggregation which writes legitimate empty values and expects to read them back as []byte{}. The critical part of the fix remains: found=true prevents the fallthrough to getLatestFromFiles. For storage gas pricing, []byte{} is correctly treated as zero by ReadAccountStorage (checks len(enc) > 0). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…eletion-entries

…otstrapping Two additional fixes for the gas mismatch issue (#20169): 1. **Preserve deletion entries during domain pruning** (db/state/domain.go): The background file collation (`buildFilesInBackground`) runs concurrently with block execution. When it opens its MDBX read-transaction before the execution's CommitCycle flushes a deletion entry, the resulting domain file silently misses that deletion. Pruning then removes the DB entry because the file "covers" that step — leaving no record of the deletion in either the DB or the files. Subsequent reads fall through to an older file and return a stale pre-deletion value, causing EIP-2200 SSTORE gas mispricing. Fix: route Prune() through oldPrune() and skip deletion entries (empty values) during pruning. Combined with the getLatestFromDb fix from the previous commit, this ensures deletion evidence is never lost regardless of collation/execution timing. 2. **Tolerate domain-ahead-of-blocks during InsertBlocks** (inserters.go): When pre-verified state snapshots extend past the block snapshot boundary, NewSharedDomains returns ErrBehindCommitment. InsertBlocks only needs to accumulate block headers/bodies/TDs in the overlay — it does not need a valid commitment state. Allow it to proceed so the ForkChoice stage can run execution and close the TxNums gap. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The background file builder (buildFilesInBackground) runs concurrently with block execution. Its collation opens a read-transaction that can snapshot the DB before the execution's CommitCycle flushes all writes for a step. The resulting file silently misses those entries. Pruning then removes them from the DB (the file "covers" the step), and the values are lost — causing both stale reads for deleted keys and missing reads for existing keys. See #20169. Fix: the execution layer publishes its last committed txNum via Aggregator.SetLastCommittedTxNum after each CommitCycle flush+commit. The file builder checks this before collating each step: if lastCommittedTxNum < firstTxNum(step+1), the step may have uncommitted data and collation is deferred. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Previously initialCycle switched to false only when a stage-loop iteration completed in under 5 minutes. This meant the switch happened very close to the chain tip, leaving only ~50 blocks with changesets. Any reorg request beyond that depth failed with "bad forkchoice" because the node had no changesets to unwind with. Add an explicit check: if the execution stage has reached within MAX_REORG_DEPTH (default 96) blocks of the header-chain tip, switch initialCycle to false immediately. This ensures at least reorgBlockDepth blocks always have changesets, regardless of iteration timing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Reverts the InsertBlocks change from e1475aa that allowed ErrBehindCommitment to be silently ignored. The bootstrapping issue it addressed is better handled by the initialCycle switch fix in 5d257a0. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

OldPrune() is never called in production code. The deletion-entry preservation for the active prune path is handled via the IsDeletionEntry callback in TableScanningPrune. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Mirror the serial executor fix: when the parallel executor is within MaxReorgDepth of the batch end, switch initialCycle to false so changesets are generated. Without this the parallel executor can reach the tip with no changesets, making reorgs impossible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- TestDomain_DeletionSurvivesPruning: verifies that deletion entries (empty values) in the DB are preserved by the scanning-prune path's IsDeletionEntry callback, even after the step is collated into a file. - TestDomain_NonDeletionEntrySurvivesPruning: verifies that regular (non-empty) values ARE pruned from the DB after collation into a file, ensuring the deletion-preservation logic doesn't block normal pruning. - TestAggregator_LastCommittedTxNumGuard: verifies the guard logic that prevents buildFilesInBackground from collating a step whose data may not be fully committed (lastCommittedTxNum < firstTxNum(step+1)). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Address review feedback (AskAlexSharov): remove `lastCommittedTxNum` state from the Aggregator. Instead, the background file builder reads the latest committed txNum directly from the commitment domain state in the DB via `GetLatest(kv.CommitmentDomain, KeyCommitmentState)`. This avoids adding statefulness to the Aggregator and uses the existing SharedDomains.ComputeCommitment write as the synchronization point. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

This reverts commit c6c5d4a.

…lesInBackground Port of #20445 to release/3.4. Fixes two related bugs causing gas mismatches during execution (#20169). Bug 1: Collation/pruning race — BuildFilesInBackground opened a new read-transaction per domain/index, so execution could commit step S+1 data between reads. The collated file for step S then contained wrong values; after pruning, GetLatest returned stale file values. Fix: restructure buildFiles into two phases — Phase 1 collates all domains and indices using a single MDBX read-transaction; Phase 2 builds files in parallel from the collations (no DB access). Also adds a committedTxNum guard: don't collate step S until ComputeCommitment has confirmed all data through end of step S is flushed. Bug 2: Unwind entry visibility — after a reorg, restored entries were tagged with the natural step, which may already be covered by files. getLatestFromDb discarded those entries and fell through to stale file values. Fix: pass Aggregator.EndTxNumMinimax() into unwind and tag restored entries with max(naturalStep, currentFilesEndStep). Also: export DecodeTxBlockNums; fix minUnwindale typo; short-value guard.

Commit 835dd9b in ethpandaops/ethereum-package introduced GpuConfig, a Starlark built-in that requires Kurtosis CLI >=1.18.1. CI installs an older version so Starlark eval fails. Pin to e07503d (the commit just before the regression) until CI is upgraded to Kurtosis 1.18.1.

Commit 835dd9b on main introduced GpuConfig, a Starlark built-in that requires Kurtosis CLI >=1.18.1. CI installs an older version so Starlark eval fails at startup. Pin to the 6.1.0 release tag which has GLOAS support but not the breaking GpuConfig. Unpin once CI is on 1.18.1.

…tion-entries

yperbasis · 2026-04-16T06:35:12Z

Tracing results: deterministic gas mismatch at block 24,887,755

Setup

Branch fix/getLatestFromDb-deletion-entries (commit 00cee0a5)
NO_MERGE=true, --fcu.background.prune=false — merge and background prune both disabled
Fresh torrent-downloaded snapshots, clean chaindata
Added ad-hoc EVM tracer to dump SLOAD/SSTORE/BALANCE/SELFBALANCE for block 24,887,755

Per-tx gas comparison

tx	canonical gas	our gas	diff
9	720,506	703,406	−17,100
48	372,786	355,686	−17,100
134	415,794	395,894	−19,900
184	691,502	657,302	−34,200
total			−88,300

Root cause

Traced tx[9] (first diverging tx) on the real node. Found the specific corrupt value:

Contract: USDC (0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48)
Slot: 0xdd982ffea2f8bab549c39bd565f71cb62638c12955411757083fea7837ba6304
Canonical value at block 24,887,754: 0x0 (zero — balance was zeroed)
Our node's value: 0x30479e80 (810,106,496 — stale USDC balance)

The stale value was confirmed by SSTORE gas cost: canonical charges 20,000 (SET: slot was zero → non-zero), our node charges 2,900 (RESET: slot was already non-zero).

No earlier tx (0–8) in the block touches this slot. The stale value comes directly from GetLatest reading the domain file.

Why `eth_getStorageAt` shows correct values but execution reads stale ones

eth_getStorageAt(addr, slot, blockNum) uses GetAsOf → reads from history files → returns correct zero
Execution uses GetLatest → reads from domain .kv files → returns stale 0x30479e80
stateRoot comparison passes because it's computed from commitment history, not domain files

This is why all my earlier pre-block comparisons showed "0 mismatches" — they used eth_getStorageAt which reads correct history, while execution uses the corrupt domain files.

Which file is corrupt

The slot was non-zero canonically around block 24,850,000 (step ~8800) and zeroed by block 24,870,000 (step ~8816). The torrent-published domain file v2.0-storage.8800-8816.kv (or adjacent) captured the non-zero value but missed the deletion.

This is a DELETION entry that was lost during the original collation race on the node that built the published torrent files. The file has a stale non-zero value for a slot that should have a zero/deletion entry.

Why the earlier "0 mismatches" verification was wrong

My verification tool compared domain file entries against GetAsOf(endTxNum). Both the history AND the domain file have entries for this key — history has the correct deletion, the domain file has the stale non-zero value. The tool compared and found they both had "an entry" — it didn't detect that the VALUES differ because the domain file entry has a non-zero value while history says it should be zero at that point.

Fix needed

The getLatestFromDb deletion fix (treating DB deletion entries as authoritative, overriding stale file values) IS needed as defense-in-depth against corrupt published torrent files. The single-tx collation fix prevents producing NEW corrupt files, but can't protect against EXISTING corrupt files in the published torrent set.

AskAlexSharov · 2026-04-16T06:40:42Z

Fix needed - do it in separated PR plz. because i plan to merge this one. after #20593

yperbasis · 2026-04-16T07:19:01Z

@AskAlexSharov OK. I'm reverting the unwind fix and keeping only the collation fix. Let's merge the collation fix and do further fixes in separate PRs.

AskAlexSharov · 2026-04-16T07:21:46Z

+	// tx-scoped snapshot. BuildFilesInBackground may have filed new steps
+	// since this AggregatorRoTx was opened, and getLatestFromDb will use
+	// the current view — so the unwind must tag entries beyond it.
+	currentFilesEndStep := at.a.EndTxNumMinimax() / at.StepSize()


AskAlexSharov · 2026-04-16T07:22:02Z

current unwind fix looks ok: https://github.com/erigontech/erigon/pull/20445/changes#r3091413460

addressed

…BuildFilesInBackground (#20594) Port of #20445 to release/3.4. ## Summary Fixes two related bugs in the domain state layer that cause gas mismatches during execution (#20169). ### Bug 1: Collation/pruning race `BuildFilesInBackground` collates domain files by reading values from the DB via per-worker read-transactions opened at collation time. Between the moment a step is deemed ready and the collation reads, execution can commit new batches that overwrite step S values with step S+1 data. The collated file for step S then contains wrong values. After pruning removes the DB entries, `GetLatest` returns the stale file values, causing SSTORE gas mispricing. **Fix:** Restructure `buildFiles` into two phases: 1. **Phase 1 (sequential, single read-tx):** Collate all domains and indices using one MDBX read-transaction. All collations see the exact same DB snapshot — zero race window. 2. **Phase 2 (parallel, no DB access):** Build files from collations. This is the expensive part and remains fully parallel. Additionally, add a committed txNum guard: don't collate step S until `ComputeCommitment` has confirmed all data through the end of step S is flushed. ### Bug 2: Unwind entry visibility after filing After a reorg, the unwind restores domain entries tagged with a step derived from the unwind-target txNum. If `BuildFilesInBackground` has filed that step, `getLatestFromDb` discards the restored entry (step covered by files) and falls through to `getLatestFromFiles`, returning the stale end-of-step value instead of the changeset-restored value. **Fix:** Pass `Aggregator.EndTxNumMinimax()` into the unwind and tag restored entries with `max(naturalStep, currentFilesEndStep)`. ### Changes - `db/state/aggregator.go`: single-tx collation + parallel file building; committed txNum guard; `stepFullyCommitted` helper; pass current file boundary to unwind - `db/state/domain.go`: bump unwind step tag past current filed range - `db/state/aggregator_test.go`: `TestAggregator_CommittedTxNumGuard` - `db/state/domain_test.go`: `TestDomain_CollationIsolatedFromLaterSteps`, `TestDomain_UnwindRestoredEntryVisibility` - `execution/commitment/commitmentdb/commitment_context.go`: export `DecodeTxBlockNums`; fix `minUnwindale` typo; short-value length guard

yperbasis requested review from AskAlexSharov and sudeepdino008 as code owners April 9, 2026 13:43

yperbasis requested review from Giulio2002, awskii, lystopad, mh0lt, mriccobene and taratorio April 9, 2026 13:47

yperbasis and others added 2 commits April 9, 2026 16:07

Merge remote-tracking branch 'origin/main' into fix/getLatestFromDb-d…

714e8fe

…eletion-entries

yperbasis marked this pull request as draft April 9, 2026 20:58

yperbasis force-pushed the fix/getLatestFromDb-deletion-entries branch from 955c24c to e1475aa Compare April 10, 2026 04:22

yperbasis and others added 2 commits April 10, 2026 10:59

yperbasis force-pushed the fix/getLatestFromDb-deletion-entries branch from 326bef9 to 5d257a0 Compare April 10, 2026 09:22

yperbasis changed the title ~~db/state: fix getLatestFromDb discarding deletion entries for old steps~~ db/state: fix domain collation/pruning race causing gas mismatches Apr 10, 2026

AskAlexSharov reviewed Apr 10, 2026

View reviewed changes

Comment thread execution/execmodule/forkchoice.go Outdated

AskAlexSharov reviewed Apr 10, 2026

View reviewed changes

Comment thread execution/execmodule/forkchoice.go Outdated

yperbasis and others added 2 commits April 10, 2026 11:44

This comment was marked as low quality.

Sign in to view

AskAlexSharov reviewed Apr 10, 2026

View reviewed changes

Comment thread db/state/domain_test.go Outdated

yperbasis requested a review from Copilot April 10, 2026 10:08

Copilot started reviewing on behalf of yperbasis April 10, 2026 10:09 View session

This comment was marked as outdated.

Sign in to view

yperbasis and others added 3 commits April 15, 2026 21:49

Revert "ci: pin glamsterdam ethereum-package to last good commit"

e859e2f

This reverts commit c6c5d4a.

Merge branch 'main' into fix/getLatestFromDb-deletion-entries

3cb5cd6

ci: empty commit to trigger CI

c4545ac

AskAlexSharov mentioned this pull request Apr 16, 2026

[r3.4] db/state: fix collation/pruning race and unwind visibility in BuildFilesInBackground #20594

Merged

AskAlexSharov added 2 commits April 16, 2026 09:28

AskAlexSharov mentioned this pull request Apr 16, 2026

db/state: fix collation/pruning race and unwind visibility in BuildFilesInBackground #20596

Closed

AskAlexSharov added 5 commits April 16, 2026 11:03

Merge branch 'asharov/checking-main-ci' into fix/getLatestFromDb-dele…

675ceac

…tion-entries

save

62f9819

save

0245f36

save

7690374

Merge branch 'asharov/checking-main-ci' into fix/getLatestFromDb-dele…

5b936f8

…tion-entries

AskAlexSharov mentioned this pull request Apr 16, 2026

commitment: eliminate BranchMerger from fold→encode→write hot path #20548

Open

Merge branch 'main' into fix/getLatestFromDb-deletion-entries

18ab5e0

AskAlexSharov reviewed Apr 16, 2026

View reviewed changes

yperbasis marked this pull request as ready for review April 16, 2026 07:23

yperbasis enabled auto-merge April 16, 2026 07:25

yperbasis added this pull request to the merge queue Apr 16, 2026

Merged via the queue into main with commit 16ac985 Apr 16, 2026
65 checks passed

yperbasis deleted the fix/getLatestFromDb-deletion-entries branch April 16, 2026 09:16

yperbasis mentioned this pull request Apr 16, 2026

stagedsync: flush unwind changeset before re-execution #20607

Closed

3 tasks

AskAlexSharov mentioned this pull request Apr 24, 2026

[performance] fix issue #20169: StateCache + collation/pruning race fixes #20771

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

db/state: fix collation/pruning race and unwind visibility in BuildFilesInBackground#20445

db/state: fix collation/pruning race and unwind visibility in BuildFilesInBackground#20445
yperbasis merged 58 commits into
mainfrom
fix/getLatestFromDb-deletion-entries

yperbasis commented Apr 9, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

This comment was marked as low quality.

Uh oh!

This comment was marked as outdated.

Uh oh!

yperbasis commented Apr 16, 2026

Uh oh!

AskAlexSharov commented Apr 16, 2026 •

edited

Loading

Uh oh!

yperbasis commented Apr 16, 2026

Uh oh!

AskAlexSharov Apr 16, 2026

Uh oh!

AskAlexSharov commented Apr 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

Conversation

yperbasis commented Apr 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Bug 1: Collation/pruning race

Bug 2: Unwind entry visibility after filing

Changes

Test plan

Uh oh!

Uh oh!

Uh oh!

This comment was marked as low quality.

Uh oh!

This comment was marked as outdated.

Uh oh!

yperbasis commented Apr 16, 2026

Tracing results: deterministic gas mismatch at block 24,887,755

Setup

Per-tx gas comparison

Root cause

Why eth_getStorageAt shows correct values but execution reads stale ones

Which file is corrupt

Why the earlier "0 mismatches" verification was wrong

Fix needed

Uh oh!

AskAlexSharov commented Apr 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yperbasis commented Apr 16, 2026

Uh oh!

AskAlexSharov Apr 16, 2026

Choose a reason for hiding this comment

Uh oh!

AskAlexSharov commented Apr 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

yperbasis commented Apr 9, 2026 •

edited

Loading

Why `eth_getStorageAt` shows correct values but execution reads stale ones

AskAlexSharov commented Apr 16, 2026 •

edited

Loading