Skip to content

db/state, db/integrity: FilesOnlyStateReader for commitment rebuild and CommitmentRoot#21026

Merged
sudeepdino008 merged 2 commits into
mainfrom
comm_reb_fix
May 7, 2026
Merged

db/state, db/integrity: FilesOnlyStateReader for commitment rebuild and CommitmentRoot#21026
sudeepdino008 merged 2 commits into
mainfrom
comm_reb_fix

Conversation

@sudeepdino008

@sudeepdino008 sudeepdino008 commented May 7, 2026

Copy link
Copy Markdown
Member

LimitedHistoryStateReader -> FilesOnlyStateReader

CommitmentRoot has a subcheck checkCommitmentRootViaRecompute: it touches/replays content of the final block in commitment.kv file and does ComputeCommitment. The resulting root hash should remain unchanged.

RebuildCommitment: it creates shard of size 16 and ultimately merged.

Both need contrainted/limited query of state data (i.e. only return value for key K from before step X)

both suffer from 2 problems today:

  • getLatestFromFiles(maxTxNum): it effectively searches only a single kv file (the one that contains maxTxNum). The files "left and right" of it are ignored.
  • LimitedHistoryStateReader: if getLatestFromFiles(maxTxNum) returns nil, it fallbacks to GetLatest.

This creates problem like:

  • commitment prefix is queried, it's not there in current step range. It's there in some previous file, so it'd return nil (because of getLatestFromFiles logic)
  • say we're building 128-192 range, we look for a storage slot key which became nil in this range; LimitedHistoryStateReader would fallback on GetLatest in this case.

FilesOnlyStateReader: it's essentially LimitedHistoryStateReader, without the fallback on GetLatest.

getLatestFromFiles walkback

the problem is mentioned above. The fix simply walks back from "maxTxNum containing snapshot" to the first snapshot.
The change only impacts when maxTxNum != uint64.max which is used in FilesOnlyStateReader, checkCommitmentRootViaFileData and commitment print command.

checkCommitmentRootViaRecompute doing TouchKey only on accounts domain.

  • checkCommitmentRootViaRecompute — switch from account-domain-only touchHistoricalKeys to sd.TouchChangedKeysFromHistory (accounts + storage; code is covered transitively via account.codeHash). This is the same helper used by rpc/rpchelper/commitment.go and receipts generation.

…ment rebuild and CommitmentRoot

Replace LimitedHistoryStateReader (which fell back to GetLatest on miss and
silently leaked post-limit values) with a strict files-only reader for the
two callers that need a fixed boundary snapshot:

- RebuildCommitmentFiles uses FilesOnlyStateReader(rwTx, lastTxnumInShard-1)
  so commitment.kv is rebuilt only from the matching accounts/storage/code
  .kv files at the shard boundary.
- CheckCommitmentRoot's checkCommitmentRootViaSd uses FilesOnlyStateReader
  at maxTxNum, and checkCommitmentRootViaRecompute switches from an
  account-domain-only touchHistoricalKeys to sd.TouchChangedKeysFromHistory
  (accounts + storage; code is covered transitively via account.codeHash).

getLatestFromFiles walkback is also fixed: the previous bound skipped files
ending before maxTxNum, which broke the walkback semantics and made
files-only reads return nil for any key whose latest write lives in an
older .kv. Now only files starting strictly after maxTxNum are skipped.
@sudeepdino008 sudeepdino008 marked this pull request as draft May 7, 2026 05:22
@sudeepdino008 sudeepdino008 changed the title db/state, db/integrity, commitmentdb: FilesOnlyStateReader for commitment rebuild and CommitmentRoot db/state, db/integrity: FilesOnlyStateReader for commitment rebuild and CommitmentRoot May 7, 2026
@sudeepdino008 sudeepdino008 marked this pull request as ready for review May 7, 2026 07:27
@sudeepdino008 sudeepdino008 enabled auto-merge May 7, 2026 07:28
@sudeepdino008 sudeepdino008 added this pull request to the merge queue May 7, 2026
Merged via the queue into main with commit a81cc87 May 7, 2026
37 checks passed
@sudeepdino008 sudeepdino008 deleted the comm_reb_fix branch May 7, 2026 08:56
AskAlexSharov pushed a commit that referenced this pull request May 7, 2026
Cherry-pick of #20930, #21025, and #21026 onto performance.

---------

Co-authored-by: sudeepdino008 <sudeepdino008@gmail.com>
Sahil-4555 pushed a commit to Sahil-4555/erigon that referenced this pull request May 8, 2026
…h#21029)

## Summary
- Filter `.kv` files before slicing in `CheckCommitmentRoot`, so
`onlyCheckLastFile` picks the latest `.kv` instead of whatever
`aggTx.Files` returned last (which was often the `.ef`).
- Default `CHECK_COMMITMENT_ROOT_ONLY_LAST_FILE` and
`CHECK_COMMITMENT_ROOT_ONLY_LAST_FILE_RECOMPUTE` to `false` so the check
now covers all `.kv` files by default.
- Add a discovery log line and a clearer warning when no `.kv` files are
present.
- with erigontech#21026 -- recompute
check works fine now; and it is fast enough to do for all kv files

## Test plan
- [ ] `make lint`
- [ ] `make erigon integration`
- [ ] Run `erigon seg integrity --check=CommitmentRoot` against a
datadir and confirm all `.kv` files get checked.
awskii added a commit that referenced this pull request May 13, 2026
…build and CommitmentRoot (#21143)

Cherry-pick of #21026 to `release/3.4`.

Conflicts resolved:
- `db/state/squeeze.go` — kept the
`SetStateReader(NewFilesOnlyStateReader(...))` swap; dropped the
unrelated `if concurrent { EnableParaTrieDB }` context lines that don't
exist on `release/3.4`.
- `execution/commitment/commitmentdb/commitment_context.go` — removed
`SetLimitedHistoryStateReader` as in the original PR; did not introduce
`SetCustomHistoryStateReader` since it isn't part of this PR's diff and
isn't present on `release/3.4`.

Co-authored-by: moskud <sudeepdino008@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants