Skip to content

db/state: skip DB entries within file range in DomainLatestIterFile#20674

Merged
AskAlexSharov merged 5 commits into
mainfrom
fix/domain-latest-iter-skip-file-range
Apr 26, 2026
Merged

db/state: skip DB entries within file range in DomainLatestIterFile#20674
AskAlexSharov merged 5 commits into
mainfrom
fix/domain-latest-iter-skip-file-range

Conversation

@domiwei

@domiwei domiwei commented Apr 20, 2026

Copy link
Copy Markdown
Member

Summary

  • DomainLatestIterFile.initCursorMDBX() and advanceInFiles() now skip DB entries whose step falls within the file range, since files are authoritative for those entries
  • Applies the same "skip stale DB entries" pattern already merged in db/state: skip DB entries within file range in debugIteratePrefixLatest #20444 (debugIteratePrefixLatest) to DomainLatestIterFile
  • Improves clarity in the DupSort path (val := value[8:] instead of reassigning value directly)

Fixes #20474
Part of #16239

Context

After a partial lexicographic prune, the DB may retain stale entries with steps already covered by snapshot files. DomainLatestIterFile pushed these into the priority heap without checking, relying on a comment that said "DB entries will be ahead of files" — which is incorrect for prunable data.

The fix adds a filesEndTxNum field (set from domainRoTx.files.EndTxNum()) and checks endTxNum >= filesEndTxNum before pushing. Entries within the file range are skipped, reducing unnecessary heap pushes and DB reads. All 4 affected locations (LargeValues + DupSort × initCursorMDBX + advanceInFiles) are fixed.

Note: DomainLatestIterFile already derived endTxNum from the step, so the heap comparator was already picking file entries over stale DB entries at the same key. The primary value of this fix is consistency with the #16239 family and fewer unnecessary heap pushes / DB reads.

Test plan

  • New test TestDomainLatestIterFile_PrefersFilesOverDB — writes at two steps, builds files, deletes latest step from DB leaving stale entry, verifies DebugRangeLatest returns file value (DupSort path via AccountsDomain)
  • New test TestDomainLatestIterFile_PrefersFilesOverDB_LargeValues — same scenario using CodeDomain to exercise LargeValues branches in both initCursorMDBX and advanceInFiles
  • go build ./db/state/...
  • go test ./db/state/... -short
  • make lint (run twice, clean)

🤖 Generated with Claude Code

DomainLatestIterFile.initCursorMDBX() and advanceInFiles() push DB
entries into the priority heap without checking whether their step
falls within the file range. After a partial lexicographic prune, the
DB may retain stale entries with steps already covered by files. These
stale entries should be skipped since files are authoritative there.

- Add filesEndTxNum field to DomainLatestIterFile, set from
  domainRoTx.files.EndTxNum() during init()
- In initCursorMDBX (both LargeValues and DupSort paths): change if
  to for loop, skip entries where endTxNum < filesEndTxNum
- In advanceInFiles (both LargeValues and DupSort paths): wrap cursor
  advance in outer loop to skip entries within file range
- Fix DupSort variable shadowing: use val := value[8:] instead of
  mutating value directly (would break in loop)

Fixes #20474

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@sudeepdino008

Copy link
Copy Markdown
Member

Reviewed. Semantics are correct across all 4 locations — I ran the PR's test plus a CodeDomain (LargeValues) variant locally, and the full db/state suite is green.

A few nits, non-blocking:

1. Missing LargeValues test coverage. The PR only exercises AccountsDomain (DupSort). Worth adding a CodeDomain (LargeValues: true) variant so both initCursorMDBX branches and both advanceInFiles DB_CURSOR branches are actually touched by tests. A copy of the existing test with statecfg.Schema.CodeDomain + tx.Delete(d.ValuesTable, key||^step1) (instead of DeleteExact on the dup) works.

2. initCursorMDBX LargeValues loop is step-by-step instead of userKey-by-userKey. For a userKey with N steps all within the file range, it does N step-extractions via single Next() calls. Not incorrect, since every intermediate step is also within range, but it's redundant. A prefix-scan would avoid the per-step computation:

if endTxNum < hi.filesEndTxNum {
    prev := common.Copy(k)
    for {
        key, value, err = valsCursor.Next()
        if err != nil { return err }
        if len(key) < 8 || !bytes.Equal(key[:len(key)-8], prev) { break }
    }
    continue
}

3. advanceInFiles LargeValues structure. Wrapping Current() + the inner advance loop inside the outer skip-loop works but is harder to follow than separating "advance past the just-popped userKey" (once, up front) from "keep skipping within-range userKeys" (the loop). Each outer iter effectively advances past one userKey. Style only.

4. len(key) >= 8 defense. key != nil is checked but key[:len(key)-8] isn't guarded. Realistically impossible with well-formed data, but len(key) > 8 is cheap insurance in the LargeValues paths.

5. Description overstates the "shadowing fix". value = value[8:] was reassignment, not shadowing, and wasn't a bug — common.Copy(value) copied correctly from the shifted slice. The val := value[8:] change is a clarity win (especially inside the new loop), just not a bugfix.

One broader note

Unlike debugIteratePrefixLatest (where the original bug was endTxNum: math.MaxUint64, making stale DB entries win outright), DomainLatestIterFile already derived endTxNum from the step — so the heap comparator was already picking file entries over stale DB entries at the same key. The tests therefore pass with or without the fix on its own. The fix's real value is consistency with the #16239 family and fewer heap pushes / DB reads, not a user-visible bug — worth keeping in mind when phrasing the PR as a correctness fix.

LGTM overall.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates DomainLatestIterFile (used by DebugRangeLatest) to avoid returning stale MDBX entries whose steps are already covered by snapshot files, aligning it with the “files are authoritative” behavior needed after partial/lexicographic pruning.

Changes:

  • Track filesEndTxNum and skip MDBX cursor entries whose step falls within the file-covered tx range (in both init and advance paths, for LargeValues and DupSort).
  • Fix DupSort value handling by avoiding mutation of the value slice (val := value[8:]).
  • Add TestDomainLatestIterFile_PrefersFilesOverDB to reproduce and prevent the stale-DB-over-files regression.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
db/state/domain_stream.go Adds file-range awareness to DomainLatestIterFile’s MDBX cursor initialization/advancement so stale DB entries covered by files are skipped.
db/state/domain_stream_test.go Adds a regression test ensuring DebugRangeLatest prefers file values over stale DB entries after partial prune.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread db/state/domain_stream.go
Comment thread db/state/domain_stream.go
Comment thread db/state/domain_stream_test.go
…ard key length

- Fix cursor leak in initCursorMDBX: close valsCursor when all DB entries
  are skipped (within file range) and nothing is pushed to the heap.
  Applies to both LargeValues and DupSort paths.
- Add defer iter.Close() in TestDomainLatestIterFile_PrefersFilesOverDB.
- Add TestDomainLatestIterFile_PrefersFilesOverDB_LargeValues using
  CodeDomain to exercise the LargeValues branches in initCursorMDBX
  and advanceInFiles.
- Add len(key) > 8 guards in LargeValues paths to prevent panics on
  malformed keys shorter than the 8-byte step suffix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@domiwei

domiwei commented Apr 23, 2026

Copy link
Copy Markdown
Member Author

Thanks for the thorough review!

Addressed in fc77573:

  1. LargeValues test coverage — added TestDomainLatestIterFile_PrefersFilesOverDB_LargeValues using CodeDomain to exercise both initCursorMDBX and advanceInFiles LargeValues branches.
  2. len(key) > 8 defense — added guards at 3 locations in the LargeValues paths (initCursorMDBX loop condition, advanceInFiles inner loop, and changed len(k) == 0len(k) <= 8 in the break condition).
  3. Cursor leaks — added pushed bool pattern to both initCursorMDBX paths (LargeValues + DupSort) to close valsCursor when nothing is pushed to the heap.
  4. defer iter.Close() in test.
  5. PR description — reworded "shadowing fix" → "clarity improvement" and added a note that the primary value is consistency with db property: if data in files, don't read it from db #16239 + fewer heap pushes, not a user-visible correctness fix.

Skipped nits 2 (prefix-scan efficiency) and 3 (advanceInFiles structure) — both non-blocking, and the refactoring risk outweighs the marginal benefit in these paths.

@sudeepdino008

Copy link
Copy Markdown
Member

please cp to release when merged

@AskAlexSharov AskAlexSharov added this pull request to the merge queue Apr 26, 2026
Merged via the queue into main with commit 4b03d26 Apr 26, 2026
37 checks passed
@AskAlexSharov AskAlexSharov deleted the fix/domain-latest-iter-skip-file-range branch April 26, 2026 03:46
Sahil-4555 pushed a commit to Sahil-4555/erigon that referenced this pull request Apr 27, 2026
…rigontech#20674)

## Summary
- `DomainLatestIterFile.initCursorMDBX()` and `advanceInFiles()` now
skip DB entries whose step falls within the file range, since files are
authoritative for those entries
- Applies the same "skip stale DB entries" pattern already merged in
erigontech#20444 (`debugIteratePrefixLatest`) to `DomainLatestIterFile`
- Improves clarity in the DupSort path (`val := value[8:]` instead of
reassigning `value` directly)

Fixes erigontech#20474
Part of erigontech#16239

## Context

After a partial lexicographic prune, the DB may retain stale entries
with steps already covered by snapshot files. `DomainLatestIterFile`
pushed these into the priority heap without checking, relying on a
comment that said "DB entries will be ahead of files" — which is
incorrect for prunable data.

The fix adds a `filesEndTxNum` field (set from
`domainRoTx.files.EndTxNum()`) and checks `endTxNum >= filesEndTxNum`
before pushing. Entries within the file range are skipped, reducing
unnecessary heap pushes and DB reads. All 4 affected locations
(LargeValues + DupSort × initCursorMDBX + advanceInFiles) are fixed.

Note: `DomainLatestIterFile` already derived `endTxNum` from the step,
so the heap comparator was already picking file entries over stale DB
entries at the same key. The primary value of this fix is consistency
with the erigontech#16239 family and fewer unnecessary heap pushes / DB reads.

## Test plan
- [x] New test `TestDomainLatestIterFile_PrefersFilesOverDB` — writes at
two steps, builds files, deletes latest step from DB leaving stale
entry, verifies `DebugRangeLatest` returns file value (DupSort path via
AccountsDomain)
- [x] New test `TestDomainLatestIterFile_PrefersFilesOverDB_LargeValues`
— same scenario using CodeDomain to exercise LargeValues branches in
both `initCursorMDBX` and `advanceInFiles`
- [x] `go build ./db/state/...`
- [x] `go test ./db/state/... -short`
- [x] `make lint` (run twice, clean)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DomainLatestIterFile: skip DB entries within file range

4 participants