db/state: skip DB entries within file range in DomainLatestIterFile#20674
Conversation
DomainLatestIterFile.initCursorMDBX() and advanceInFiles() push DB entries into the priority heap without checking whether their step falls within the file range. After a partial lexicographic prune, the DB may retain stale entries with steps already covered by files. These stale entries should be skipped since files are authoritative there. - Add filesEndTxNum field to DomainLatestIterFile, set from domainRoTx.files.EndTxNum() during init() - In initCursorMDBX (both LargeValues and DupSort paths): change if to for loop, skip entries where endTxNum < filesEndTxNum - In advanceInFiles (both LargeValues and DupSort paths): wrap cursor advance in outer loop to skip entries within file range - Fix DupSort variable shadowing: use val := value[8:] instead of mutating value directly (would break in loop) Fixes #20474 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Reviewed. Semantics are correct across all 4 locations — I ran the PR's test plus a A few nits, non-blocking: 1. Missing LargeValues test coverage. The PR only exercises 2. if endTxNum < hi.filesEndTxNum {
prev := common.Copy(k)
for {
key, value, err = valsCursor.Next()
if err != nil { return err }
if len(key) < 8 || !bytes.Equal(key[:len(key)-8], prev) { break }
}
continue
}3. 4. 5. Description overstates the "shadowing fix". One broader noteUnlike LGTM overall. |
There was a problem hiding this comment.
Pull request overview
Updates DomainLatestIterFile (used by DebugRangeLatest) to avoid returning stale MDBX entries whose steps are already covered by snapshot files, aligning it with the “files are authoritative” behavior needed after partial/lexicographic pruning.
Changes:
- Track
filesEndTxNumand skip MDBX cursor entries whose step falls within the file-covered tx range (in both init and advance paths, for LargeValues and DupSort). - Fix DupSort value handling by avoiding mutation of the
valueslice (val := value[8:]). - Add
TestDomainLatestIterFile_PrefersFilesOverDBto reproduce and prevent the stale-DB-over-files regression.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
db/state/domain_stream.go |
Adds file-range awareness to DomainLatestIterFile’s MDBX cursor initialization/advancement so stale DB entries covered by files are skipped. |
db/state/domain_stream_test.go |
Adds a regression test ensuring DebugRangeLatest prefers file values over stale DB entries after partial prune. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ard key length - Fix cursor leak in initCursorMDBX: close valsCursor when all DB entries are skipped (within file range) and nothing is pushed to the heap. Applies to both LargeValues and DupSort paths. - Add defer iter.Close() in TestDomainLatestIterFile_PrefersFilesOverDB. - Add TestDomainLatestIterFile_PrefersFilesOverDB_LargeValues using CodeDomain to exercise the LargeValues branches in initCursorMDBX and advanceInFiles. - Add len(key) > 8 guards in LargeValues paths to prevent panics on malformed keys shorter than the 8-byte step suffix. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Thanks for the thorough review! Addressed in fc77573:
Skipped nits 2 (prefix-scan efficiency) and 3 (advanceInFiles structure) — both non-blocking, and the refactoring risk outweighs the marginal benefit in these paths. |
|
please cp to release when merged |
…rigontech#20674) ## Summary - `DomainLatestIterFile.initCursorMDBX()` and `advanceInFiles()` now skip DB entries whose step falls within the file range, since files are authoritative for those entries - Applies the same "skip stale DB entries" pattern already merged in erigontech#20444 (`debugIteratePrefixLatest`) to `DomainLatestIterFile` - Improves clarity in the DupSort path (`val := value[8:]` instead of reassigning `value` directly) Fixes erigontech#20474 Part of erigontech#16239 ## Context After a partial lexicographic prune, the DB may retain stale entries with steps already covered by snapshot files. `DomainLatestIterFile` pushed these into the priority heap without checking, relying on a comment that said "DB entries will be ahead of files" — which is incorrect for prunable data. The fix adds a `filesEndTxNum` field (set from `domainRoTx.files.EndTxNum()`) and checks `endTxNum >= filesEndTxNum` before pushing. Entries within the file range are skipped, reducing unnecessary heap pushes and DB reads. All 4 affected locations (LargeValues + DupSort × initCursorMDBX + advanceInFiles) are fixed. Note: `DomainLatestIterFile` already derived `endTxNum` from the step, so the heap comparator was already picking file entries over stale DB entries at the same key. The primary value of this fix is consistency with the erigontech#16239 family and fewer unnecessary heap pushes / DB reads. ## Test plan - [x] New test `TestDomainLatestIterFile_PrefersFilesOverDB` — writes at two steps, builds files, deletes latest step from DB leaving stale entry, verifies `DebugRangeLatest` returns file value (DupSort path via AccountsDomain) - [x] New test `TestDomainLatestIterFile_PrefersFilesOverDB_LargeValues` — same scenario using CodeDomain to exercise LargeValues branches in both `initCursorMDBX` and `advanceInFiles` - [x] `go build ./db/state/...` - [x] `go test ./db/state/... -short` - [x] `make lint` (run twice, clean) 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com>
Summary
DomainLatestIterFile.initCursorMDBX()andadvanceInFiles()now skip DB entries whose step falls within the file range, since files are authoritative for those entriesdebugIteratePrefixLatest) toDomainLatestIterFileval := value[8:]instead of reassigningvaluedirectly)Fixes #20474
Part of #16239
Context
After a partial lexicographic prune, the DB may retain stale entries with steps already covered by snapshot files.
DomainLatestIterFilepushed these into the priority heap without checking, relying on a comment that said "DB entries will be ahead of files" — which is incorrect for prunable data.The fix adds a
filesEndTxNumfield (set fromdomainRoTx.files.EndTxNum()) and checksendTxNum >= filesEndTxNumbefore pushing. Entries within the file range are skipped, reducing unnecessary heap pushes and DB reads. All 4 affected locations (LargeValues + DupSort × initCursorMDBX + advanceInFiles) are fixed.Note:
DomainLatestIterFilealready derivedendTxNumfrom the step, so the heap comparator was already picking file entries over stale DB entries at the same key. The primary value of this fix is consistency with the #16239 family and fewer unnecessary heap pushes / DB reads.Test plan
TestDomainLatestIterFile_PrefersFilesOverDB— writes at two steps, builds files, deletes latest step from DB leaving stale entry, verifiesDebugRangeLatestreturns file value (DupSort path via AccountsDomain)TestDomainLatestIterFile_PrefersFilesOverDB_LargeValues— same scenario using CodeDomain to exercise LargeValues branches in bothinitCursorMDBXandadvanceInFilesgo build ./db/state/...go test ./db/state/... -shortmake lint(run twice, clean)🤖 Generated with Claude Code