recentIterateRange (inverted index): DB query reads data within file range (prunable data) #20358

@sudeepdino008

Parent issue

Part of #16239 — avoiding inconsistency due to lexicographic prune.

Problem

In db/state/inverted_index.go, the recentIterateRange function queries the DB for the full [startTxNum, endTxNum) range whenever the range extends beyond the txNums covered by files:

var from []byte
if startTxNum >= 0 {
    from = make([]byte, 8)
    binary.BigEndian.PutUint64(from, uint64(startTxNum))
}
var to []byte
if endTxNum >= 0 {
    to = make([]byte, 8)
    binary.BigEndian.PutUint64(to, uint64(endTxNum))
}
it, err := roTx.RangeDupSort(iit.ii.ValuesTable, key, from, to, asc, limit)
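
The bound construction above can be isolated in a small helper. The following is a minimal sketch, not code from the repository: encodeTxNumBound is a hypothetical name, and it assumes (as the snippet implies) that a negative txNum leaves the bound nil. Big-endian encoding is used so that byte-wise comparison of the 8-byte values matches numeric order.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeTxNumBound is a hypothetical helper mirroring the snippet above:
// a negative txNum yields a nil bound, otherwise the txNum is encoded as
// 8 big-endian bytes.
func encodeTxNumBound(txNum int) []byte {
	if txNum < 0 {
		return nil
	}
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, uint64(txNum))
	return b
}

func main() {
	fmt.Printf("%x\n", encodeTxNumBound(256)) // 0000000000000100
	fmt.Println(encodeTxNumBound(-1) == nil)  // true
}
```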

The isFrozenRange early return only handles the case where the entire range is covered by files. When the range spans both file and DB ranges (e.g., startTxNum is within files but endTxNum is beyond), the DB is queried from startTxNum, returning txNums already covered by files.

The caller IdxRange unions file results with DB results via stream.Union, which deduplicates, so correctness is maintained today. But the DB read still touches entries that should be prunable: if those entries are pruned in arbitrary order (for example lexicographically rather than by txNum), the DB results for the file range become incomplete. The union with files would still produce correct results, but only because the files happen to cover the gap.
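
To illustrate why the union hides the overlap, here is a simplified model on plain slices. It is not erigon's stream.Union; unionAsc is a hypothetical stand-in that merges two ascending, duplicate-free slices and emits each txNum once. The sample txNums are made up for illustration.

```go
package main

import "fmt"

// unionAsc merges two ascending, individually duplicate-free slices and
// emits each value once. It is a toy stand-in for a deduplicating union.
func unionAsc(a, b []uint64) []uint64 {
	out := make([]uint64, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			out = append(out, a[i])
			i++
		case a[i] > b[j]:
			out = append(out, b[j])
			j++
		default: // same txNum on both sides: emit once
			out = append(out, a[i])
			i++
			j++
		}
	}
	out = append(out, a[i:]...)
	out = append(out, b[j:]...)
	return out
}

func main() {
	fromFiles := []uint64{3, 7, 12}

	// DB still holds 7 and 12, which files already cover.
	fmt.Println(unionAsc(fromFiles, []uint64{7, 12, 20, 25})) // [3 7 12 20 25]

	// Same query after 7 was pruned from the DB: the DB side is now
	// incomplete for the file range, but the union is unchanged.
	fmt.Println(unionAsc(fromFiles, []uint64{12, 20, 25})) // [3 7 12 20 25]
}
```

In both calls the final result is the same, which is why the redundant DB read goes unnoticed today.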

Fix

Adjust the DB query bounds to exclude the file range:

dbStartTxNum := startTxNum
dbEndTxNum := endTxNum
if len(iit.files) > 0 {
    filesEndTxNum := int(iit.files.EndTxNum())
    if asc {
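        // Ascending: from is the lower bound, so raise it to the end of the file range.
        dbStartTxNum = max(startTxNum, filesEndTxNum)
    } else {
        // Descending: to is the lower bound, so raise that instead.
        // A negative endTxNum (no lower bound) is left unchanged here.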
        if dbEndTxNum >= 0 {
            dbEndTxNum = max(endTxNum, filesEndTxNum)
        }
    }
}

Then use dbStartTxNum / dbEndTxNum when building the from / to byte slices.

For ascending: the file iterator handles [startTxNum, files.EndTxNum()) and the DB handles [files.EndTxNum(), endTxNum).
For descending: the file iterator handles [endTxNum, files.EndTxNum()) and the DB handles [files.EndTxNum(), startTxNum].
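
The bound adjustment can be exercised in isolation. The sketch below restates the proposed logic as a pure function so the cases are easy to check by hand. dbQueryBounds is a hypothetical name, and passing filesEndTxNum <= 0 stands in for the "no files" case that the snippet detects with len(iit.files) > 0.

```go
package main

import "fmt"

// dbQueryBounds is a hypothetical, standalone restatement of the proposed
// adjustment. filesEndTxNum <= 0 is used here to mean "no files".
func dbQueryBounds(startTxNum, endTxNum, filesEndTxNum int, asc bool) (dbStart, dbEnd int) {
	dbStart, dbEnd = startTxNum, endTxNum
	if filesEndTxNum <= 0 {
		return dbStart, dbEnd
	}
	if asc {
		// Ascending: raise the lower bound (from) past the file range.
		if filesEndTxNum > dbStart {
			dbStart = filesEndTxNum
		}
		return dbStart, dbEnd
	}
	// Descending: the lower bound is endTxNum; negative means unbounded
	// and is left as-is, matching the snippet above.
	if dbEnd >= 0 && filesEndTxNum > dbEnd {
		dbEnd = filesEndTxNum
	}
	return dbStart, dbEnd
}

func main() {
	fmt.Println(dbQueryBounds(50, 200, 100, true))  // 100 200
	fmt.Println(dbQueryBounds(150, 200, 100, true)) // 150 200
	fmt.Println(dbQueryBounds(200, 50, 100, false)) // 200 100
	fmt.Println(dbQueryBounds(200, -1, 100, false)) // 200 -1
}
```

The last case shows that, in this sketch as in the snippet, a descending query with a negative endTxNum keeps an unbounded lower bound and would therefore still reach into the file range. Whether that case should also be clamped to files.EndTxNum() may be worth checking when implementing the fix.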

Test needed

Add TestInvertedIndex_IdxRange_SkipsFileRange: write inverted index entries across multiple steps, build files, call IdxRange over a range spanning both the file and DB ranges, prune the DB entries within the file range, and verify that the same call still returns correct results.
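
The shape of that test can be sketched against an in-memory model before wiring it to the real InvertedIndex. Everything below is hypothetical (the model type, idxRangeAsc, pruneDB, and the sample txNums); it only covers the ascending case and assumes files cover [0, filesEnd).

```go
package main

import "fmt"

// model is a toy stand-in for an inverted index: files cover txNums in
// [0, filesEnd), the DB may hold any txNums.
type model struct {
	filesEnd uint64
	files    []uint64 // ascending
	db       []uint64 // ascending
}

// inRange returns the values v with from <= v < to.
func inRange(vals []uint64, from, to uint64) []uint64 {
	var out []uint64
	for _, v := range vals {
		if v >= from && v < to {
			out = append(out, v)
		}
	}
	return out
}

// idxRangeAsc reads [start, filesEnd) from files and only
// [max(start, filesEnd), end) from the DB, as the fix proposes.
func (m *model) idxRangeAsc(start, end uint64) []uint64 {
	fileTo := end
	if m.filesEnd < fileTo {
		fileTo = m.filesEnd
	}
	dbFrom := start
	if m.filesEnd > dbFrom {
		dbFrom = m.filesEnd
	}
	out := inRange(m.files, start, fileTo)
	return append(out, inRange(m.db, dbFrom, end)...)
}

// pruneDB removes the given txNums from the DB, in whatever order.
func (m *model) pruneDB(drop ...uint64) {
	kept := m.db[:0]
	for _, v := range m.db {
		dropped := false
		for _, d := range drop {
			if v == d {
				dropped = true
				break
			}
		}
		if !dropped {
			kept = append(kept, v)
		}
	}
	m.db = kept
}

func main() {
	m := &model{
		filesEnd: 100,
		files:    []uint64{10, 40, 90},
		db:       []uint64{10, 40, 90, 120, 150},
	}
	fmt.Println(m.idxRangeAsc(30, 200)) // [40 90 120 150]
	m.pruneDB(40)                       // partial prune inside the file range
	fmt.Println(m.idxRangeAsc(30, 200)) // [40 90 120 150]
}
```

The real test would replace the model with actual writes, file building, and pruning, and should also cover the descending direction.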
