db/state: clip merge windows that straddle existing files (#20909)
Merged
When the natural merge start (endTxNum minus the largest power-of-two
step span) falls strictly inside an existing visible file, bump it up
to that file's endTxNum so the candidate window cannot straddle.
Without this clip, non-power-of-2 step layouts — such as those produced
by a step-size rebase that turns one step into a non-power-of-2 number
of tx-nums — let the algorithm propose merge windows whose `from` lies
inside a pre-existing file. Downstream staticFilesInRange would then
silently drop that file and emit a merged segment whose name lies about
its coverage, leaving an overlap with the surviving straddler.
Unskip partial_overlap_must_not_be_selected (which now passes), and add
step_rebase_swallow covering the headline case where a single-step file
arriving at N=2_048_000 has natural span 16384 / natural start 2031616
that straddles [2016000, 2032000). After the clip the safe window
{2032000, 2048000} cleanly absorbs 15 trailing files.
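The clip described above can be sketched in a few lines of Go. This is only an illustration under stated assumptions, not the code from this PR: `fileRange` and `clipMergeStart` are hypothetical names, visible files are assumed not to overlap each other, and the only file modelled is the straddled one, [2016000, 2032000), taken from the step_rebase_swallow case.

```go
package main

import "fmt"

// fileRange is a hypothetical stand-in for a visible file covering
// the half-open tx-num range [from, to).
type fileRange struct {
	from, to uint64
}

// clipMergeStart returns start unchanged unless it falls strictly inside
// one of the files (from < start < to); in that case it returns that
// file's end, so the candidate window cannot straddle the file.
// Assumption: files do not overlap, so a single pass is enough.
func clipMergeStart(start uint64, files []fileRange) uint64 {
	for _, f := range files {
		if f.from < start && start < f.to {
			return f.to
		}
	}
	return start
}

func main() {
	const endTxNum = 2_048_000
	const naturalSpan = 16_384 // largest power-of-two span, per the description

	// Only the straddled file is modelled here.
	files := []fileRange{{from: 2_016_000, to: 2_032_000}}

	natural := uint64(endTxNum - naturalSpan)
	clipped := clipMergeStart(natural, files)

	fmt.Println(natural, clipped) // prints "2031616 2032000"
}
```

A start that already sits on a file boundary (for example 2016000 or 2032000) is left untouched, because the check is strict on both sides.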
Closed
This was referenced May 4, 2026
AskAlexSharov approved these changes May 4, 2026
sudeepdino008 pushed a commit that referenced this pull request May 4, 2026
Cherry-pick of #20909 (by @wmitsuda), adapted for this branch. Fixes #20878. When the natural merge start (endTxNum minus the largest power-of-two step span) falls strictly inside an existing visible file, `clipMergeStartToFileBoundary` bumps it up to that file's endTxNum. Without this clip, non-power-of-2 step layouts (e.g. after a step-size rebase) let the algorithm propose windows whose `from` lies inside a pre-existing file — `staticFilesInRange` then silently drops that file and emits a merged segment that lies about its coverage. Applied to `DomainRoTx`, `HistoryRoTx`, and `InvertedIndexRoTx` `findMergeRange` (this branch has inline merge logic vs. the extracted `findMergeRangeInFiles` helper on main). Also: wrap walk errors with path context; guard `collectTorrentFiles` against a missing torrent dir.
possible solution to fix #20878