sql/bulkmerge: reuse SST iterator across bulk merge tasks #160632
Merged
craig[bot] merged 1 commit into cockroachdb:master Jan 10, 2026
This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.
Informs cockroachdb#159414
Epic: CRDB-48845
Release note: none
Co-authored by: @jeffswenson
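The reuse pattern described above can be sketched roughly as follows. This is an illustrative toy, not the bulk merge processor's actual code: `mergedIter`, `processor`, `task`, and `runTask` are hypothetical names, and sorted in-memory slices stand in for SSTs. It assumes each task covers a half-open key span `[start, end)` with `start <= end`.

```go
package main

import "sort"

// kv is a key/value pair; each toy "SST" is a slice of kvs sorted by key.
type kv struct{ key, val string }

// mergedIter is a hypothetical stand-in for one iterator over all input SSTs.
type mergedIter struct {
	data  []kv // every input merged into a single key-ordered sequence
	idx   int  // current position
	seeks int  // number of SeekGE calls, kept only to show the savings
}

// newMergedIter builds the iterator once, up front, over all inputs.
func newMergedIter(ssts ...[]kv) *mergedIter {
	var all []kv
	for _, s := range ssts {
		all = append(all, s...)
	}
	sort.SliceStable(all, func(i, j int) bool { return all[i].key < all[j].key })
	return &mergedIter{data: all}
}

// SeekGE positions the iterator at the first key >= key.
func (it *mergedIter) SeekGE(key string) {
	it.seeks++
	it.idx = sort.Search(len(it.data), func(i int) bool { return it.data[i].key >= key })
}

func (it *mergedIter) Valid() bool { return it.idx < len(it.data) }
func (it *mergedIter) Cur() kv     { return it.data[it.idx] }
func (it *mergedIter) Next()       { it.idx++ }

// task is a half-open key span [start, end).
type task struct{ start, end string }

// processor owns one iterator for its whole lifetime.
type processor struct {
	iter       *mergedIter
	positioned bool   // false until the first task has run
	bound      string // when positioned, iter sits at the first key >= bound
}

// runTask returns the entries in [t.start, t.end). It seeks only when the
// iterator is not already sitting at the first key >= t.start.
func (p *processor) runTask(t task) []kv {
	if !p.positioned || p.bound != t.start {
		p.iter.SeekGE(t.start)
	}
	var out []kv
	for ; p.iter.Valid() && p.iter.Cur().key < t.end; p.iter.Next() {
		out = append(out, p.iter.Cur())
	}
	// The loop stopped at the first key >= t.end, so record that bound.
	p.positioned, p.bound = true, t.end
	return out
}
```

In this sketch, tasks that arrive in key order with contiguous spans (one task's `end` equal to the next task's `start`) trigger a single seek for the whole run. A task that starts anywhere else falls back to an explicit `SeekGE`.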
Member
mw5h approved these changes Jan 9, 2026
Contributor
mw5h left a comment
@mw5h reviewed 2 files and all commit messages, and made 1 comment.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @jeffswenson).
Contributor
Author
TFTR!
bors r+
craig bot pushed a commit that referenced this pull request Jan 9, 2026
160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen
This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.
Informs #159414
Epic: CRDB-48845
Release note: none
Co-authored by: `@jeffswenson`
160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich
Fixes: #160752.
Fixes: #160753.
Release note: None
160764: sql/copy: fix rare flake in TestLargeCopy r=yuzefovich a=yuzefovich
We have automatic retry mechanism for COPY but it can only be used for non-atomic COPY. If we have the atomic COPY and hit a txn retry error, it's bubbled up to the client. We now adjust `TestLargeCopy` to match this behavior fixing a rare flake where we'd fail the test on the txn retry error when we should've ignored it.
Fixes: #160537.
Release note: None
Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Contributor
Build failed (retrying...):
craig bot pushed a commit that referenced this pull request Jan 10, 2026
160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen
This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.
Informs #159414
Epic: CRDB-48845
Release note: none
Co-authored by: `@jeffswenson`
Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Contributor
Build failed:
Member
courtesy merge
bors retry
craig bot pushed a commit that referenced this pull request Jan 10, 2026
160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen
This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.
Informs #159414
Epic: CRDB-48845
Release note: none
Co-authored by: `@jeffswenson`
Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Contributor
Build failed:
Member
courtesy merge 2
bors retry
craig bot pushed a commit that referenced this pull request Jan 10, 2026
160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen
This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.
Informs #159414
Epic: CRDB-48845
Release note: none
Co-authored by: `@jeffswenson`
Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Contributor
Build failed:
Member
courtesy merge 3
bors retry
craig bot pushed a commit that referenced this pull request Jan 10, 2026
160580: opt: fix PruneUnionAllCols panic with outer scope columns r=michae2 a=DrewKimball
Before this change, the PruneUnionAllCols normalization rule would panic in crdb-test builds when the projection above a UnionAll referenced columns from an outer scope (e.g., due to an apply-join or routine).
This occurred because the rule computed the needed column set by combining ProjectionOuterCols and passthrough columns, which could include outer scope columns not present in the UnionAll's output. These outer columns were then passed to NeededColMapLeft/Right, which call TranslateColSetStrict and panic when given unknown columns.
This change fixes the issue by intersecting the needed column set with the UnionAll's output columns before passing it to NeededColMapLeft/Right. This ensures only columns actually present in the UnionAll are translated, preventing the panic.
Fixes #159793
Release note: None
Co-Authored-By: Claude <noreply@anthropic.com>
160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen
This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.
Informs #159414
Epic: CRDB-48845
Release note: none
Co-authored by: `@jeffswenson`
Co-authored-by: Drew Kimball <drewk@cockroachlabs.com>
Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Contributor
Build failed (retrying...):
craig bot pushed a commit that referenced this pull request Jan 10, 2026
158029: colfetcher: emit periodic query progress update metadata r=yuzefovich a=yuzefovich
This commit extends the query progress reporting that we do in the row-by-row tableReader to the vectorized scan operators too. Namely, after about 20k rows have been output, we'll emit the RowsRead metadata that we then use in DistSQLReceiver to update progressAtomic. Then the result shows up in `phase` column of SHOW QUERIES.
Fixes: #26639.
Release note (sql change): Queries executed via the vectorized engine now display their progress in `phase` column of SHOW QUERIES. Previously, this feature was only available in the row-by-row engine.
160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen
This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.
Informs #159414
Epic: CRDB-48845
Release note: none
Co-authored by: `@jeffswenson`
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Contributor
Build failed (retrying...):
craig bot pushed a commit that referenced this pull request Jan 10, 2026
160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen
This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.
Informs #159414
Epic: CRDB-48845
Release note: none
Co-authored by: `@jeffswenson`
160842: schemachange: fix recent flake in TestWorkload r=yuzefovich a=yuzefovich
Recently merged in ad868ab extension to the test is flaky - fix up a couple of minor bugs.
Fixes: #160814.
Release note: None
Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Contributor
Build succeeded: