execbuilder: fix a stats-related flake in a new test by yuzefovich · Pull Request #160760 · cockroachdb/cockroach

yuzefovich · 2026-01-09T11:10:58Z

Fixes: #160752.
Fixes: #160753.
Release note: None


cockroach-teamcity · 2026-01-09T11:11:10Z


mgartner

@mgartner reviewed 1 file and all commit messages, and made 1 comment.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @michae2).

yuzefovich · 2026-01-09T21:08:17Z

TFTR!

bors r+

yuzefovich · 2026-01-09T22:54:55Z

bors r+

michae2

Thanks!

160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen

This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.

Informs #159414
Epic: CRDB-48845
Release note: none

Co-authored by: `@jeffswenson`

160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich

Fixes: #160752.
Fixes: #160753.
Release note: None

160764: sql/copy: fix rare flake in TestLargeCopy r=yuzefovich a=yuzefovich

We have an automatic retry mechanism for COPY, but it can only be used for non-atomic COPY. If we have an atomic COPY and hit a txn retry error, the error is bubbled up to the client. We now adjust `TestLargeCopy` to match this behavior, fixing a rare flake where we'd fail the test on the txn retry error when we should've ignored it.

Fixes: #160537.
Release note: None

Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>

craig · 2026-01-10T00:02:02Z

Build failed (retrying...):

local_roachtest

160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich

Fixes: #160752.
Fixes: #160753.
Release note: None

160764: sql/copy: fix rare flake in TestLargeCopy r=yuzefovich a=yuzefovich

We have an automatic retry mechanism for COPY, but it can only be used for non-atomic COPY. If we have an atomic COPY and hit a txn retry error, the error is bubbled up to the client. We now adjust `TestLargeCopy` to match this behavior, fixing a rare flake where we'd fail the test on the txn retry error when we should've ignored it.

Fixes: #160537.
Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>

craig · 2026-01-10T00:32:21Z

Build failed (retrying...):

unit_tests

157869: colexecop: modify Operator.Next to return metadata r=yuzefovich a=yuzefovich

This commit changes the `Operator.Next` interface to add the ability to return metadata in a "streaming" fashion, similar to how it's done in the row-by-row engine. This will allow us to support features like estimated query progress (based on the number of rows scanned so far). Note that we still preserve the concept of MetadataSources, but it'll become similar to the "trailing metadata" concept in the row-by-row engine, i.e. it'll only handle metadata that is produced when the component is being drained.

The following operators are affected and have non-trivial state transitions:
- columnarizer - we're buffering multiple rows into a batch, so we need to continue building the current batch if there is some metadata between rows. Note that we don't change the error propagation and still throw the error immediately. Additionally, since we no longer buffer the metadata, we can remove some memory accounting.
- ordered synchronizer is similar to the columnarizer - we need to continue building the current output batch if we get metadata when fetching a new batch from one of the inputs. Additionally, we need to handle the heap initialization carefully if we get a metadata object before any batches from some inputs.
- parallel unordered synchronizer - here we needed to augment the logic so that metadata can be propagated in the streaming fashion. Previously, if an input sent metadata, it meant that the input had been drained. This is no longer true, so we store the last message from one of the inputs and emit the metadata from it until exhausted. This allowed us to simplify some other logic about buffering the metadata, though.
- top K sort is somewhat similar to the ordered synchronizer - it first spills some number of batches to reach the desired K total, and then it reads one batch at a time, comparing against the current maximum row (one input row from the batch at a time). Here we needed to adjust the logic to initialize the heap only once as well as maintain some state across Next calls.
- hashBasedPartitioner - if metadata is received from the 2nd input after having read a batch from the 1st input, then that batch must be used on the next iteration.
- hashJoiner - whenever metadata is received from either input, it's propagated immediately, but then the execution flow can just be resumed (because the hashJoiner already has the necessary state machine).
- crossJoiner and mergeJoiner read from inputs at different points in time and have non-trivial state transitions, so we handle them by introducing a helper struct that buffers the metadata, and then the operator must check for metadata as soon as it's convenient. In the merge joiner we also need to continue building the output batch without resetting. Additionally, just to be safe, we mark both as MetadataSources to guarantee that all metadata is emitted (I think that only with ungraceful shutdown we can have some metadata still buffered in these two).
- index joiner - we need to preserve the row count that we've buffered for span generation so far across Next calls, as well as reset the input batch mem size only when we're transitioning to constructing spans.
- hash router - here we'll forward the metadata to the first stream (similar to what we do in the row-by-row routers). Additionally, each router output is responsible for emitting the metadata forwarded to it in its DrainMeta implementation (if unable to emit that forwarded metadata in the streaming fashion).
- inbox - we simply need to ensure that all metadata is returned as soon as possible, or at the very least during DrainMeta. Note that we still panic with an error in the metadata since we're not changing how errors are propagated throughout the vectorized engine.
- outbox - here we send a new message for every metadata object from the input. Note that we could consider buffering up some metadata and sending it together with the next batch, but that's left as a TODO (I doubt that a minor increase in DistSQL messages will be noticeable, unless we start emitting metadata very frequently).

This commit also extends the invariantsChecker (an operator that we plan in test-only builds and that verifies some assumptions) to randomly inject metadata on its Next calls. This improves the test coverage of the streaming metadata in the vectorized engine (and it uncovered one bug in the top K sorter). We inject RowNum metadata that is only used in tests, and then we needed to adjust some tests to allow for that type of metadata. The `NextNoMeta` helper method has been adjusted to silently swallow this metadata.

Fixes: #55758.
Release note: None

160095: sql: use `ExecutorConfig` in inspect logger r=bghal a=bghal

The inspect resumer needs to be able to log issues. The logger can use the `ExecutorConfig` embedded in the context without changing its behavior.

Part of: #155472
Epic: CRDB-55075
Release note: None

160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich

Fixes: #160752.
Fixes: #160753.
Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Brendan Gerrity <brendan.gerrity@cockroachlabs.com>
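As a rough illustration of the columnarizer point in the 157869 message above (keep building the current batch when metadata shows up between rows), here is a minimal, self-contained Go sketch. Everything in it is hypothetical: `Batch`, `Metadata`, `Operator`, `sliceSource`, `rebatcher`, and `drain` are simplified stand-ins invented for this example, and the two-value `Next` signature is an assumption, not the actual `colexecop` interface.

```go
package main

import "fmt"

// Batch and Metadata are simplified, hypothetical stand-ins for the real
// batch and metadata types used by the vectorized engine.
type Batch struct{ Rows []int }

type Metadata struct{ RowNum int }

// Operator sketches the idea of Next returning either a batch or a piece
// of metadata. The exact signature is an assumption made for this example.
type Operator interface {
	Next() (*Batch, *Metadata)
}

// item is one element of a scripted input: either a batch or metadata.
type item struct {
	batch *Batch
	meta  *Metadata
}

// sliceSource replays a fixed list of items, then returns zero-length
// batches to signal end of input.
type sliceSource struct{ items []item }

func (s *sliceSource) Next() (*Batch, *Metadata) {
	if len(s.items) == 0 {
		return &Batch{}, nil
	}
	it := s.items[0]
	s.items = s.items[1:]
	return it.batch, it.meta
}

// rebatcher accumulates input rows into output batches of up to size rows.
// When metadata arrives mid-batch, it is forwarded right away and the
// partially built batch is kept for the following Next call.
type rebatcher struct {
	input   Operator
	size    int
	pending []int
	done    bool
}

func (r *rebatcher) Next() (*Batch, *Metadata) {
	for !r.done && len(r.pending) < r.size {
		b, m := r.input.Next()
		if m != nil {
			// Stream the metadata; r.pending is intentionally not reset.
			return nil, m
		}
		if len(b.Rows) == 0 {
			r.done = true
			break
		}
		r.pending = append(r.pending, b.Rows...)
	}
	n := r.size
	if len(r.pending) < n {
		n = len(r.pending)
	}
	out := append([]int(nil), r.pending[:n]...)
	r.pending = r.pending[n:]
	return &Batch{Rows: out}, nil
}

// drain pulls from op until a zero-length batch, collecting batches and
// the RowNum of each metadata object separately.
func drain(op Operator) (batches [][]int, metas []int) {
	for {
		b, m := op.Next()
		if m != nil {
			metas = append(metas, m.RowNum)
			continue
		}
		if len(b.Rows) == 0 {
			return batches, metas
		}
		batches = append(batches, b.Rows)
	}
}

func main() {
	src := &sliceSource{items: []item{
		{batch: &Batch{Rows: []int{1, 2}}},
		{meta: &Metadata{RowNum: 2}},
		{batch: &Batch{Rows: []int{3}}},
		{batch: &Batch{Rows: []int{4}}},
	}}
	batches, metas := drain(&rebatcher{input: src, size: 3})
	fmt.Println(batches, metas)
}
```

In this sketch the metadata that arrives after rows 1 and 2 is returned before any batch, and the first output batch still contains rows 1 through 3, because the pending rows survive across `Next` calls.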

craig · 2026-01-10T01:25:32Z

Build failed (retrying...):

lint

160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich

Fixes: #160752.
Fixes: #160753.
Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>

craig · 2026-01-10T01:55:13Z

Build failed:

lint

yuzefovich · 2026-01-10T02:23:03Z

bors retry

craig · 2026-01-10T03:00:14Z

Build succeeded:

blathers-crl · 2026-01-10T03:00:23Z

Based on the specified backports for this PR, I applied new labels to the following linked issue(s). Please adjust the labels as needed to match the branches actually affected by the issue(s), including adding any known older branches.

Issue #160753: branch-release-25.2, branch-release-25.3, branch-release-25.4, branch-release-26.1.

Issue #160752: branch-release-25.3, branch-release-25.4, branch-release-26.1.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

yuzefovich · 2026-01-15T11:46:24Z

blathers backport release-25.2.11-rc release-25.3.7-rc release-25.4.3-rc

blathers-crl · 2026-01-15T11:46:27Z

⚠️ End-of-Life Version Warning

You have requested backports to the following End-of-Life (EOL) versions:

25.2.11-rc
25.3.7-rc
25.4.3-rc

While Blathers will still create these backport PRs, please verify that backporting to EOL versions is intentional and appropriate. EOL versions no longer receive maintenance updates according to our support policy.

The backport PRs will also include this warning.


execbuilder: fix a stats-related flake in a new test

5c37900

Release note: None

yuzefovich requested a review from michae2 January 9, 2026 11:10

yuzefovich requested a review from a team as a code owner January 9, 2026 11:10

yuzefovich added the backport-25.2.x, backport-25.3.x, backport-25.4.x, and backport-26.1.x labels Jan 9, 2026

mgartner approved these changes Jan 9, 2026


michae2 approved these changes Jan 9, 2026


craig bot pushed a commit that referenced this pull request Jan 10, 2026

Merge #160760

6b663e5

160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich

Fixes: #160752.
Fixes: #160753.
Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>

craig bot merged commit ccb2b5d into cockroachdb:master Jan 10, 2026
25 checks passed

celeste-cockroachdb bot added the target-release-26.2.0 label Jan 10, 2026

blathers-crl bot mentioned this pull request Jan 10, 2026

pkg/sql/opt/exec/execbuilder/tests/5node/5node_test: TestExecBuild_distsql_scan failed #160753

Closed

blathers-crl bot mentioned this pull request Jan 10, 2026

pkg/sql/opt/exec/execbuilder/tests/5node/5node_test: TestExecBuild_distsql_scan failed #160752

Closed

This was referenced Jan 10, 2026

release-25.2: execbuilder: fix a stats-related flake in a new test #160808

Merged

release-25.3: execbuilder: fix a stats-related flake in a new test #160809

Merged

release-25.4: execbuilder: fix a stats-related flake in a new test #160810

Merged

yuzefovich mentioned this pull request Jan 10, 2026

release-26.1: execbuilder: fix a stats-related flake in a new test #160811

Merged

yuzefovich deleted the distsql-scan-flake branch January 10, 2026 03:09

This was referenced Jan 15, 2026

release-25.2.11-rc: execbuilder: fix a stats-related flake in a new test #161156

Closed

release-25.3.7-rc: execbuilder: fix a stats-related flake in a new test #161157

Closed

release-25.4.3-rc: execbuilder: fix a stats-related flake in a new test #161158

Closed

celeste-cockroachdb bot added v26.2.0-prerelease and removed target-release-26.2.0 labels Mar 11, 2026
