Skip to content

execbuilder: fix a stats-related flake in a new test#160760

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:distsql-scan-flake
Jan 10, 2026
Merged

execbuilder: fix a stats-related flake in a new test#160760
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:distsql-scan-flake

Conversation

@yuzefovich
Copy link
Copy Markdown
Member

Fixes: #160752.
Fixes: #160753.
Release note: None

@yuzefovich yuzefovich requested a review from michae2 January 9, 2026 11:10
@yuzefovich yuzefovich requested a review from a team as a code owner January 9, 2026 11:10
@yuzefovich yuzefovich added backport-25.2.x Flags PRs that need to be backported to 25.2 backport-25.3.x Flags PRs that need to be backported to 25.3 backport-25.4.x Flags PRs that need to be backported to 25.4 backport-26.1.x Flags PRs that need to be backported to 26.1 labels Jan 9, 2026
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@mgartner reviewed 1 file and all commit messages, and made 1 comment.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @michae2).

@yuzefovich
Copy link
Copy Markdown
Member Author

TFTR!

bors r+

@yuzefovich
Copy link
Copy Markdown
Member Author

bors r+

Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

craig bot pushed a commit that referenced this pull request Jan 9, 2026
160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen

This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed.

Informs #159414
Epic: CRDB-48845
Release note: none

Co-authored by: `@jeffswenson`

160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich

Fixes: #160752.
Fixes: #160753.
Release note: None

160764: sql/copy: fix rare flake in TestLargeCopy r=yuzefovich a=yuzefovich

We have automatic retry mechanism for COPY but it can only be used for non-atomic COPY. If we have the atomic COPY and hit a txn retry error, it's bubbled up to the client. We now adjust `TestLargeCopy` to match this behavior fixing a rare flake where we'd fail the test on the txn retry error when we should've ignored it.

Fixes: #160537.
Release note: None

Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 10, 2026

Build failed (retrying...):

craig bot pushed a commit that referenced this pull request Jan 10, 2026
160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich

Fixes: #160752.
Fixes: #160753.
Release note: None

160764: sql/copy: fix rare flake in TestLargeCopy r=yuzefovich a=yuzefovich

We have automatic retry mechanism for COPY but it can only be used for non-atomic COPY. If we have the atomic COPY and hit a txn retry error, it's bubbled up to the client. We now adjust `TestLargeCopy` to match this behavior fixing a rare flake where we'd fail the test on the txn retry error when we should've ignored it.

Fixes: #160537.
Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 10, 2026

Build failed (retrying...):

craig bot pushed a commit that referenced this pull request Jan 10, 2026
157869: colexecop: modify Operator.Next to return metadata r=yuzefovich a=yuzefovich

This commit changes `Operator.Next` interface to add an ability to
return metadata in the "streaming" fashion, similar to how it's done in
the row-by-row engine. This will allow us to support features like
estimated query progress (based on the number of rows scanned so far).
Note that we still preserve the concept of MetadataSources, but it'll
become similar to "trailing metadata" concept in the row-by-row engine,
i.e. it'll only handle metadata that is produced when the component is
being drained.

The following Operators are affected have non-trivial state transitions:
- columnarizer - we're buffering multiple rows into a batch, so we need
to continue building the current batch if there is some metadata between
rows. Note that we don't change the error propagation and still throw it
immediately. Additionally, since we no longer buffer the metadata, we
can remove some memory accounting.
- ordered synchronizer is similar to columnarizer - we need to ensure
continuing building the current output batch if we get a metadata when
fetching the new batch from one of the inputs. Additionaly we need
handle the heap initialization carefully if we get a metadata object
before any batches from some inputs.
- parallel unordered synchronizer - here we needed to augment the logic
so that metadata can be propagated in the streaming fashion. Previously,
if an input sent the metadata, then it meant that the input has been
drained, which is no longer true so we store the last message from one
of the inputs and emit the metadata from it until exhausted. This
allowed us to simplify some other logic about buffering the metadata
though.
- top K sort is somewhat similar to the ordered synchronizer - it first
spills some number of batches to reach the desired K total, and then it
reads one batch at a time comparing against the current maximum row (one
input row from the batch at a time). Here we needed to adjust the logic
to initialize the heap only once as well as maintain some state across
Next calls.
- hashBasedPartitioner - if the metadata is received from the 2nd input
after having read a batch from the 1st input, then that batch must be
used on the next iteration
- hashJoiner - whenever the metadata is received from either input, it's
propagated immediately, but then the execution flow can just be resumed
(because the hashJoiner already has the necessary state machine)
- crossJoiner and mergeJoiner read from inputs at different points in
time and have non-trivial state transitions, so we handle them by
introducing a helper struct that buffers the metadata, and then the
operator must check for metadata as soon as its convenient. In the merge
joiner we also need to continue building the output batch without
resetting. Additionally, just to be safe, we mark both as MetadataSources to
guarantee that all metadata is emitted (I think that only with
ungraceful shutdown we can have some metadata still buffered in these
two).
- index joiner - need to preserve the row count that we've buffered for
span generation so far, across Next calls, as well as reset the input
batch mem size only when we're transitioning to constructing spans.
- hash router - here we'll forward the metadata to the first stream
(similar to what we do in the row-by-row routers). Additionally each
router output is responsible for emitting forwarded to it metadata on
its DrainMeta implementation (if unable to emit that forwarded metadata
in the streaming fashion).
- inbox - simply need to ensure that all metadata is returned as soon
as possible, or at the very least during DrainMeta. Note that we still
panic with an error in the metadata since we're not changing how errors
are propagated throughout the vectorized engine.
- outbox - here we send a new message for every metadata object from the
input. Note that we could consider buffering up some metadata and
sending together with the next batch, but that's left as a TODO (I doubt
that minor increase in DistSQL messages will be noticeable, unless we
start emitting metadata very frequently).

This commit also extends the invariantsChecker (which is an operator that we
plan in test-only builds that verifies some assumptions) to randomly
inject metadata on its Next calls. This improves the test coverage of
the streaming metadata in the vectorized engine (and it uncovered one
bug in the top K sorter). We inject RowNum metadata that is only used in
tests, and then we needed to adjust some tests to allow for that type of
metadata. `NextNoMeta` helper method has been adjusted to silently
swallow this metadata.

Fixes: #55758.

Release note: None

160095: sql: use `ExecutorConfig` in inspect logger r=bghal a=bghal

The inspect resumer needs to be able to log issues. The logger can use
the `ExecutorConfig` embedded in the context without changing its
behavior.

Part of: #155472
Epic: CRDB-55075

Release note: None

160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich

Fixes: #160752.
Fixes: #160753.
Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Brendan Gerrity <brendan.gerrity@cockroachlabs.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 10, 2026

Build failed (retrying...):

craig bot pushed a commit that referenced this pull request Jan 10, 2026
160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich

Fixes: #160752.
Fixes: #160753.
Release note: None

Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 10, 2026

Build failed:

@yuzefovich
Copy link
Copy Markdown
Member Author

bors retry

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 10, 2026

@craig craig bot merged commit ccb2b5d into cockroachdb:master Jan 10, 2026
25 checks passed
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jan 10, 2026

Based on the specified backports for this PR, I applied new labels to the following linked issue(s). Please adjust the labels as needed to match the branches actually affected by the issue(s), including adding any known older branches.


Issue #160753: branch-release-25.2, branch-release-25.3, branch-release-25.4, branch-release-26.1.


Issue #160752: branch-release-25.3, branch-release-25.4, branch-release-26.1.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@yuzefovich
Copy link
Copy Markdown
Member Author

blathers backport release-25.2.11-rc release-25.3.7-rc release-25.4.3-rc

@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jan 15, 2026

⚠️ End-of-Life Version Warning

You have requested backports to the following End-of-Life (EOL) versions:

  • 25.2.11-rc
  • 25.3.7-rc
  • 25.4.3-rc

While Blathers will still create these backport PRs, please verify that backporting to EOL versions is intentional and appropriate. EOL versions no longer receive maintenance updates according to our support policy.

The backport PRs will also include this warning.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-25.2.x Flags PRs that need to be backported to 25.2 backport-25.3.x Flags PRs that need to be backported to 25.3 backport-25.4.x Flags PRs that need to be backported to 25.4 backport-26.1.x Flags PRs that need to be backported to 26.1 v26.2.0-prerelease

Projects

None yet

4 participants