execbuilder: fix a stats-related flake in a new test#160760
execbuilder: fix a stats-related flake in a new test#160760craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
Release note: None
mgartner
left a comment
There was a problem hiding this comment.
@mgartner reviewed 1 file and all commit messages, and made 1 comment.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @michae2).
|
TFTR! bors r+ |
|
bors r+ |
160632: sql/bulkmerge: reuse SST iterator across bulk merge tasks r=spilchen a=spilchen This change reduces overhead in the bulk merge processor by initializing a single iterator over all input SSTs at startup, rather than creating a new one per task. The iterator is reused across tasks, seeking only when needed. Informs #159414 Epic: CRDB-48845 Release note: none Co-authored by: `@jeffswenson` 160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich Fixes: #160752. Fixes: #160753. Release note: None 160764: sql/copy: fix rare flake in TestLargeCopy r=yuzefovich a=yuzefovich We have automatic retry mechanism for COPY but it can only be used for non-atomic COPY. If we have the atomic COPY and hit a txn retry error, it's bubbled up to the client. We now adjust `TestLargeCopy` to match this behavior fixing a rare flake where we'd fail the test on the txn retry error when we should've ignored it. Fixes: #160537. Release note: None Co-authored-by: Matt Spilchen <matt.spilchen@cockroachlabs.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
|
Build failed (retrying...): |
160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich Fixes: #160752. Fixes: #160753. Release note: None 160764: sql/copy: fix rare flake in TestLargeCopy r=yuzefovich a=yuzefovich We have automatic retry mechanism for COPY but it can only be used for non-atomic COPY. If we have the atomic COPY and hit a txn retry error, it's bubbled up to the client. We now adjust `TestLargeCopy` to match this behavior fixing a rare flake where we'd fail the test on the txn retry error when we should've ignored it. Fixes: #160537. Release note: None Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
|
Build failed (retrying...): |
157869: colexecop: modify Operator.Next to return metadata r=yuzefovich a=yuzefovich This commit changes `Operator.Next` interface to add an ability to return metadata in the "streaming" fashion, similar to how it's done in the row-by-row engine. This will allow us to support features like estimated query progress (based on the number of rows scanned so far). Note that we still preserve the concept of MetadataSources, but it'll become similar to "trailing metadata" concept in the row-by-row engine, i.e. it'll only handle metadata that is produced when the component is being drained. The following Operators are affected have non-trivial state transitions: - columnarizer - we're buffering multiple rows into a batch, so we need to continue building the current batch if there is some metadata between rows. Note that we don't change the error propagation and still throw it immediately. Additionally, since we no longer buffer the metadata, we can remove some memory accounting. - ordered synchronizer is similar to columnarizer - we need to ensure continuing building the current output batch if we get a metadata when fetching the new batch from one of the inputs. Additionaly we need handle the heap initialization carefully if we get a metadata object before any batches from some inputs. - parallel unordered synchronizer - here we needed to augment the logic so that metadata can be propagated in the streaming fashion. Previously, if an input sent the metadata, then it meant that the input has been drained, which is no longer true so we store the last message from one of the inputs and emit the metadata from it until exhausted. This allowed us to simplify some other logic about buffering the metadata though. - top K sort is somewhat similar to the ordered synchronizer - it first spills some number of batches to reach the desired K total, and then it reads one batch at a time comparing against the current maximum row (one input row from the batch at a time). Here we needed to adjust the logic to initialize the heap only once as well as maintain some state across Next calls. - hashBasedPartitioner - if the metadata is received from the 2nd input after having read a batch from the 1st input, then that batch must be used on the next iteration - hashJoiner - whenever the metadata is received from either input, it's propagated immediately, but then the execution flow can just be resumed (because the hashJoiner already has the necessary state machine) - crossJoiner and mergeJoiner read from inputs at different points in time and have non-trivial state transitions, so we handle them by introducing a helper struct that buffers the metadata, and then the operator must check for metadata as soon as its convenient. In the merge joiner we also need to continue building the output batch without resetting. Additionally, just to be safe, we mark both as MetadataSources to guarantee that all metadata is emitted (I think that only with ungraceful shutdown we can have some metadata still buffered in these two). - index joiner - need to preserve the row count that we've buffered for span generation so far, across Next calls, as well as reset the input batch mem size only when we're transitioning to constructing spans. - hash router - here we'll forward the metadata to the first stream (similar to what we do in the row-by-row routers). Additionally each router output is responsible for emitting forwarded to it metadata on its DrainMeta implementation (if unable to emit that forwarded metadata in the streaming fashion). - inbox - simply need to ensure that all metadata is returned as soon as possible, or at the very least during DrainMeta. Note that we still panic with an error in the metadata since we're not changing how errors are propagated throughout the vectorized engine. - outbox - here we send a new message for every metadata object from the input. Note that we could consider buffering up some metadata and sending together with the next batch, but that's left as a TODO (I doubt that minor increase in DistSQL messages will be noticeable, unless we start emitting metadata very frequently). This commit also extends the invariantsChecker (which is an operator that we plan in test-only builds that verifies some assumptions) to randomly inject metadata on its Next calls. This improves the test coverage of the streaming metadata in the vectorized engine (and it uncovered one bug in the top K sorter). We inject RowNum metadata that is only used in tests, and then we needed to adjust some tests to allow for that type of metadata. `NextNoMeta` helper method has been adjusted to silently swallow this metadata. Fixes: #55758. Release note: None 160095: sql: use `ExecutorConfig` in inspect logger r=bghal a=bghal The inspect resumer needs to be able to log issues. The logger can use the `ExecutorConfig` embedded in the context without changing its behavior. Part of: #155472 Epic: CRDB-55075 Release note: None 160760: execbuilder: fix a stats-related flake in a new test r=yuzefovich a=yuzefovich Fixes: #160752. Fixes: #160753. Release note: None Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Brendan Gerrity <brendan.gerrity@cockroachlabs.com>
|
Build failed (retrying...): |
|
Build failed: |
|
bors retry |
|
Build succeeded: |
|
Based on the specified backports for this PR, I applied new labels to the following linked issue(s). Please adjust the labels as needed to match the branches actually affected by the issue(s), including adding any known older branches. Issue #160753: branch-release-25.2, branch-release-25.3, branch-release-25.4, branch-release-26.1. Issue #160752: branch-release-25.3, branch-release-25.4, branch-release-26.1. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
|
blathers backport release-25.2.11-rc release-25.3.7-rc release-25.4.3-rc |
|
You have requested backports to the following End-of-Life (EOL) versions:
While Blathers will still create these backport PRs, please verify that backporting to EOL versions is intentional and appropriate. EOL versions no longer receive maintenance updates according to our support policy. The backport PRs will also include this warning. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Fixes: #160752.
Fixes: #160753.
Release note: None