colexec: fix multiple starts of the wrapped processors#44144
colexec: fix multiple starts of the wrapped processors#44144craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
|
The issue raises several questions:
|
95ff659 to
d686659
Compare
asubiotto
left a comment
There was a problem hiding this comment.
Am I correct in saying that this patch would also work if nothing is added to the columnarizer?
should we be catching panics coming from execinfra? I think yes
I agree we should, I think we just didn't have the mechanism in place, while we added it to the vectorized engine because it had been such a problem not to catch panics.
should we modify the Operator interface so that Init can be called at most once? I think so (actually I put a TODO about it already into the code). Opened #44145
Unless we have a strong reason to allow Init to be called multiple times, I would prefer making it explicit that it should only be called only once. If there are exceptions, we can add those explicitly (i.e. a specific operator may be Inited more than once if an operator-specific reset method is called)
it's actually unclear from execinfra.RowSource interface whether it is ok to call Start method multiple times. valuesProcessor is at least one example that shows that it is not ok.
That should be explicit as well.
Reviewed 4 of 4 files at r1.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/colexec/buffer.go, line 49 at r1 (raw file):
func (b *bufferOp) Init() { if b.initStatus == OperatorNotInitialized {
Add a comment as to why we're doing the check (no harm in being obvious).
yuzefovich
left a comment
There was a problem hiding this comment.
I believe you are correct, but I added the changes to columnarizer to be safe. I think bufferOp is the only operator on which Init might be called multiple times, but I could be missing something.
TFTR!
bors r+
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
yuzefovich
left a comment
There was a problem hiding this comment.
bors r-
Oops, missed that you left another comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)
Canceled |
Previously, wrapped processors could be started multiple times if they were in the input chain for the bufferOp (each of the CASE arms will initialize its input - the bufferOp). Now this is fixed by tracking in both Columnarizer and bufferOp whether Init has already been called. Previous behavior could lead to a crash when rowexec.valuesProcessor was wrapped because it sends a "bogus" metadata header on each call to Start, and only single header is expected whereas with multiple Inits they would be multiple headers. Release note (bug fix): Previously, CockroachDB could crash in special circumstances when vectorized execution engine is used (it was more likely to happen if `vectorize=experimental_on` setting was used). Now this has been fixed.
sql/execinfra is definitely a part of the vectorized engine as a whole, so we should be catching panics coming from it when running vectorized flows. Release note: None
d686659 to
9417336
Compare
yuzefovich
left a comment
There was a problem hiding this comment.
I also pushed a second commit to catch panics from sql/execinfra. PTAL.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto)
pkg/sql/colexec/buffer.go, line 49 at r1 (raw file):
Previously, asubiotto (Alfonso Subiotto Marqués) wrote…
Add a comment as to why we're doing the check (no harm in being obvious).
Done.
asubiotto
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r2, 1 of 1 files at r3.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)
|
TFTR! bors r+ |
|
bors r+ |
|
bors r- |
Canceled |
|
bors r+ |
43565: kvnemeses: begin scaffolding for a jepsen-style kv test r=nvanbenschoten/tbg a=danhhz Package kvnemeses exercises the KV api with random traffic and then validates that the observed behaviors are consistent with our guarantees. A set of Operations are generated which represent usage of the public KV api. These include both "workload" operations like Gets and Puts as well as "admin" operations like rebalances. These Operations can be handed to an Applier, which runs them against the KV api and records the results. Operations do allow for concurrency (this testing is much less interesting otherwise), which means that the state of the KV map is not recoverable from _only_ the input. TODO(dan): We can use RangeFeed to recover the exact KV history. This plus some Kyle magic can be used to check our transactional guarantees. TODO (in later commits) - Validate the log - CPut/InitPut/Increment/Delete - DeleteRange/ClearRange/RevertRange/Scan/ReverseScan - ChangeReplicas/TransferLease - ExportRequest - AddSSTable - Root and leaf transactions - GCRequest - Protected timestamps Release note: None 44144: colexec: fix multiple starts of the wrapped processors r=yuzefovich a=yuzefovich **colexec: fix multiple starts of the wrapped processors** Previously, wrapped processors could be started multiple times if they were in the input chain for the bufferOp (each of the CASE arms will initialize its input - the bufferOp). Now this is fixed by tracking in both Columnarizer and bufferOp whether Init has already been called. Previous behavior could lead to a crash when rowexec.valuesProcessor was wrapped because it sends a "bogus" metadata header on each call to Start, and only single header is expected whereas with multiple Inits they would be multiple headers. Fixes: #44133. Release note (bug fix): Previously, CockroachDB could crash in special circumstances when vectorized execution engine is used (it was more likely to happen if `vectorize=experimental_on` setting was used). Now this has been fixed. **execerror: catch panics coming from sql/execinfra package** sql/execinfra is definitely a part of the vectorized engine as a whole, so we should be catching panics coming from it when running vectorized flows. Release note: None 44169: sql/opt/optbuilder: resolve remaining comments from #44015 r=nvanbenschoten a=nvanbenschoten This commit resolves a few typos that were missed before #44015 was merged. Release note: None 44172: Include "/cockroach" into PATH in Docker image r=vladdy a=vladdy This adds "/cockroach" into environment's PATH in Docker image to require less typing when invoking "cockroach" commands via running container. Fixes: #44189 Co-authored-by: Daniel Harrison <daniel.harrison@gmail.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com> Co-authored-by: Vlad Artamonov <742047+vladdy@users.noreply.github.com>
Build succeeded |
colexec: fix multiple starts of the wrapped processors
Previously, wrapped processors could be started multiple times if they
were in the input chain for the bufferOp (each of the CASE arms will
initialize its input - the bufferOp). Now this is fixed by tracking in
both Columnarizer and bufferOp whether Init has already been called.
Previous behavior could lead to a crash when rowexec.valuesProcessor was
wrapped because it sends a "bogus" metadata header on each call to
Start, and only single header is expected whereas with multiple Inits
they would be multiple headers.
Fixes: #44133.
Release note (bug fix): Previously, CockroachDB could crash in special
circumstances when vectorized execution engine is used (it was more
likely to happen if
vectorize=experimental_onsetting was used). Nowthis has been fixed.
execerror: catch panics coming from sql/execinfra package
sql/execinfra is definitely a part of the vectorized engine as a whole,
so we should be catching panics coming from it when running vectorized
flows.
Release note: None