Skip to content

colexec: fix multiple starts of the wrapped processors#44144

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
yuzefovich:vec-fix-multiple-starts
Jan 22, 2020
Merged

colexec: fix multiple starts of the wrapped processors#44144
craig[bot] merged 2 commits intocockroachdb:masterfrom
yuzefovich:vec-fix-multiple-starts

Conversation

@yuzefovich
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich commented Jan 20, 2020

colexec: fix multiple starts of the wrapped processors

Previously, wrapped processors could be started multiple times if they
were in the input chain for the bufferOp (each of the CASE arms will
initialize its input - the bufferOp). Now this is fixed by tracking in
both Columnarizer and bufferOp whether Init has already been called.

Previous behavior could lead to a crash when rowexec.valuesProcessor was
wrapped because it sends a "bogus" metadata header on each call to
Start, and only single header is expected whereas with multiple Inits
they would be multiple headers.

Fixes: #44133.

Release note (bug fix): Previously, CockroachDB could crash in special
circumstances when vectorized execution engine is used (it was more
likely to happen if vectorize=experimental_on setting was used). Now
this has been fixed.

execerror: catch panics coming from sql/execinfra package

sql/execinfra is definitely a part of the vectorized engine as a whole,
so we should be catching panics coming from it when running vectorized
flows.

Release note: None

@yuzefovich yuzefovich requested review from a team and asubiotto January 20, 2020 19:38
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@yuzefovich
Copy link
Copy Markdown
Member Author

The issue raises several questions:

  • should we be catching panics coming from execinfra? I think yes
  • should we modify the Operator interface so that Init can be called at most once? I think so (actually I put a TODO about it already into the code). Opened colexec: 'Operator.Init' should only be called once #44145
  • it's actually unclear from execinfra.RowSource interface whether it is ok to call Start method multiple times. valuesProcessor is at least one example that shows that it is not ok.

@yuzefovich yuzefovich force-pushed the vec-fix-multiple-starts branch from 95ff659 to d686659 Compare January 20, 2020 20:58
Copy link
Copy Markdown
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Am I correct in saying that this patch would also work if nothing is added to the columnarizer?

should we be catching panics coming from execinfra? I think yes

I agree we should, I think we just didn't have the mechanism in place, while we added it to the vectorized engine because it had been such a problem not to catch panics.

should we modify the Operator interface so that Init can be called at most once? I think so (actually I put a TODO about it already into the code). Opened #44145

Unless we have a strong reason to allow Init to be called multiple times, I would prefer making it explicit that it should only be called only once. If there are exceptions, we can add those explicitly (i.e. a specific operator may be Inited more than once if an operator-specific reset method is called)

it's actually unclear from execinfra.RowSource interface whether it is ok to call Start method multiple times. valuesProcessor is at least one example that shows that it is not ok.

That should be explicit as well.

Reviewed 4 of 4 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)


pkg/sql/colexec/buffer.go, line 49 at r1 (raw file):

func (b *bufferOp) Init() {
	if b.initStatus == OperatorNotInitialized {

Add a comment as to why we're doing the check (no harm in being obvious).

Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you are correct, but I added the changes to columnarizer to be safe. I think bufferOp is the only operator on which Init might be called multiple times, but I could be missing something.

TFTR!

bors r+

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)

Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bors r-

Oops, missed that you left another comment.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @yuzefovich)

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 21, 2020

Canceled

Previously, wrapped processors could be started multiple times if they
were in the input chain for the bufferOp (each of the CASE arms will
initialize its input - the bufferOp). Now this is fixed by tracking in
both Columnarizer and bufferOp whether Init has already been called.

Previous behavior could lead to a crash when rowexec.valuesProcessor was
wrapped because it sends a "bogus" metadata header on each call to
Start, and only single header is expected whereas with multiple Inits
they would be multiple headers.

Release note (bug fix): Previously, CockroachDB could crash in special
circumstances when vectorized execution engine is used (it was more
likely to happen if `vectorize=experimental_on` setting was used). Now
this has been fixed.
sql/execinfra is definitely a part of the vectorized engine as a whole,
so we should be catching panics coming from it when running vectorized
flows.

Release note: None
@yuzefovich yuzefovich force-pushed the vec-fix-multiple-starts branch from d686659 to 9417336 Compare January 21, 2020 18:25
Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also pushed a second commit to catch panics from sql/execinfra. PTAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto)


pkg/sql/colexec/buffer.go, line 49 at r1 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

Add a comment as to why we're doing the check (no harm in being obvious).

Done.

Copy link
Copy Markdown
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r2, 1 of 1 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)

@yuzefovich
Copy link
Copy Markdown
Member Author

TFTR!

bors r+

@yuzefovich
Copy link
Copy Markdown
Member Author

bors r+

@yuzefovich
Copy link
Copy Markdown
Member Author

bors r-

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 22, 2020

Canceled

@yuzefovich
Copy link
Copy Markdown
Member Author

bors r+

craig bot pushed a commit that referenced this pull request Jan 22, 2020
43565: kvnemeses: begin scaffolding for a jepsen-style kv test r=nvanbenschoten/tbg a=danhhz

Package kvnemeses exercises the KV api with random traffic and then
validates that the observed behaviors are consistent with our
guarantees.

A set of Operations are generated which represent usage of the public KV
api. These include both "workload" operations like Gets and Puts as well
as "admin" operations like rebalances. These Operations can be handed to
an Applier, which runs them against the KV api and records the results.

Operations do allow for concurrency (this testing is much less
interesting otherwise), which means that the state of the KV map is not
recoverable from _only_ the input. TODO(dan): We can use RangeFeed to
recover the exact KV history. This plus some Kyle magic can be used to
check our transactional guarantees.

TODO (in later commits)
- Validate the log
- CPut/InitPut/Increment/Delete
- DeleteRange/ClearRange/RevertRange/Scan/ReverseScan
- ChangeReplicas/TransferLease
- ExportRequest
- AddSSTable
- Root and leaf transactions
- GCRequest
- Protected timestamps

Release note: None

44144: colexec: fix multiple starts of the wrapped processors r=yuzefovich a=yuzefovich

**colexec: fix multiple starts of the wrapped processors**

Previously, wrapped processors could be started multiple times if they
were in the input chain for the bufferOp (each of the CASE arms will
initialize its input - the bufferOp). Now this is fixed by tracking in
both Columnarizer and bufferOp whether Init has already been called.

Previous behavior could lead to a crash when rowexec.valuesProcessor was
wrapped because it sends a "bogus" metadata header on each call to
Start, and only single header is expected whereas with multiple Inits
they would be multiple headers.

Fixes: #44133.

Release note (bug fix): Previously, CockroachDB could crash in special
circumstances when vectorized execution engine is used (it was more
likely to happen if `vectorize=experimental_on` setting was used). Now
this has been fixed.

**execerror: catch panics coming from sql/execinfra package**

sql/execinfra is definitely a part of the vectorized engine as a whole,
so we should be catching panics coming from it when running vectorized
flows.

Release note: None

44169: sql/opt/optbuilder: resolve remaining comments from #44015 r=nvanbenschoten a=nvanbenschoten

This commit resolves a few typos that were missed before #44015 was merged.

Release note: None

44172: Include "/cockroach" into PATH in Docker image r=vladdy a=vladdy

This adds "/cockroach" into environment's PATH in Docker image to require less typing when invoking "cockroach" commands via running container.

Fixes: #44189

Co-authored-by: Daniel Harrison <daniel.harrison@gmail.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Vlad Artamonov <742047+vladdy@users.noreply.github.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 22, 2020

Build succeeded

@craig craig bot merged commit 9417336 into cockroachdb:master Jan 22, 2020
@yuzefovich yuzefovich deleted the vec-fix-multiple-starts branch January 22, 2020 02:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Crash when using VECTORIZE=experimental_on

3 participants