Skip to content

colflow: track closers by the vectorized flow#91971

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:cleanup-closers
Dec 8, 2022
Merged

colflow: track closers by the vectorized flow#91971
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:cleanup-closers

Conversation

@yuzefovich
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich commented Nov 16, 2022

This commit simplifies the way we track all of the colexecop.Closers
we need to close. Previously, we would track them using OpWithMetaInfo
and then many operators would be responsible for closing the components
in their input tree. This commit makes it so that we close them during
the flow cleanup. This is ok because we know that all concurrent
goroutines will have exited by the time Cleanup is called since we do
so only after Flow.Wait returns (which blocks).

The decision to put closers into OpWithMetaInfo was made in #62221
with justification "why not" that I provided. Now I have the answer to
why we should not do it. Unlike the stats collection and metadata
draining (which - at least currently - must happen in the "owner"
goroutines), for closers we have a convenient place to close them at the
flow level. The contract is such that the closure must occur eventually,
but it doesn't matter much in which goroutine it's done (as long as there
is no race) and whether it is a bit delayed. (The only downside I see is
that the tracing spans are finished with a delay in comparison to when the
relevant operations are actually done, but this has already been pretty
bad, and this commit makes things only slightly worse. This "delayed"
finish shows up as "over-extended" span when viewing traces via jaeger.)

As a result of this refactor, the assertion that all closers are closed
seems redundant - we'd effectively be asserting only that a single
method is called, thus the assertion is removed. This commit also
allowed to remove some of the complexity around ExplainVec
implementation (we no longer need to tie the cleanup to closing of the
corresponding planNode).

Epic: None

Release note: None

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@yuzefovich yuzefovich force-pushed the cleanup-closers branch 5 times, most recently from a6f13b3 to 8a77325 Compare November 22, 2022 20:49
This commit simplifies the way we track all of the `colexecop.Closer`s
we need to close. Previously, we would track them using `OpWithMetaInfo`
and then many operators would be responsible for closing the components
in their input tree. This commit makes it so that we close them during
the flow cleanup. This is ok because we know that all concurrent
goroutines will have exited by the time `Cleanup` is called since we do
so only after `Flow.Wait` returns (which blocks).

The decision to put closers into `OpWithMetaInfo` was made in cockroachdb#62221
with justification "why not" that I provided. Now I have the answer to
why we should not do it. Unlike the stats collection and metadata
draining (which - at least currently - must happen in the "owner"
goroutines), for closers we have a convenient place to close them at the
flow level. The contract is such that the closure must occur eventually,
but it doesn't matter much in which goroutine it's done (as long as there
is no race) and whether it is a bit delayed. (The only downside I see is
that the tracing spans are finished with a delay in comparison to when the
relevant operations are actually done, but this has already been pretty
bad, and this commit makes things only slightly worse. This "delayed"
finish shows up as "over-extended" span when viewing traces via jaeger.)

As a result of this refactor, the assertion that all closers are closed
seems redundant - we'd effectively be asserting only that a single
method is called, thus the assertion is removed. This commit also
allowed to remove some of the complexity around `ExplainVec`
implementation (we no longer need to tie the cleanup to closing of the
corresponding planNode).

Release note: None
@yuzefovich yuzefovich marked this pull request as ready for review November 22, 2022 23:43
@yuzefovich yuzefovich requested a review from a team as a code owner November 22, 2022 23:43
@yuzefovich
Copy link
Copy Markdown
Member Author

I will hold off on merging this change for a week or so in order to get some coverage on master branch with the assertion enabled as of #91969. So even though this PR is RFAL, no rush on reviewing.

@yuzefovich
Copy link
Copy Markdown
Member Author

Friendly ping @rytaft @cucaroach

@DrewKimball
Copy link
Copy Markdown
Collaborator

This "delayed"
finish shows up as "over-extended" span when viewing traces via jaeger.

Not directly related to this PR, but could we add a "done" event to the span when the relevant component finishes to make the trace easier to understand?

@yuzefovich
Copy link
Copy Markdown
Member Author

Not directly related to this PR, but could we add a "done" event to the span when the relevant component finishes to make the trace easier to understand?

I did try Finishing the span sooner to make it better represent reality (when working on #87317) but ran into some difficulties. Your idea seems more doable though, at least in some cases - like we could easily add the corresponding "done" events in cFetcher or ColIndexJoin when eagerly closing it.

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice refactor!

Reviewed 21 of 21 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @rytaft, and @yuzefovich)


pkg/sql/colflow/explain_vec.go line 128 at r1 (raw file):

				// We need to delay the cleanup until after the tree has been
				// formatted.
				defer cleanup()

[nit] Is it possible to just call this after the call to formatChains? I guess maybe it's better as-is, since we want to call it even if there's an error..

Copy link
Copy Markdown
Member Author

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r+

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @DrewKimball, and @rytaft)


pkg/sql/colflow/explain_vec.go line 128 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Is it possible to just call this after the call to formatChains? I guess maybe it's better as-is, since we want to call it even if there's an error..

Yeah, the error case is why I decided to stick with the defer.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 8, 2022

Build succeeded:

@craig craig bot merged commit eb02796 into cockroachdb:master Dec 8, 2022
@yuzefovich yuzefovich deleted the cleanup-closers branch December 8, 2022 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants