colflow: track closers by the vectorized flow#91971
colflow: track closers by the vectorized flow#91971craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
a6f13b3 to
8a77325
Compare
This commit simplifies the way we track all of the `colexecop.Closer`s we need to close. Previously, we would track them using `OpWithMetaInfo` and then many operators would be responsible for closing the components in their input tree. This commit makes it so that we close them during the flow cleanup. This is ok because we know that all concurrent goroutines will have exited by the time `Cleanup` is called since we do so only after `Flow.Wait` returns (which blocks). The decision to put closers into `OpWithMetaInfo` was made in cockroachdb#62221 with justification "why not" that I provided. Now I have the answer to why we should not do it. Unlike the stats collection and metadata draining (which - at least currently - must happen in the "owner" goroutines), for closers we have a convenient place to close them at the flow level. The contract is such that the closure must occur eventually, but it doesn't matter much in which goroutine it's done (as long as there is no race) and whether it is a bit delayed. (The only downside I see is that the tracing spans are finished with a delay in comparison to when the relevant operations are actually done, but this has already been pretty bad, and this commit makes things only slightly worse. This "delayed" finish shows up as "over-extended" span when viewing traces via jaeger.) As a result of this refactor, the assertion that all closers are closed seems redundant - we'd effectively be asserting only that a single method is called, thus the assertion is removed. This commit also allowed to remove some of the complexity around `ExplainVec` implementation (we no longer need to tie the cleanup to closing of the corresponding planNode). Release note: None
8a77325 to
661296f
Compare
|
I will hold off on merging this change for a week or so in order to get some coverage on master branch with the assertion enabled as of #91969. So even though this PR is RFAL, no rush on reviewing. |
|
Friendly ping @rytaft @cucaroach |
Not directly related to this PR, but could we add a "done" event to the span when the relevant component finishes to make the trace easier to understand? |
I did try |
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewed 21 of 21 files at r1, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @rytaft, and @yuzefovich)
pkg/sql/colflow/explain_vec.go line 128 at r1 (raw file):
// We need to delay the cleanup until after the tree has been // formatted. defer cleanup()
[nit] Is it possible to just call this after the call to formatChains? I guess maybe it's better as-is, since we want to call it even if there's an error..
yuzefovich
left a comment
There was a problem hiding this comment.
TFTR!
bors r+
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @DrewKimball, and @rytaft)
pkg/sql/colflow/explain_vec.go line 128 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] Is it possible to just call this after the call to
formatChains? I guess maybe it's better as-is, since we want to call it even if there's an error..
Yeah, the error case is why I decided to stick with the defer.
|
Build succeeded: |
This commit simplifies the way we track all of the
colexecop.Closerswe need to close. Previously, we would track them using
OpWithMetaInfoand then many operators would be responsible for closing the components
in their input tree. This commit makes it so that we close them during
the flow cleanup. This is ok because we know that all concurrent
goroutines will have exited by the time
Cleanupis called since we doso only after
Flow.Waitreturns (which blocks).The decision to put closers into
OpWithMetaInfowas made in #62221with justification "why not" that I provided. Now I have the answer to
why we should not do it. Unlike the stats collection and metadata
draining (which - at least currently - must happen in the "owner"
goroutines), for closers we have a convenient place to close them at the
flow level. The contract is such that the closure must occur eventually,
but it doesn't matter much in which goroutine it's done (as long as there
is no race) and whether it is a bit delayed. (The only downside I see is
that the tracing spans are finished with a delay in comparison to when the
relevant operations are actually done, but this has already been pretty
bad, and this commit makes things only slightly worse. This "delayed"
finish shows up as "over-extended" span when viewing traces via jaeger.)
As a result of this refactor, the assertion that all closers are closed
seems redundant - we'd effectively be asserting only that a single
method is called, thus the assertion is removed. This commit also
allowed to remove some of the complexity around
ExplainVecimplementation (we no longer need to tie the cleanup to closing of the
corresponding planNode).
Epic: None
Release note: None