
sql: use real contention events instead of the mock ones #59785

Merged

craig[bot] merged 2 commits into cockroachdb:master from yuzefovich:contention-events on Feb 8, 2021

Conversation

@yuzefovich
Member

@yuzefovich yuzefovich commented Feb 3, 2021

sql: use real contention events instead of the mock ones

This commit switches to using the real contention events that KV layer
adds to the trace and starts adding them into the contention registry.
Previously, we were using the payload that came as part of
ProducerMetadata, and the corresponding field is now removed.

This required a change in how the cumulative contention time is
calculated. Now all components that report it (colbatchscan, table
reader, join reader, inverted joiner) examine the trace obtained from
their context to find all contention events and accumulate the total time.
This works because every processor already starts a child span in
startInternal, and this commit adds similar logic to the
colbatchscan.

Fixes: #59639.

Release note: None

rowexec: improve stats collection from miscellaneous joiners

Previously, there was no stats collection on the zigzag joiner, so this
commit adds it.

Additionally, it adds reporting of the number of bytes read by the join
reader and the inverted joiner, and makes all of the processors mentioned
implement the execinfra.KVReader interface (which will allow all of them
to show up nicely on the EXPLAIN ANALYZE diagrams when run via the
vectorized engine).

Release note: None
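As a rough illustration of the interface idea, a sketch with hypothetical method names (the real interface lives in execinfra and may differ):

```go
package main

import "fmt"

// KVReader sketches the interface mentioned in the commit: processors that
// read from KV expose how much they read so EXPLAIN ANALYZE can display it.
type KVReader interface {
	GetBytesRead() int64
	GetRowsRead() int64
}

// zigzagJoiner is a stub standing in for the processors gaining stats
// collection in this commit.
type zigzagJoiner struct {
	bytesRead, rowsRead int64
}

func (z *zigzagJoiner) GetBytesRead() int64 { return z.bytesRead }
func (z *zigzagJoiner) GetRowsRead() int64  { return z.rowsRead }

func main() {
	var r KVReader = &zigzagJoiner{bytesRead: 4096, rowsRead: 128}
	fmt.Println(r.GetBytesRead(), r.GetRowsRead()) // prints 4096 128
}
```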

@yuzefovich yuzefovich added the do-not-merge bors won't merge a PR with this label. label Feb 3, 2021
@yuzefovich yuzefovich requested review from a team, RaduBerinde and asubiotto February 3, 2021 23:25
@yuzefovich yuzefovich requested review from a team as code owners February 3, 2021 23:25
@yuzefovich yuzefovich requested review from pbardea and removed request for a team February 3, 2021 23:25
@cockroach-teamcity
Member

This change is Reviewable

@yuzefovich yuzefovich removed request for a team and pbardea February 3, 2021 23:27
@yuzefovich
Member Author

Hm, I guess it might be possible to attribute each contention event from the trace to a particular processor if each processor would record into the trace the spans (as in roachpb.Span) it has requested from the KV. Since every contention event has a contended key, we would then be able to find a single processor (probably in ExtractStatsFromSpans).

Am I missing something? Do you see another way to keep the "contention time" info for EXPLAIN ANALYZE diagrams when the events are recorded into the trace?

@yuzefovich yuzefovich force-pushed the contention-events branch 2 times, most recently from 9e400d9 to 3356f40 Compare February 4, 2021 02:37
@yuzefovich
Member Author

Do we have plans to use node-level statistics? They are not used currently, so I added another commit that removes them.

Contributor

@asubiotto asubiotto left a comment


Thanks for getting this out so quickly!

Am I missing something? Do you see another way to keep the "contention time" info for EXPLAIN ANALYZE diagrams when the events are recorded into the trace?

This is what I meant when talking about the execution leaves. Instead of calling GetCumulativeContentionTime in the leaves and populating metadata with it, my thinking is that the processors would use a helper to inspect their span's payload (from the context) and calculate cumulative contention time from those. I'm guessing that there might be some subtlety in doing this in cases where we have two input branches (e.g. a joinreader) and want to get the contention time for only one branch, but that might be limited to using a separate context for the lookups to get only spans related to those lookups.

Reviewed 20 of 20 files at r1, 4 of 4 files at r2, 1 of 1 files at r3, 9 of 9 files at r4, 1 of 33 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @RaduBerinde, and @yuzefovich)


pkg/sql/distsql_running_test.go, line 270 at r5 (raw file):

// additionally verifies that the metric stays at zero if there is no
// contention.
func TestDistSQLReceiverReportsContention(t *testing.T) {

cc @tbg

@yuzefovich could you add a check that the contention registry is updated properly?


pkg/sql/execstats/traceanalyzer.go, line 89 at r3 (raw file):

// TODO(asubiotto): Flatten this struct, we're currently allocating a map per
//  stat.
type NodeLevelStats struct {

The question of whether we want to keep this is a good one. I think we do, because it won't be long before our users start asking for a breakdown of higher-level stats by node. However, I do think that we could probably clean this up a bit and maybe offer an option to calculate query-level stats from just the trace for performance reasons. In any case, this commit and discussion belong in another PR.


pkg/sql/execstats/traceanalyzer.go, line 96 at r3 (raw file):

	KVTimeGroupedByNode           map[roachpb.NodeID]time.Duration
	NetworkMessagesGroupedByNode  map[roachpb.NodeID]int64
	ContentionTimeGroupedByNode   map[roachpb.NodeID]time.Duration

I'm confused why this commit removes this, what does it fix?

@yuzefovich yuzefovich force-pushed the contention-events branch 2 times, most recently from f021eec to 1cf00d8 Compare February 5, 2021 01:57
@yuzefovich yuzefovich removed the do-not-merge bors won't merge a PR with this label. label Feb 5, 2021
Member Author

@yuzefovich yuzefovich left a comment


Ok, I think it makes sense. IIUC, because all processors already start a child span in the ProcessorBase.StartInternal method, only the contention events for the appropriate processor will show up in its trace (i.e. if we have a join reader that takes its input from another table reader, the former will only see its own events). I added similar logic to the ColBatchScan.

Is my logic here correct? Any ideas on how to test it programmatically? I'm not sure how we can inject the contention events into the trace; I'm guessing we need to plumb a testing callback into the KV concurrency manager?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @RaduBerinde, and @tbg)
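The child-span isolation relied on here can be illustrated with a toy span tree (purely illustrative, not the real tracing package): events recorded into a child span show up in that child's recording but not in a sibling's, which is why a join reader that starts its own child span sees only its own contention events and not those of the table reader feeding it.

```go
package main

import "fmt"

// span is a toy trace span with child spans and recorded events.
type span struct {
	events   []string
	children []*span
}

func (s *span) StartChild() *span {
	c := &span{}
	s.children = append(s.children, c)
	return c
}

func (s *span) Record(ev string) { s.events = append(s.events, ev) }

// Recording returns the events of this span and its descendants only,
// never those of a sibling span.
func (s *span) Recording() []string {
	out := append([]string{}, s.events...)
	for _, c := range s.children {
		out = append(out, c.Recording()...)
	}
	return out
}

func main() {
	root := &span{}
	tableReader := root.StartChild()
	joinReader := root.StartChild()
	tableReader.Record("contention@input")
	joinReader.Record("contention@lookup")
	// The join reader's recording contains only its own event.
	fmt.Println(joinReader.Recording()) // prints [contention@lookup]
}
```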


pkg/sql/distsql_running_test.go, line 270 at r5 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

cc @tbg

@yuzefovich could you add a check that the contention registry is updated properly?

Done.


pkg/sql/execstats/traceanalyzer.go, line 89 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

The question of whether we want to keep this is a good one. I think we do because it won't be long before our users start asking for a break down of higher-level stats by node. However, I do think that we could probably clean this up a bit and maybe offer an option to calculate query level stats from just the trace for performance reasons. In any case, this commit and discussion belongs in another PR.

Ok, I removed the removal commit.


pkg/sql/execstats/traceanalyzer.go, line 96 at r3 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I'm confused why this commit removes this, what does it fix?

It wasn't being used (which is a bug being fixed in another PR).

@yuzefovich yuzefovich force-pushed the contention-events branch 2 times, most recently from f4570e2 to f51981a Compare February 5, 2021 03:53
@yuzefovich
Member Author

Confirmed manually that the join reader correctly distinguishes between the contention events originating in its input and in itself. Here I blocked the table reader for a bit on one transaction and blocked the lookup join for longer on another:
[Screenshot: Screen Shot 2021-02-04 at 7.53.21 PM]

@yuzefovich yuzefovich force-pushed the contention-events branch 4 times, most recently from 508ca68 to 5e28964 Compare February 5, 2021 05:49
@yuzefovich
Member Author

Added another commit that adds or improves the stats collection on miscellaneous joiners.

@tbg tbg requested a review from asubiotto February 8, 2021 11:23
tbg
tbg previously approved these changes Feb 8, 2021
Member

@tbg tbg left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @asubiotto, @RaduBerinde, @tbg, and @yuzefovich)


pkg/sql/distsql_running_test.go, line 307 at r5 (raw file):

			// Soft check to protect against flakiness where an internal query
			// causes the contention metric to increment.
			require.GreaterOrEqual(t, metrics.ContendedQueriesCount.Count(), int64(1))

I think it would be useful to have a testing knob that intercepts the contention events handled by DistSQL (i.e. whatever increments the metric). That way, you can explicitly verify that you got what you expected all the way to the collection. Just a suggestion.
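A minimal sketch of such a knob, with hypothetical names (the real DistSQL receiver and knobs structs differ): the callback is invoked at the same point where the metric is incremented, giving the test full visibility into the intercepted events.

```go
package main

import "fmt"

// TestingKnobs sketches the suggested hook: an optional callback invoked
// wherever DistSQL bumps the contended-queries metric, so a test can
// assert on the exact events observed.
type TestingKnobs struct {
	OnContentionEvent func(key string)
}

// receiver is a stub standing in for the DistSQL receiver.
type receiver struct {
	knobs          TestingKnobs
	contendedCount int64
}

func (r *receiver) observeContention(key string) {
	r.contendedCount++ // what the metric check in the test sees today
	if r.knobs.OnContentionEvent != nil {
		r.knobs.OnContentionEvent(key) // extra visibility for tests
	}
}

func main() {
	var seen []string
	r := &receiver{knobs: TestingKnobs{
		OnContentionEvent: func(k string) { seen = append(seen, k) },
	}}
	r.observeContention("/Table/53/1/42")
	fmt.Println(r.contendedCount, seen) // prints 1 [/Table/53/1/42]
}
```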

@tbg tbg self-requested a review February 8, 2021 11:23
@tbg tbg dismissed their stale review February 8, 2021 11:24

approved accidentally

@tbg tbg removed their request for review February 8, 2021 11:24
@tbg
Member

tbg commented Feb 8, 2021

FYI once #59992 is in, the contention metadata should flow without turning tracing on.

@tbg
Member

tbg commented Feb 8, 2021

FYI once #59992 is in, the contention metadata should flow without turning tracing on.

Heh, I tried and TestDistSQLReceiverReportsContention fails on top of that PR. I'll take a look at why that is later.

@tbg
Member

tbg commented Feb 8, 2021

Ah, found it. There's some short-circuiting in (*Span).GetRecording that is no longer correct with structured payloads. I'll add it to my PR.

Contributor

@asubiotto asubiotto left a comment


:lgtm:

Reviewed 11 of 33 files at r5, 41 of 41 files at r6, 8 of 8 files at r7.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto, @RaduBerinde, @tbg, and @yuzefovich)


pkg/sql/distsql_running_test.go, line 285 at r6 (raw file):

		if contention {
			// Begin a contending transaction.
			conn, err := db.Conn(ctx)

nit: I think this and the conn below need to be closed


pkg/sql/colexecbase/operator.go, line 39 at r6 (raw file):

	// happen when 'workmem' setting is too low - just bail, even if we have
	// disk spilling for that operator.
	// TODO(yuzefovich): we probably should move ctx argument from Next into

I don't think capturing the context is a good idea (it's an anti-pattern in go), but we should probably change Init to accept a context. The pattern would be that callers should use the same context for all calls.

Member Author

@yuzefovich yuzefovich left a comment


TFTRs!

I'll go ahead and merge this, and we can address the TODOs later.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto, @RaduBerinde, and @tbg)


pkg/sql/distsql_running_test.go, line 307 at r5 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I think it would be useful to have a testing knob that intercepts the contention events handled by DistSQL (i.e. whatever increments the metric). That way, you can explicitly verify that you got what you expected all the way to the collection. Just a suggestion.

Good point, left a TODO for this since it requires a bit of plumbing.


pkg/sql/distsql_running_test.go, line 285 at r6 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

nit: I think this and the conn below need to be closed

Done.


pkg/sql/colexecbase/operator.go, line 39 at r6 (raw file):

Previously, asubiotto (Alfonso Subiotto Marqués) wrote…

I don't think capturing the context is a good idea (it's an anti-pattern in go), but we should probably change Init to accept a context. The pattern would be that callers should use the same context for all calls.

But what if we derive a child context in Init? Should we return it then? What does the caller then do with that context? With the parent?

This might be an anti-pattern, but to me it makes a lot more sense. When initializing an operator, you pass in a context that can be used to signal for cancellation. At the same time, the operator is free to derive a child context from it and use that throughout its life.

So I think ideally I'd like to see the following:

```go
// We have a root context (either of the root materializer or an outbox,
// or of a parallel unordered synchronizer) which can be used to shut
// everything down.
_ = ctx
...
operator.Init(ctx)
...
// Here and below the operator itself is responsible for using the
// appropriate context.
for operator.Next() {}
...
operator.DrainMeta()
```

@craig
Contributor

craig bot commented Feb 8, 2021

Build failed (retrying...):

@craig
Contributor

craig bot commented Feb 8, 2021

Build succeeded:

@craig craig bot merged commit 9d44277 into cockroachdb:master Feb 8, 2021
@yuzefovich yuzefovich deleted the contention-events branch February 8, 2021 20:26
Contributor

@asubiotto asubiotto left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @asubiotto, @RaduBerinde, @tbg, and @yuzefovich)


pkg/sql/colexecbase/operator.go, line 39 at r6 (raw file):

But what if we derive a child context in Init? Should we return it then? What does the caller then do with that context? With the parent?

Exactly, just like a lot of other places (e.g. setting up and running flows). The pattern would be:

```go
...
ctx = operator.Init(ctx)
...
// Here and below the operator itself is responsible for using the
// appropriate context.
for operator.Next(ctx) {}
...
operator.DrainMeta(ctx)
```

And we can probably enforce this with a testing-level assertion where Init adds a special value to the context, and Next and DrainMeta assert that the same value is present in the context they receive.



Development

Successfully merging this pull request may close these issues.

sql: replace mock roachpb.ContentionEvents with real ones

4 participants