sql: measure CPU time spent during SQL execution #93952
craig[bot] merged 1 commit into cockroachdb:master from
Conversation
yuzefovich
left a comment
Nice! Can you share some testing results (manual, or maybe that single test that is disabled for tenants)?
Reviewed 12 of 12 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)
-- commits line 6 at r1:
nit: it'd be good to explicitly mention the goroutines that SQL execution uses but which are currently not tracked by this work (since those should be relatively cheap). You mentioned on slack the async streamer goroutines, but I think the main goroutine of the hash router (colflow.HashRouter.Run) as well as the outbox (colrpc.Outbox.Run) are not tracked.
pkg/sql/colflow/stats.go line 111 at r1 (raw file):
```go
bic.ctx = ctx
bic.stopwatch.Start()
bic.cpuStopwatch.Start()
```
nit: do you think it'd be worth extracting an abstraction in this package that combines the two stop watches?
pkg/sql/execinfrapb/component_stats.proto line 156 at r1 (raw file):
```proto
optional util.optional.Uint consumed_r_u = 4 [(gogoproto.nullable) = false];
// CPU time spent executing the component.
optional util.optional.Duration c_p_u_time = 5 [(gogoproto.nullable) = false];
```
nit: you could use something like optional util.optional.Duration cpu_time = 5 [(gogoproto.nullable) = false, (gogoproto.customname) = "CPUTime"] to get the same name. This seems a bit cleaner, but up to you (ditto for consumed_ru).
pkg/util/timeutil/cpustopwatch.go line 21 at r1 (raw file):
```go
// CPUStopWatch is a utility stop watch for measuring CPU time spent by a
// component. It can be safely started and stopped multiple times, but is
// not safe to use concurrently. If CPUStopWatch is nil, all operations are
```
Hm, I think we needed the concurrency safety at some point, and I can't recall a PR that changed that. Do you think this is no longer needed?
Here's explain analyze output for a simple join query:
DrewKimball
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: it'd be good to explicitly mention the goroutines that SQL execution uses but which are currently not tracked by this work (since those should be relatively cheap). You mentioned on slack the async streamer goroutines, but I think the main goroutine of the hash router (colflow.HashRouter.Run) as well as the outbox (colrpc.Outbox.Run) are not tracked.
Done.
pkg/sql/colflow/stats.go line 111 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: do you think it'd be worth to extract an abstraction in this package that would combine two of the stop watches?
I embedded cpuStopWatch in StopWatch, let me know what you think.
pkg/sql/execinfrapb/component_stats.proto line 156 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: you could use something like optional util.optional.Duration cpu_time = 5 [(gogoproto.nullable) = false, (gogoproto.customname) = "CPUTime"] to get the same name. This seems a bit cleaner, but up to you (ditto for consumed_ru).
Agreed, done.
pkg/util/timeutil/cpustopwatch.go line 21 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Hm, I think we needed the concurrency safety at some point, and I can't recall a PR that changed that. Do you think this is no longer needed?
I didn't think it would be necessary for the vectorized stats collector, but didn't think it through very deeply. Either way, it's behind a mutex now.
yuzefovich
left a comment
It's nice to see that the execution time (the wall time) is slightly above the CPU time.
Reviewed 9 of 9 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
pkg/sql/colflow/stats.go line 111 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
I embedded cpuStopWatch in StopWatch, let me know what you think.
👍
j82w
left a comment
Reviewed all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball)
pkg/sql/opt/exec/explain/output.go line 366 at r2 (raw file):
```go
// AddCPUTime adds a top-level field for the cumulative cpu time spent by SQL
// execution.
func (ob *OutputBuilder) AddCPUTime(cpuTime time.Duration) {
```
Please add at least an e2e test verifying the output contains a valid CPU time.
Force-pushed from 2880abb to faa97cc
yuzefovich
left a comment
Reviewed 9 of 21 files at r3, 14 of 14 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @j82w)
-- commits line 20 at r4:
I'm confused about the mention of kvBatchFetcherHelper here - there are no modifications around that struct in this PR, and also the struct is not concerned with the time measurement on master either.
pkg/sql/instrumentation.go line 567 at r4 (raw file):
```go
ob.AddNetworkStats(queryStats.NetworkMessages, queryStats.NetworkBytesSent)
ob.AddMaxDiskUsage(queryStats.MaxDiskUsage)
if !ih.containsMutation && ih.vectorized && grunning.Supported() {
```
Remind me please why we're only doing this for vectorized plans? Due to not measuring the total execution time of a row-by-row processor?
pkg/sql/colflow/stats.go line 276 at r4 (raw file):
```go
// In order to account for SQL CPU time, we have to subtract the CPU time
// spent while serving KV requests on a SQL goroutine.
cpuTime -= vsc.kvReader.GetKVCPUTime()
```
I wonder whether we should also expose this new KVCPUTime measurement.
Currently, on master we track a single total "execution" time of an operator, and then for the TableReader processor we show it as "KV time" whereas for all other readers as "execution time". Now that we also precisely track the actual "KV time", we can be more precise with the terminology. As is, with the current change we have that "KV time" for ColBatchScans/TableReaders includes the "sql cpu time". I think it'd be better to hide the total execution time and, instead, show the KV time and the SQL CPU time (which, in theory, should add up to the total "execution time"). I'm under the impression that we can assume the grunning library is supported, right? WDYT? (There is some context in a comment in the code above.)
pkg/sql/colflow/stats.go line 280 at r4 (raw file):
```go
	s.Exec.ExecTime.Set(time)
}
if cpuTime > 0 && grunning.Supported() {
```
nit: should we add an assertion that cpuTime is non-negative?
pkg/sql/execstats/traceanalyzer.go line 382 at r4 (raw file):
} for _, cpuTime := range a.nodeLevelStats.CPUTimeGroupedByNode {
nit: it might be good to modify TestQueryLevelStatsAccumulate to also test CPUTime.
pkg/sql/opt/exec/explain/output.go line 366 at r2 (raw file):
Previously, j82w (Jake) wrote…
Please add at least an e2e test verifying the output contains a valid CPU time.
+1 we should have at least some sanity check that a positive CPU time value is measured.
Also, can the test introduced as skipped in #89256 be now unskipped? My understanding is that now the RU estimate should be more precise, or perhaps we could incorporate an adjustment to the error margin depending on the measured CPU time?
pkg/util/timeutil/cpustopwatch.go line 21 at r4 (raw file):
```go
// CPUStopWatch is a wrapper around cpuStopWatch that is safe to use
// concurrently. If CpuStopWatch is nil, all operations are no-ops and no
```
nit: s/CpuStopWatch/CPUStopWatch/.
DrewKimball
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @j82w and @yuzefovich)
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I'm confused about the mention of kvBatchFetcherHelper here - there are no modifications around that struct in this PR, and also the struct is not concerned with the time measurement on master either.
Sorry, that's a remnant from an earlier iteration. Done.
pkg/sql/instrumentation.go line 567 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Remind me please why we're only doing this for vectorized plans? Due to not measuring the total execution time of a row-by-row processor?
It's because we measure the CPU time using vectorizedStatsCollectors. We can still measure it for row-wise processors; they just need to be wrapped in a columnarizer + statsCollector. I've expanded the comment.
pkg/sql/colflow/stats.go line 276 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
I wonder whether we should also expose this new KVCPUTime measurement.
Currently, on master we track a single total "execution" time of an operator, and then for the TableReader processor we show it as "KV time" whereas for all other readers as "execution time". Now that we also precisely track the actual "KV time", we can be more precise with the terminology. As is, with the current change we have that "KV time" for ColBatchScans/TableReaders includes the "sql cpu time". I think it'd be better to hide the total execution time and, instead, show the KV time and the SQL CPU time (which, in theory, should add up to the total "execution time"). I'm under the impression that we can assume the grunning library is supported, right? WDYT? (There is some context in a comment in the code above.)
KVCPUTime only tracks the local KV CPU time - any CPU time spent on a remote node / separate goroutine is omitted. It seems like it would be confusing to display that as-is, what do you think? Do you think I should change the name of the field to make that more clear?
pkg/sql/colflow/stats.go line 280 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: should we add an assertion that cpuTime is non-negative?
Done.
pkg/sql/execstats/traceanalyzer.go line 382 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: it might be good to modify TestQueryLevelStatsAccumulate to also test CPUTime.
Good catch, done.
pkg/sql/opt/exec/explain/output.go line 366 at r2 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
+1 we should have at least some sanity check that a positive CPU time value is measured.
Also, can the test introduced as skipped in #89256 be now unskipped? My understanding is that now the RU estimate should be more precise, or perhaps we could incorporate an adjustment to the error margin depending on the measured CPU time?
Added an end-to-end test that verifies positive values for queries that output CPU time, and no output for queries with mutations.
This PR doesn't refactor the RU estimation to use grunning yet, and I think that should wait until I get it working with mutations.
pkg/util/timeutil/cpustopwatch.go line 21 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: s/CpuStopWatch/CPUStopWatch/.
Done.
yuzefovich
left a comment
Reviewed 9 of 9 files at r5, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @j82w)
pkg/sql/instrumentation.go line 567 at r4 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
It's because we measure the CPU time using vectorizedStatsCollectors. We can still measure it for row-wise processors; they just need to be wrapped in a columnarizer + statsCollector. I've expanded the comment.
Ok, so effectively we haven't added the necessary execution statistics collection for purely row-by-row plans since it's not important enough. Consider leaving a TODO still even if we don't plan to add it immediately.
pkg/sql/colflow/stats.go line 276 at r4 (raw file):
It seems like it would be confusing to display that as-is, what do you think?
Yeah, fair point.
IIUC if we were to extend the CPU stop watches (that we are introducing in this PR to measure the on-CPU KV time on the current goroutine) to also track the wall time, that wall time would be a proper KV time of an operator, right?
What I'm driving at is that KV time of TableReader on the DistSQL diagram now includes sql cpu time, which can be confusing. It'd be nice to clarify this. Before this change, we only measured the wall "execution" time of the operator (which would be shown as execution time for non-readers and KV time for readers). This change introduces sql cpu time, which measures on-core CPU time at the SQL level of an operator, excluding on-core KV time if any. This sql cpu time is included in the "execution time".
I think as the next step it'd be nice to explicitly measure the KV wall time a reader spent waiting for a KV request to evaluate (regardless of whether it was done remotely or locally, in the same goroutine or not). We will then show this new "KV time" statistic as KV time on the diagram while showing "execution time" as execution time for all operators (i.e. we'd change it for KV readers), and it'll be such that KV (wall) time + sql cpu time <= execution (wall) time. Does this make sense?
It doesn't have to be in this PR but seems like a nice improvement with very little work after this change is done.
pkg/sql/opt/exec/explain/output_test.go line 179 at r5 (raw file):
```go
serverutils.InitTestServerFactory(server.TestServerFactory)
tc := testcluster.StartTestCluster(t, numNodes, testClusterArgs)
```
nit: I think whenever we start a test cluster, we add

```go
defer leaktest.AfterTest(t)()
defer log.Scope(t).Close(t)
```

to the top of the test.
DrewKimball
left a comment
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @j82w and @yuzefovich)
pkg/sql/instrumentation.go line 567 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Ok, so effectively we haven't added the necessary execution statistics collection for purely row-by-row plans since it's not important enough. Consider leaving a TODO still even if we don't plan to add it immediately.
Added a TODO to lift the restrictions on pure row-by-row plans and plans with mutations.
pkg/sql/colflow/stats.go line 276 at r4 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
It seems like it would be confusing to display that as-is, what do you think?
Yeah, fair point.
IIUC if we were to extend the CPU stop watches (that we are introducing in this PR to measure the KV on CPU time on the current goroutine) to also track the wall time, that wall time would be a proper KV time of an operator, right?
What I'm driving at is that KV time of TableReader on the DistSQL diagram now includes sql cpu time which can be confusing. It'd be nice to clarify this. Before this change, we only measured the wall "execution" time of the operator (which would be shown as execution time for non-readers and KV time for readers). This change introduces sql cpu time which measures on-core CPU time at the SQL level of an operator, excluding on-core KV time if any. This sql cpu time is included in the "execution time".
I think as the next step it'd be nice to explicitly measure the KV wall time a reader spent waiting for a KV request to evaluate (regardless of whether it was done remotely or locally, in the same goroutine or not). We will then show this new "KV time" statistic as KV time on the diagram while showing "execution time" as execution time for all operators (i.e. we'd change it for KV readers), and it'll be such that KV (wall) time + sql cpu time <= execution (wall) time. Does this make sense?
It doesn't have to be in this PR but seems like a nice improvement with very little work after this change is done.
That sounds like a good idea. I'll add a TODO.
pkg/sql/opt/exec/explain/output_test.go line 179 at r5 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
nit: I think whenever we start a test cluster, we add

```go
defer leaktest.AfterTest(t)()
defer log.Scope(t).Close(t)
```

to the top of the test.
Done.
Force-pushed from 4508294 to 2373476
This commit adds tracking for CPU time spent during SQL execution. The CPU time is tracked at the operator granularity when statistics collection is enabled, similar to execution time. For now, the CPU time is only surfaced in the output of `EXPLAIN ANALYZE` variants. A future PR will add support for logging this value in the statement statistics.

Note that the main goroutines of the streamer, hash router, and outbox are not currently tracked by this work. However, it is expected that these should be relatively cheap, and shouldn't significantly impact the measurement.

Additionally, KV work is performed on a SQL goroutine in some cases (e.g. when there is a single-range request for a local range). This makes it necessary to track CPU time spent fulfilling KV requests on a SQL goroutine so it can be subtracted from the total measured CPU time. This logic is handled by the `cFetcher` and `rowFetcherStatCollector` for the operators that only perform reads.

Finally, because mutations do not record stats, they currently have no way to differentiate KV CPU time from SQL CPU time. For this reason, a plan that contains mutations will not output CPU time.

Informs: cockroachdb#87213

Release note (sql change): CPU time spent during SQL execution is now visible in the output of queries run with `EXPLAIN ANALYZE`. This measure does not include CPU time spent while serving KV requests, and CPU time is not shown for queries that perform mutations or for plans that aren't vectorized. This can be useful for diagnosing performance issues and optimizing SQL queries.
yuzefovich
left a comment
Reviewed 6 of 6 files at r6, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @j82w)
TFTR! bors r+

Build failed (retrying...):

Build succeeded:
This commit adds tracking for CPU time spent during SQL execution. The CPU time is tracked at the operator granularity when statistics collection is enabled, similar to execution time.
For now, the CPU time is only surfaced in the output of `EXPLAIN ANALYZE` variants. A future PR will add support for logging this value in the statement statistics.

Informs: #87213

Release note (sql change): CPU time spent during SQL execution is now visible in the output of queries run with `EXPLAIN ANALYZE`. This measure does not include CPU time spent while serving KV requests. This can be useful for diagnosing performance issues and optimizing SQL queries.