roachtest/cdc: export stats for initial scan test to roachperf#92669
roachtest/cdc: export stats for initial scan test to roachperf#92669craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
pkg/cmd/roachtest/tests/cdc.go
Outdated
There was a problem hiding this comment.
Since you have to set a window for stats collection, I decided to create a start and stop API. The idea is that you call start before starting changefeeds and call export once they finish.
977a6fe to
4f846df
Compare
pkg/cmd/roachtest/tests/cdc.go
Outdated
There was a problem hiding this comment.
What do you think about not even storing a cdcStatsExporter struct at all by just having
func (ct *cdcTester) startStatsCollection() func() {
promClient, err := clusterstats.SetupCollectorPromClient(ct.ctx, ct.cluster, ct.t.L(), cfg)
if err != nil {
ct.t.Errorf("error creating prometheus client for stats collector: %s", err)
}
statsCollector := clusterstats.NewStatsCollector(ct.ctx, promClient)
startTime = timeutil.Now()
return func() { // this returned function is then called in the test to export to roachperf
endTime := timeutil.Now()
...statsCollector.Exporter().Export(...)...
}
}There was a problem hiding this comment.
I like this better than what I had, but the reason I stored the stats exporter was because it gets created in newCDCTester > startGrafana. I just pushed a small refactor. Let me know if this is better.
There was a problem hiding this comment.
Does it need to be created there though rather than just in the function itself? Grafana can be assumed to always be running for an existing cdcTester since we only don't start it when SkipInit is set since the earlier instance should still be running. I feel like its worth just commenting on the function that this requires prometheus.
If it does need to be at the very beginning, I'd rather just have it be a property in cdcTester rather than an extra return value.
There was a problem hiding this comment.
I created it in startGrafana because we had the prometheus config in scope there. It is weird to initialize it there though. I would also prefer doing it in startStatsCollection. The only issue is that we have to store the promCfg in the cdcTester, which I think is fine.
ba63cb7 to
7c6e8ee
Compare
| cpuUsageAgg = clusterstats.AggQuery{ | ||
| Stat: cpuUsage, | ||
| Query: "avg(sys_cpu_combined_percent_normalized) * 100", | ||
| Tag: "CPU Utilization (%)", | ||
| } |
There was a problem hiding this comment.
Do you know if this is just the crdb nodes or does it also include the workload node?
There was a problem hiding this comment.
Good question. This just measures the CRDB nodes. Combined/normalized refers to normalization by number of cores. The metric description is Current user+system cpu percentage, normalized 0-1 by number of cores.
This change updates the cdc/initial_scan_only test to produce a `stats.json` artifact to be consumed by roachprod. This file contains stats for p99 foreground latency, changefeed throughput, and CPU usage. Release note: None
7c6e8ee to
ee0fa07
Compare
|
bors r+ |
|
Build succeeded: |
This change updates the cdc/initial_scan_only test to produce a
stats.jsonartifact to be consumed by roachprod. This file contains stats for p99 foreground latency, changefeed throughput, and CPU usage.Release note: None
Epic: None