
admission: Split stats for WorkQueueMetrics#88076

Merged
craig[bot] merged 1 commit into cockroachdb:master from andrewbaptist:work_queue_stats on Sep 27, 2022

Conversation

@andrewbaptist

@andrewbaptist andrewbaptist commented Sep 16, 2022

This commit splits the WorkQueueMetrics stats by priority.
Because a large number of new stats could be generated as a
result of this split, only the "important" priorities are
collected and considered for each queue.

Release note: None

@cockroach-teamcity
Member

This change is Reviewable

@andrewbaptist andrewbaptist force-pushed the work_queue_stats branch 15 times, most recently from f4ed0ac to 07bf2d7 on September 21, 2022 19:35
@andrewbaptist andrewbaptist force-pushed the work_queue_stats branch 3 times, most recently from 2389725 to 905c884 on September 22, 2022 15:49
@andrewbaptist andrewbaptist marked this pull request as ready for review September 22, 2022 17:39
@andrewbaptist andrewbaptist requested a review from a team as a code owner September 22, 2022 17:39
@andrewbaptist andrewbaptist requested a review from a team September 22, 2022 17:39
@andrewbaptist andrewbaptist requested a review from a team as a code owner September 22, 2022 17:39
Collaborator

@sumeerbhola sumeerbhola left a comment

Reviewed 5 of 9 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @irfansharif)


pkg/server/server.go line 222 at r1 (raw file):

		admissionOptions.Override(opts)
	}
	admissionOptions.Settings = st

why this change to pull out settings from the options?


pkg/ts/catalog/chart_catalog.go line 3473 at r1 (raw file):

					"admission.errored.kv.0",
					"admission.errored.kv.100",
					"admission.errored.sql-kv-response.0",

is this 0 and 100 the priority values?
I am wondering why we've picked these to populate here and not the others in the admissionpb.WorkPriority enum.


pkg/util/admission/grant_coordinator.go line 158 at r1 (raw file):

	// This is IO work, so override the usesTokens value.
	opts.usesTokens = true
	// TODO(sumeer): add per-store WorkQueue state for debug.zip and db console.

this revised code is also sharing the WorkQueue metrics across all stores, yes?
This would be a good thing to fix in a followup PR.


pkg/util/admission/grant_coordinator.go line 970 at r1 (raw file):

// Prevent the linter from emitting unused warnings.
var _ = NewGrantCoordinatorSQL

why is this removed?


pkg/util/admission/work_queue.go line 1598 at r1 (raw file):

	// TODO(abaptist): This is done to pre-register stats. Need to check that we
	// getOrCreate "enough" of the priorities to be useful.
	for _, pri := range applicablePriorities {

if the lazy creation is working correctly, this eager creation seems unnecessary. Is it needed for the list in chart_catalog?


pkg/kv/kvserver/client_split_test.go line 1224 at r1 (raw file):

			// Set maxBytes to something small so we can exceed the maximum split
			// size without adding 2x64MB of data.
			const maxBytes = 1 << 17

why this change?

Author

@andrewbaptist andrewbaptist left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @sumeerbhola)


pkg/server/server.go line 222 at r1 (raw file):

Previously, sumeerbhola wrote…

why this change to pull out settings from the options?

There was inconsistency in NewGrantCoordinator in terms of how it processed the settings. This wasn't strictly necessary; however, the old structure made it harder to follow where the settings vs. options came from. The goal was really to have more immutable options and settings coming through this code to make reasoning and testing easier. I don't mind reverting this part of the patch (or pulling it into another commit).


pkg/ts/catalog/chart_catalog.go line 3473 at r1 (raw file):

Previously, sumeerbhola wrote…

is this 0 and 100 the priority values?
I am wondering why we've picked these to populate here and not the others in the admissionpb.WorkPriority enum.

These were the most commonly used values (0, 100, and -30). Once lazy loading is in place, ideally this won't be necessary.


pkg/util/admission/grant_coordinator.go line 158 at r1 (raw file):

Previously, sumeerbhola wrote…

this revised code is also sharing the WorkQueue metrics across all stores, yes?
This would be a good thing to fix in a followup PR.

This comment seemed out of place after this change; the metrics are now more obviously inserted as part of makeStoreRequesterFunc.


pkg/util/admission/grant_coordinator.go line 970 at r1 (raw file):

Previously, sumeerbhola wrote…

why is this removed?

Not removed, just moved to right before the method to make it more obvious what it refers to. The NewGrantCoordinatorSQL method will need a little work before it is enabled; it took me a while to understand how it worked (since it repeats metric names) because I didn't initially see that it was unused.


pkg/util/admission/work_queue.go line 1598 at r1 (raw file):

Previously, sumeerbhola wrote…

if the lazy creation is working correctly, this eager creation seems unnecessary. Is it needed for the list in chart_catalog?

Lazy creation works correctly from the metrics perspective and the metrics get registered; unfortunately, the registry does not currently handle lazy addition (see https://cockroachlabs.slack.com/archives/C01CNRP6TSN/p1663779962687259 )


pkg/kv/kvserver/client_split_test.go line 1224 at r1 (raw file):

Previously, sumeerbhola wrote…

why this change?

Two tests started failing after the change to stats. This test initializes a system with a greatly reduced maxBytes and then attempts to create a maxBytes-sized value and check what happens on split. With the new stats, we push the metrics value over the 64 KB limit, so this test fails because the metrics range also gets added to the split queue. The 1 << 16 was somewhat arbitrary, so this change simply makes the limit large enough that the metrics range is no longer included.

Collaborator

@sumeerbhola sumeerbhola left a comment

:lgtm:

Reviewed 4 of 9 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist, @irfansharif, and @sumeerbhola)


-- commits line 7 at r1:
Please add something like
Informs #82743


pkg/server/server.go line 222 at r1 (raw file):

Previously, andrewbaptist (Andrew Baptist) wrote…

There was inconsistency in the NewGrantCoordinator in terms of how it processed the settings. This wasn't strictly necessary, however, it made it harder to follow the code in terms of where the settings vs options came from. The goal was really to have more immutable options and settings coming through this code to make reasoning and testing easier. I don't mind reverting this part of the patch (or pulling into another commit).

btw, I don't think we have a principle regarding separating immutable and mutable configuration -- we tend to pile them all into Config/Options structs. Maybe kv code is principled in this regard, in which case I'd also like to know what the idiom is. One example is base.StorageConfig which includes cluster.Settings.

But I am ok with your change.


pkg/ts/catalog/chart_catalog.go line 3473 at r1 (raw file):

Previously, andrewbaptist (Andrew Baptist) wrote…

These were the values that were most commonly used (0, 100 and -30). Once lazy loading is in place in the future, ideally this won't be necessary to do.

So will we not see metrics for -30, -50, -100 for kv, kv-stores etc. given that the lazy registration does not work?
If yes, we should eagerly register them. One of the reasons we wanted this change is to use it when diagnosing workloads where we have user-facing work (typically priority 0) and various bulk operations running concurrently (the latter will use -30 and -100).


pkg/util/admission/grant_coordinator.go line 158 at r1 (raw file):

Previously, andrewbaptist (Andrew Baptist) wrote…

This comment seemed out of place after this change; the metrics are now more obviously inserted as part of makeStoreRequesterFunc.

nit: makeStoreRequesterFunc is accepting metrics created in makeStoresGrantCoordinators and not creating them itself, so we need a TODO comment about per-store metrics somewhere in StoreGrantCoordinators. See my comment below about a better place.


pkg/util/admission/grant_coordinator.go line 401 at r1 (raw file):

	// These metrics are shared across all stores and all
	// priorities across stores (even the coarser workClasses, which are a
	// mapping from priority, share the same metrics).

Comment needs updating since the "all priorities" part is stale now. Can you add a TODO here about per-store metrics?


pkg/util/admission/grant_coordinator.go line 970 at r1 (raw file):

Previously, andrewbaptist (Andrew Baptist) wrote…

Not removed, just moved up to right before the method to make it more obvious what it was referring to. The NewGrantCoordinatorSQL method will need a little work before it is enabled, and it took me a while to understand how it worked (since it was repeating metrics names) as I didn't see that it was unused.

Ack

"admission.admitted.kv.0",
"admission.admitted.kv.100",
"admission.admitted.sql-kv-response.0",
"admission.admitted.sql-kv-response.100",
Contributor

Instead of the numerical suffix, can we use the name of priority level instead? I can see us needing to slot in other priorities numerically without wanting that to reflect in different metric names/updating UI/warranting a release note. It's also easier to understand what these specific metrics represent without needing to look at code.

case "init-store-grant-coordinator":
clearRequesterAndCoord()
metrics := makeGrantCoordinatorMetrics()
workQueuMetrics := makeWorkQueueMetrics("", registry)
Contributor

Typo.

// getOrCreate will return the metric if it exists or create it and then return
// it if it didn't previously exist.
func (m *WorkQueueMetrics) getOrCreate(priority admissionpb.WorkPriority) workQueueMetricsSingle {
// Try loading from the map first. This will not fin
Contributor

Incomplete sentence.

statPrefix := fmt.Sprintf("%v.%v", m.name, priority)
val, ok = m.byPriority.LoadOrStore(priority, makeWorkQueueMetricsSingle(statPrefix))
if !ok {
m.registry.AddMetricStruct(val)
Contributor

@irfansharif irfansharif Sep 26, 2022

With this lazy registration -- do these metrics show up in CRDB's internal timeseries database? I'm asking whether it works with everything under pkg/server/status/recorder.go since a lot of that code was written with upfront registration in mind.

}
// TODO(abaptist): This is done to pre-register stats. Need to check that we
// getOrCreate "enough" of the priorities to be useful.
for _, pri := range applicablePriorities {
Contributor

So it seems we're doing pre-registration here anyway. Do we need the LoadOrStore above when finding the right metric? If everything is properly pre-registered, shouldn't we fatal if we don't find something in the sync.Map?

Author

@andrewbaptist andrewbaptist left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif and @sumeerbhola)


-- commits line 7 at r1:

Previously, sumeerbhola wrote…

Please add something like
Informs #82743

Done


pkg/ts/catalog/chart_catalog.go line 3468 at r1 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Instead of the numerical suffix, can we use the name of priority level instead? I can see us needing to slot in other priorities numerically without wanting that to reflect in different metric names/updating UI/warranting a release note. It's also easier to understand what these specific metrics represent without needing to look at code.

Done


pkg/ts/catalog/chart_catalog.go line 3473 at r1 (raw file):

Previously, sumeerbhola wrote…

So will we not see metrics for -30, -50, -100 for kv, kv-stores etc. given that the lazy registration does not work?
If yes, we should eagerly register them. One of the reasons we wanted this change is to use it when diagnosing workloads where we have user-facing work (typically priority 0) and various bulk operations running concurrently (the latter will use -30 and -100).

Done. I added some additional stats while still trying to avoid inflating the total number of stats. Missing a few stats isn't too bad because they will fall into the "default" bucket.


pkg/util/admission/grant_coordinator.go line 158 at r1 (raw file):

Previously, sumeerbhola wrote…

nit: makeStoreRequesterFunc is accepting metrics created in makeStoresGrantCoordinators and not creating them itself, so we need a TODO comment about per-store metrics somewhere in StoreGrantCoordinators. See my comment below about a better place.

Done


pkg/util/admission/grant_coordinator.go line 401 at r1 (raw file):

Previously, sumeerbhola wrote…

Comment needs updating since the "all priorities" part is stale now. Can you add a TODO here about per-store metrics?

Done


pkg/util/admission/grant_coordinator.go line 970 at r1 (raw file):

Previously, sumeerbhola wrote…

Ack

Done


pkg/util/admission/work_queue.go line 1526 at r1 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Incomplete sentence.

Done


pkg/util/admission/work_queue.go line 1537 at r1 (raw file):

Previously, irfansharif (irfan sharif) wrote…

With this lazy registration -- do these metrics show up in CRDB's internal timeseries database? I'm asking whether it works with everything under pkg/server/status/recorder.go since a lot of that code was written with upfront registration in mind.

No - the lazy registration today does not feed either the internal timeseries database or Prometheus. I hope we can remove the pre-registration in the future, but I didn't dive into that code now.


pkg/util/admission/work_queue.go line 1598 at r1 (raw file):

Previously, irfansharif (irfan sharif) wrote…

So it seems we're doing pre-registration here anyway. Do we need the LoadOrStore above when finding the right metric? If everything is properly pre-registered, shouldn't we fatal if we don't find something in the sync.Map?

We don't currently pre-register everything so this shouldn't be fatal. I want to remove pre-registration in the future, but didn't want to try and do this as part of this commit. I kept LoadOrStore for that reason.


pkg/util/admission/granter_test.go line 124 at r1 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Typo.

Done

Collaborator

@sumeerbhola sumeerbhola left a comment

Reviewed 4 of 5 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andrewbaptist, @irfansharif, and @sumeerbhola)


pkg/util/admission/work_queue.go line 1529 at r3 (raw file):

	Admitted        *metric.Counter
	Errored         *metric.Counter
	WaitDurationSum *metric.Counter

You are removing this because we don't find it useful, yes?
My original motivation was to be able to look at mean latency, but it is not possible to compute delta(waitDurationSum)/delta(requested) in our internal metrics system, though it is possible with Prometheus.
I'll leave it to @irfansharif and your judgement on whether this is useful or not.

Author

@andrewbaptist andrewbaptist left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @irfansharif and @sumeerbhola)


pkg/util/admission/work_queue.go line 1529 at r3 (raw file):

Previously, sumeerbhola wrote…

You are removing this because we don't find it useful, yes?
My original motivation was to be able to look at mean latency, but it is not possible to compute delta(waitDurationSum)/delta(requested) in our internal metrics system, though it is possible with Prometheus.
I'll leave it to @irfansharif and your judgement on whether this is useful or not.

This is automatically computed as part of the WaitDurations histogram, and the values are the same, so this was a redundant stat. I'm not sure about the internal metrics, but I assume they could use the sum metric. Prometheus can definitely use the histogram counters to get the sum.

This commit splits the WorkQueueMetrics stats by priority.
Because a large number of new stats could be generated as a
result of this split, only the "important" priorities are
collected and considered for each queue.
Informs cockroachdb#82743.

Release note: None
Contributor

@irfansharif irfansharif left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andrewbaptist, @irfansharif, and @sumeerbhola)


pkg/ts/catalog/chart_catalog.go line 3463 at r4 (raw file):

					"admission.admitted.elastic-cpu",
					"admission.errored.elastic-cpu",
					"admission.admitted.elastic-cpu.bulk",

Optional nit: add the "-pri" suffix so it's even clearer.


pkg/util/admission/work_queue.go line 1529 at r3 (raw file):

Previously, andrewbaptist (Andrew Baptist) wrote…

This is automatically computed as part of the WaitDurations histogram, and the values are the same. So this was a redundant state. I'm not sure about the internal metrics, but I assume they could use the sum metric. Prometheus can definitely use the histogram counter to get the sum

n := float64(mtr.TotalCountWindowed())
fn(name+"-count", n)
avg := mtr.TotalSumWindowed() / n
if math.IsNaN(avg) || math.IsInf(avg, +1) || math.IsInf(avg, -1) {
avg = 0
}
fn(name+"-avg", avg)
for _, pt := range recordHistogramQuantiles {
fn(name+pt.suffix, mtr.ValueAtQuantileWindowed(pt.quantile))
}

The sum is missing, but the avg is computed already. So this is redundant and can be removed.

@andrewbaptist
Author

bors r=irfansharif

@craig
Contributor

craig bot commented Sep 27, 2022

Build succeeded:

@craig craig bot merged commit 471f9b6 into cockroachdb:master Sep 27, 2022
@andreimatei
Contributor

This patch increased the number of timeseries exported to tsdb by crdb nodes from 731 to 1739. If there are histograms in there, it might be even worse for Prometheus.
I don't think we can afford anything like it...

@andrewbaptist
Author

I was concerned about the impact of this change. I can easily disable all of these and will hold off on any backporting. I might need to rethink whether there is a way to get the data we want without exploding the stats count.

@andreimatei
Contributor

I think I fooled myself when counting, sorry. I now think that this patch only took the timeseries count from 1543 to 1739. So, not quite as bad as I thought. Still significant, though.

Collaborator

@sumeerbhola sumeerbhola left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/util/admission/work_queue.go line 1529 at r3 (raw file):

Previously, irfansharif (irfan sharif) wrote…

n := float64(mtr.TotalCountWindowed())
fn(name+"-count", n)
avg := mtr.TotalSumWindowed() / n
if math.IsNaN(avg) || math.IsInf(avg, +1) || math.IsInf(avg, -1) {
avg = 0
}
fn(name+"-avg", avg)
for _, pt := range recordHistogramQuantiles {
fn(name+pt.suffix, mtr.ValueAtQuantileWindowed(pt.quantile))
}

The sum is missing, but the avg is computed already. So this is redundant and can be removed.

The histogram reporting is limited to work that actually waited in the queue. So if a large fraction did not wait in the queue, it will overreport the latency percentiles. We should fix that by reporting everything. I did not do this when writing the first version of this code, because I was worried about lock contention overhead in the histogram and did not have time to investigate how much of a problem it really was.

dhartunian added a commit to dhartunian/cockroach that referenced this pull request Nov 8, 2022
Previously, `AggHistogram` instances would not persist their quantiles to tsdb
due to a missing interface implementation of `metrics.WindowedHistogram`. This
PR adds a trivial implementation that delegates to the aggregate histogram
instance within the struct.

This is relatively safe to do even though an `AggHistogram` could contain many
children because we are only exporting a single set of aggregate quantiles
per-`AggHistogram`. The children are only iterated over via the
`PrometheusIterable` interface which is used by the prometheus exporter, but
not by the metrics recorder.

NOTE: This backported commit includes a test fix to the `kv` package that was
lifted from cockroachdb#88076.

Release note (bug fix, ops change): Previously, certain aggregate histograms
would appear in `_status/vars` but not be available for graphing in the DB
Console. These are now made available. They include changefeed-related
histograms, and row-level-TTL histograms.

Epic: None
@andrewbaptist andrewbaptist deleted the work_queue_stats branch December 15, 2023 21:33
