Feature Request: query counts and timings by tablet type #13546
Description
Feature Description
I propose that we add two new metrics that mirror QueryCount and QueryTimings except they have a tablet_type label.
Use Case(s)
It would be useful for us to be able to see some essential stats broken down by tablet type. It's nice, for example, to know the number of queries executed on your primary vs replicas when monitoring overall system health.
We're currently doing this by adding a tablet type label downstream in our metrics pipeline, but this is inaccurate when the tablet type changes as a result of reparenting, because the underlying counters are not reset. After a reparent, we effectively see a new series that starts at the existing counts, so PromQL functions based on a derivative (e.g. increase or rate) report huge values for the first time bucket. Labeling metrics with tablet_type at the source avoids this problem because there is a distinct counter for each tablet type.
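The first-bucket spike can be illustrated with a small Go sketch. It is a simplified model, not PromQL itself: it assumes a backend that treats the first sample of a newly appearing series as an increase from zero, and the function name `bucketIncreases` and all sample values are made up for illustration.

```go
package main

import "fmt"

// bucketIncreases returns the per-bucket increase of a cumulative counter.
// Assumption: the first sample of a new series is counted as an increase
// from zero, which is what produces the spike described above.
func bucketIncreases(samples []float64) []float64 {
	out := make([]float64, 0, len(samples))
	prev := 0.0
	for _, v := range samples {
		out = append(out, v-prev)
		prev = v
	}
	return out
}

func main() {
	// Downstream labeling: the process-wide counter was already above 1100
	// when the tablet became a primary, so the new "primary" series starts
	// at the existing count.
	downstream := []float64{1200, 1300, 1400}

	// Source labeling: the primary counter only counts queries served as
	// a primary, so it starts near zero after the reparent.
	source := []float64{100, 200, 300}

	fmt.Println(bucketIncreases(downstream))
	fmt.Println(bucketIncreases(source))
}
```

In this model the downstream-labeled series attributes the entire pre-reparent count to the first bucket, while the source-labeled counter shows a steady increase.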
To avoid breaking changes, we can add new counters that carry the tablet_type label instead of adding the label to the existing metrics.
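A minimal sketch of that approach, using only the Go standard library rather than the project's own stats package. The type `queryStats`, its fields, and the `record` method are hypothetical names chosen for illustration; the existing metric is modeled here as a single unlabeled total, which is an assumption about its shape.

```go
package main

import (
	"fmt"
	"sync"
)

// queryStats is a hypothetical stand-in for the tablet's query metrics.
type queryStats struct {
	mu sync.Mutex
	// total models the existing counter and is left unchanged.
	total int64
	// byTabletType is the new counter, keyed by the tablet_type label value.
	byTabletType map[string]int64
}

func newQueryStats() *queryStats {
	return &queryStats{byTabletType: make(map[string]int64)}
}

// record counts one query against both the existing total and the
// counter for the tablet type the query ran under.
func (s *queryStats) record(tabletType string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.total++
	s.byTabletType[tabletType]++
}

// snapshot returns the total and a copy of the per-type counts.
func (s *queryStats) snapshot() (int64, map[string]int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	counts := make(map[string]int64, len(s.byTabletType))
	for k, v := range s.byTabletType {
		counts[k] = v
	}
	return s.total, counts
}

func main() {
	s := newQueryStats()
	s.record("replica")
	s.record("replica")
	// After a reparent the same process records under a different label,
	// so the "primary" counter starts from zero instead of inheriting 2.
	s.record("primary")

	total, counts := s.snapshot()
	fmt.Println(total, counts["replica"], counts["primary"])
}
```

Because the existing counter keeps its name and shape, dashboards and alerts built on it continue to work, and the per-type series can be adopted separately.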