As seen in this comment, the metrics that we update frequently (e.g. the txn metrics) have a measurable cost, probably because of their internal mutual exclusion. In particular, the histograms seem to be expensive.
The PR linked above avoids updating a transaction restarts histogram in the common case when transactions don't restart. Such histograms would benefit from an optimized path for an "expected value" where they should be as cheap as a counter.
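To make the "expected value" idea concrete, here is a rough sketch, not the actual metric package code. The type `fastPathHistogram` and its methods are hypothetical names made up for illustration, and the map stands in for real histogram buckets. Recording the expected value is a single atomic add; only other values take the mutex. It assumes Go 1.19+ for `atomic.Int64`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// fastPathHistogram is a hypothetical type for illustration only.
// Recording the expected value costs one atomic add, like a counter.
type fastPathHistogram struct {
	expected      int64        // value handled on the fast path
	expectedCount atomic.Int64 // how many times expected was recorded

	mu     sync.Mutex
	others map[int64]int64 // stand-in for real histogram buckets
}

func newFastPathHistogram(expected int64) *fastPathHistogram {
	return &fastPathHistogram{expected: expected, others: make(map[int64]int64)}
}

// RecordValue takes the lock only when v is not the expected value.
func (h *fastPathHistogram) RecordValue(v int64) {
	if v == h.expected {
		h.expectedCount.Add(1)
		return
	}
	h.mu.Lock()
	h.others[v]++
	h.mu.Unlock()
}

// Snapshot merges the fast-path count with the locked buckets.
// The two parts are read separately, so the result is not a single
// atomic view under concurrent writes.
func (h *fastPathHistogram) Snapshot() map[int64]int64 {
	out := make(map[int64]int64)
	h.mu.Lock()
	for v, n := range h.others {
		out[v] = n
	}
	h.mu.Unlock()
	if n := h.expectedCount.Load(); n > 0 {
		out[h.expected] = n
	}
	return out
}

func main() {
	h := newFastPathHistogram(0) // assume most txns restart 0 times
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 250; j++ {
				h.RecordValue(0)
			}
		}()
	}
	wg.Wait()
	h.RecordValue(3) // the rare restart case takes the slow path
	fmt.Println(h.Snapshot())
}
```

The atomic add can still contend on a shared cache line under heavy parallelism, so this only brings the histogram down to the cost of a counter, not below it.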
Beyond that, we could try techniques for eliminating mutual exclusion - like maintaining metrics per PCPU and aggregating them on demand. I think I've heard RocksDB does this for their stats.
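A minimal sketch of the sharded approach, under some loud assumptions: Go has no public API for the current P or CPU, so this version takes a caller-supplied shard hint (e.g. a worker index) instead of a true per-CPU slot. `shardedCounter`, `paddedCounter`, and the 64-byte cache line size are all illustrative assumptions, not existing code. It assumes Go 1.19+ for `atomic.Int64`.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// paddedCounter keeps each shard on its own cache line to avoid
// false sharing. The 56-byte pad assumes 64-byte cache lines.
type paddedCounter struct {
	n atomic.Int64
	_ [56]byte
}

// shardedCounter is a hypothetical type: writers touch one shard,
// readers aggregate all shards on demand.
type shardedCounter struct {
	shards []paddedCounter
}

func newShardedCounter(n int) *shardedCounter {
	if n < 1 {
		n = 1
	}
	return &shardedCounter{shards: make([]paddedCounter, n)}
}

// Inc adds delta to the shard selected by hint. The hint is a
// stand-in for a P/CPU id, which Go does not expose publicly.
func (c *shardedCounter) Inc(hint int, delta int64) {
	i := uint(hint) % uint(len(c.shards))
	c.shards[i].n.Add(delta)
}

// Sum aggregates the shards. Under concurrent writes the result is
// approximate, since shards are read one at a time.
func (c *shardedCounter) Sum() int64 {
	var total int64
	for i := range c.shards {
		total += c.shards[i].n.Load()
	}
	return total
}

func main() {
	workers := runtime.GOMAXPROCS(0)
	c := newShardedCounter(workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				c.Inc(id, 1)
			}
		}(w)
	}
	wg.Wait()
	fmt.Println(c.Sum() == int64(workers)*1000)
}
```

The trade-off is that reads get more expensive (one load per shard) and memory grows with the shard count, which seems fine for metrics that are written far more often than they are scraped.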
Jira issue: CRDB-4826