kvserver: request stage metrics

**Is your feature request related to a problem? Please describe.**

As a `BatchRequest` arrives at a `Store` and gets processed by a `Replica`, it passes through multiple [stages of execution](https://github.com/cockroachdb/cockroach/blob/4907da5c1b5ad9f257dba52aef42e5da653290e4/pkg/kv/kvserver/replica_send.go#L42-L99).

Currently, all of the KV latency metrics we have include time spent on contention, so we can never conclusively blame or exonerate KV for perceived slowness of an aggregate workload. It is possible that there is contention, in which case the KV latency metrics are expected to be elevated. But it's also possible that the slowness arises from, say, an overloaded Pebble instance.

In effect, one needs to check all possible sources of slowness individually, but there is no easy way to do so since we don't break down how a request spends its time. For example, we don't track how long a request takes to evaluate, which is mostly a function of CPU and I/O. Slow evaluation leads to slow latching and more contention, etc., so it is really slow evaluation that one would want to know about first. But we don't track at that level of granularity.

**Describe the solution you'd like**

A breakdown of the phases is described in this comment:

https://github.com/cockroachdb/cockroach/blob/4907da5c1b5ad9f257dba52aef42e5da653290e4/pkg/kv/kvserver/replica_send.go#L42-L99

I'd like to standardize on a set of phases that we are going to measure, and ideally we always measure them (regardless of tracing) so that we can populate metrics. Some of the phases are going to be very straightforward (for example, time spent waiting for admission control, time spent latching, time spent evaluating) while others are more subtle (time spent replicating is hard to set up since most requests return early, before they are replicated). We don't have to get everything sorted out in the first pass, but we should do the obvious phases, with everything else falling into an "unaccounted" bucket. Properly building an unaccounted bucket is somewhat annoying, but I think we don't really have to, since we have the existing metric that tracks the entire duration:

https://github.com/cockroachdb/cockroach/blob/32622e1b18030bd52529841ec1bb280a5683d5cb/pkg/server/node.go#L1014

so (at least in Prometheus) we can subtract all of the phases from that metric to get a good idea of whether the "time between phases" is significant.

**Describe alternatives you've considered**

**Additional context**

#71169 E2E latency
https://github.com/cockroachdb/cockroach/issues/82203 extension of this issue to also record the stage latencies on a per-request basis.


Jira issue: CRDB-16246

	// Send executes a command on this range, dispatching it to the
	// read-only, read-write, or admin execution path as appropriate.
	// ctx should contain the log tags from the store (and up).
	//
	// A rough schematic for the path requests take through a Replica
	// is presented below, with a focus on where requests may spend
	// most of their time (once they arrive at the Node.Batch endpoint).
	//
	//                                 DistSender (tenant)
	//                                      │
	//                                      ┆ (RPC)
	//                                      │
	//                                      ▼
	//                                 Node.Batch (host cluster)
	//                                      │
	//                                      ▼
	//                              Admission control
	//                                      │
	//                                      ▼
	//                                Replica.Send
	//                                      │
	//                               Circuit breaker
	//                                      │
	//                                      ▼
	//                 Replica.maybeBackpressureBatch (if Range too large)
	//                                      │
	//                                      ▼
	//                 Replica.maybeRateLimitBatch (tenant rate limits)
	//                                      │
	//                                      ▼
	//                 Replica.maybeCommitWaitBeforeCommitTrigger (if committing with commit-trigger)
	//                                      │
	// read-write ◄─────────────────────────┴────────────────────────► read-only
	//     │                                                               │
	//     │                                                               │
	//     ├─────────────► executeBatchWithConcurrencyRetries ◄────────────┤
	//     │               (handles leases and txn conflicts)              │
	//     │                                                               │
	//     ▼                                                               │
	// executeWriteBatch                                                   │
	//     │                                                               │
	//     ▼                                                               ▼
	// evalAndPropose         (turns the BatchRequest            executeReadOnlyBatch
	//     │                   into pebble WriteBatch)
	//     │
	//     ├──────────────────► (writes that can use async consensus do not
	//     │                     wait for replication and are done here)
	//     │
	//     ├──────────────────► maybeAcquireProposalQuota
	//     │                    (applies backpressure in case of
	//     │                     lagging Raft followers)
	//     │
	//     │
	//     ▼
	// handleRaftReady        (drives the Raft loop, first appending to the log
	//                         to commit the command, then signaling proposer and
	//                         applying the command)
	func (r *Replica) Send(

kvserver: request stage metrics #82200

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

kvserver: request stage metrics #82200

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions