
kvserver: quota pool observability #79756

@tbg

Description

Is your feature request related to a problem? Please describe.

We know of several instances in which the behavior of the quota pool was poorly understood.
The quota pool exports no metrics, and there is no way to proactively pull information out of it.

Common questions are:

  • is the quota pool throttling, and if so, how severely?
  • is a given quota pool ignoring followers, and which ones?

Describe the solution you'd like

Metrics:

  • Slow acquisitions, i.e. a Gauge that tracks all ongoing acquisitions that have
    triggered logSlowRaftProposalQuotaAcquisition.
  • Quota pool wait times, i.e. a latency histogram into which each Acquire() call records its latency. This measures how pervasively the quota pool throttles across the replicas.
  • Quota pool allocations, i.e. a bytes histogram into which each allocation (by any replicaID) is recorded. The rate of the sum effectively provides a throughput for the range, and the distribution also breaks down raft proposal sizes.
  • the number of times the quota pool ignores a follower for the purposes of proposal quota enforcement, with one counter metric (or prometheus label) for each reason:
    • follower inactive (lastUpdateTimes check)
    • no healthy conn (ConnHealth check)
    • below base index
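
A minimal sketch of how these metrics could be declared, using the standard Prometheus Go client for concreteness (CockroachDB would declare them through its internal metric registry instead; all metric names below are hypothetical):

    package kvserver

    import "github.com/prometheus/client_golang/prometheus"

    var (
        // Ongoing acquisitions that have triggered
        // logSlowRaftProposalQuotaAcquisition; Inc() when the log fires,
        // Dec() when the acquisition completes.
        slowQuotaAcquisitions = prometheus.NewGauge(prometheus.GaugeOpts{
            Name: "quota_pool_slow_acquisitions",
            Help: "Ongoing quota acquisitions above the slow-acquisition threshold.",
        })
        // Each Acquire() call records how long it waited.
        quotaWaitSeconds = prometheus.NewHistogram(prometheus.HistogramOpts{
            Name:    "quota_pool_wait_seconds",
            Help:    "Latency of quota pool Acquire() calls.",
            Buckets: prometheus.ExponentialBuckets(0.001, 2, 16),
        })
        // Each allocation records its size; rate(sum) approximates range
        // throughput, and the buckets break down raft proposal sizes.
        quotaAllocBytes = prometheus.NewHistogram(prometheus.HistogramOpts{
            Name:    "quota_pool_allocation_bytes",
            Help:    "Size of individual quota pool allocations.",
            Buckets: prometheus.ExponentialBuckets(64, 4, 10),
        })
        // Incremented each time a follower is ignored, labeled by reason.
        quotaIgnoredFollowers = prometheus.NewCounterVec(prometheus.CounterOpts{
            Name: "quota_pool_ignored_followers_total",
            Help: "Followers ignored for proposal quota enforcement, by reason.",
        }, []string{"reason"}) // follower_inactive | no_healthy_conn | below_base_index
    )

The per-reason counter would then be bumped at each check site, e.g. quotaIgnoredFollowers.WithLabelValues("follower_inactive").Inc() next to the lastUpdateTimes check.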

Observability via kvserverpb.RangeInfo:

  • ApproximateQuota() (already there)
  • Capacity
  • quota release queue length and base index
    // The base index is the index up to (including) which quota was already
    // released. That is, the first element in quotaReleaseQueue below is
    // released as the base index moves up by one, etc.
    proposalQuotaBaseIndex uint64
    // Once the leader observes a proposal come 'out of Raft', we add the size
    // of the associated command to a queue of quotas we have yet to release
    // back to the quota pool. At that point ownership of the quota is
    // transferred from r.mu.proposals to this queue.
    // We'll release the respective quota once all replicas have persisted the
    // corresponding entry into their logs (or once we give up waiting on some
    // replica because it looks like it's dead).
    quotaReleaseQueue []*quotapool.IntAlloc
  • the current result of the follower-ignore logic as applied to each follower, i.e. a slice of [replicaID, ignoreReason] pairs mirroring the metrics requested above.
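
A hypothetical Go sketch of the additional fields (the real change would extend the kvserverpb.RangeInfo proto; the names below are illustrative, not the actual definitions):

    package kvserverpb

    // QuotaPoolInfo sketches the per-range quota pool state that RangeInfo
    // could carry.
    type QuotaPoolInfo struct {
        ApproximateQuota       uint64 // already exposed today
        Capacity               uint64 // configured maximum of the pool
        ReleaseQueueLen        int64  // len(quotaReleaseQueue)
        ProposalQuotaBaseIndex uint64 // see the comment above
        // One entry per follower currently excluded from quota enforcement.
        IgnoredFollowers []IgnoredFollower
    }

    // IgnoredFollower mirrors the per-reason counters requested under Metrics.
    type IgnoredFollower struct {
        ReplicaID    int32
        IgnoreReason string // follower_inactive | no_healthy_conn | below_base_index
    }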

Problem ranges:

  • highlight slow proposal quota acquisitions (>15s), as these are definitely a problem. (An empty quota pool is not a problem by itself - it is the expected steady state when a follower is slightly slower than the rest.)
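
A minimal sketch of that check, assuming a hypothetical value that tracks the start time of the replica's oldest ongoing quota acquisition (zero if there is none):

    package kvserver

    import "time"

    // slowQuotaThreshold mirrors the >15s threshold proposed above.
    const slowQuotaThreshold = 15 * time.Second

    // isProblemRange reports whether the range should be surfaced on the
    // problem-ranges page; oldestAcquisitionStart is a hypothetical value
    // the quota pool would need to track.
    func isProblemRange(oldestAcquisitionStart, now time.Time) bool {
        return !oldestAcquisitionStart.IsZero() &&
            now.Sub(oldestAcquisitionStart) > slowQuotaThreshold
    }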

Logging:

  • over each (say) 10s interval, save (keyed by distinct rangeID+replicaID, in a store-wide map) up to (say) ten raft statuses for followers that the quota pool did not consider for quota release. Then, every ~10 seconds, print a message if there is anything to print (in the common case there is not), e.g.:

quota pool not enforced for:
r100/5: follower_inactive base=150 status=[insert raft status]
r100/7: base_index base=160 status=[...]
r200/2: no_healthy_conn [...]
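
A sketch of the store-wide aggregation, assuming a hypothetical ignoredFollowerLog type whose Flush() is driven by a periodic loop elsewhere; in CockroachDB the output would go through the log package rather than fmt:

    package kvserver

    import (
        "fmt"
        "strings"
        "sync"
    )

    const maxEntriesPerInterval = 10

    type ignoredFollowerKey struct {
        rangeID   int64
        replicaID int32
    }

    type ignoredFollowerLog struct {
        mu      sync.Mutex
        entries map[ignoredFollowerKey]string // reason + raft status summary
    }

    // Record notes that the quota pool skipped a follower, keeping at most
    // maxEntriesPerInterval distinct (rangeID, replicaID) pairs per interval.
    func (l *ignoredFollowerLog) Record(rangeID int64, replicaID int32, detail string) {
        l.mu.Lock()
        defer l.mu.Unlock()
        if l.entries == nil {
            l.entries = make(map[ignoredFollowerKey]string)
        }
        k := ignoredFollowerKey{rangeID, replicaID}
        if _, ok := l.entries[k]; !ok && len(l.entries) >= maxEntriesPerInterval {
            return // cap reached; drop further distinct pairs this interval
        }
        l.entries[k] = detail
    }

    // Flush emits one message for the interval, or nothing if no follower
    // was skipped (the common case).
    func (l *ignoredFollowerLog) Flush() {
        l.mu.Lock()
        entries := l.entries
        l.entries = nil
        l.mu.Unlock()
        if len(entries) == 0 {
            return
        }
        var sb strings.Builder
        sb.WriteString("quota pool not enforced for:\n")
        for k, detail := range entries {
            fmt.Fprintf(&sb, "r%d/%d: %s\n", k.rangeID, k.replicaID, detail)
        }
        fmt.Print(sb.String())
    }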

Describe alternatives you've considered

Additional context

The above suggestion was written up quickly, and shouldn't be considered final or flawless.

Jira issue: CRDB-15931
