kv: Gossip additional store health and load signals #77604

@kvoli

Is your feature request related to a problem? Please describe.

Currently, allocator and store rebalancer decisions are made using measurements of the replicas a store contains, together with information about other stores that is gossiped between nodes.

These measurements currently comprise:

store (gossiped)

# usage
queries per second (batch requests)
writes per second (keys mutated)
ranges (on the store)

# health
disk fullness (%)
liveness (not dead or suspected)

# locality
location

replica (not gossiped)

# usage
queries per second (batch requests)
writes per second (keys mutated)

It is desirable to enable experimentation with additional measurements, to be used as inputs in allocation and rebalancing decisions.

Describe the solution you'd like

Gossip additional store load and health signals so that they may be used as inputs for allocation decisions. Previous work in #77040 included a subset of health signals. This issue proposes gossiping the following additional measurements as signals of store load and health:

store (gossiped)

# usage
cpu use (utilization % * available cores, EWMA or percentiles 50/67/95/99)
read throughput (bytes, EWMA or percentiles 50/67/95/99)
write throughput (bytes, EWMA or percentiles 50/67/95/99)

# health
L0 sub-levels
disk stalls (number of disk operations taking longer than 30s, sliding window)
disk slow (number of disk operations taking longer than 10s, sliding window)

Once gossiped, these measurements of store load and health can be used as inputs when evaluating other allocation and rebalancing strategies against the current one. Related to #70552.

Describe alternatives you've considered

Avoid gossiping any new values until we have a corresponding issue for their inclusion in allocation or rebalancing decisions. Currently only a portion of the store health statistics have immediate uses as constraints to avoid considering an unhealthy store as a replication target.
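The "constraint" use of health statistics mentioned above can be sketched as a simple filter over candidate stores. This is an illustrative assumption only: the `storeHealth` and `healthLimits` types, the field names, and any limit values are hypothetical and do not reflect the allocator's actual implementation.

```go
package main

// storeHealth is a hypothetical view of the health signals a store might
// gossip under this proposal.
type storeHealth struct {
	StoreID     int
	L0Sublevels int
	DiskStalls  int // disk operations slower than 30s in the window
	DiskSlow    int // disk operations slower than 10s in the window
}

// healthLimits holds hypothetical upper bounds; a store exceeding any of
// them is not considered as a replication target.
type healthLimits struct {
	MaxL0Sublevels int
	MaxDiskStalls  int
	MaxDiskSlow    int
}

// healthy reports whether s is within every limit.
func (l healthLimits) healthy(s storeHealth) bool {
	return s.L0Sublevels <= l.MaxL0Sublevels &&
		s.DiskStalls <= l.MaxDiskStalls &&
		s.DiskSlow <= l.MaxDiskSlow
}

// filterHealthy returns the candidates that satisfy limits, preserving
// their original order.
func filterHealthy(candidates []storeHealth, limits healthLimits) []storeHealth {
	var out []storeHealth
	for _, s := range candidates {
		if limits.healthy(s) {
			out = append(out, s)
		}
	}
	return out
}
```

Used this way, the health signals act as a hard exclusion rather than a ranking input, which matches the immediate use described here; using them for scoring would be a separate design decision.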

This issue could in addition include:

reads per second (keys read)

Additional context

Additional work is being planned to enable quick testing of allocation strategies (#70552). This issue complements that work by providing an expanded set of inputs for evaluating strategies that depend on resource-level signals (CPU, throughput) rather than proxies (QPS).

cc: @shralex @irfansharif @lidorcarmel

Jira issue: CRDB-13676

Epic CRDB-14989

Labels

A-kv: Anything in KV that doesn't belong in a more specific category.
A-kv-distribution: Relating to rebalancing and leasing.
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv: KV Team