kvserver: store-rebalancer can get blocked on load-based replica rebalances #79249

@aayushshah15

The StoreRebalancer goroutine synchronously executes load-based lease transfers and load-based replica rebalances of the hottest ranges in a loop.

This means that when a cluster is under duress and load-based replica rebalancing is slow, the store rebalancer goroutine can be blocked for an inordinate amount of time (which also blocks cheaper actions like load-based lease transfers) until the AdminRelocateRange call for each "hot range" being processed either fails or hits its timeout. In other words, if the StoreRebalancer tries to rebalance away 1 replica each for 100 ranges, and those rebalances are bound to hit their timeout, we won't see any load-based rebalancing on this store for roughly 100 minutes at a minimum.

We noticed this during an escalation where a single store on a hot node couldn't shed its load because of this. The logs indicated that the StoreRebalancer goroutine was simply blocked on a large number of AdminRelocateRange calls that were eventually timing out:
image

Nodes 173 and 159 (above) both had extremely high read amplification during this incident.

@cockroachdb/kv-notifications

Jira issue: CRDB-14656

Metadata

Assignees

No one assigned

    Labels

    A-kv-distribution: Relating to rebalancing and leasing.
    C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
    T-kv: KV Team