storage: Load-based rebalancing should be more willing to make certain incremental improvements #31135

@a-robinson

Description

This is already called out in a TODO in the code:

// TODO(a-robinson): Support more incremental improvements -- move what we
// can if it makes things better even if it isn't great. For example,
// moving one of the other existing replicas that's on a store with less
// qps than the max threshold but above the mean would help in certain
// locality configurations.

But yesterday I saw a cluster get into a real situation where a greater willingness to move a replica, even though doing so would make another store overfull, would have been the ideal thing to do / what a human operator would have done. Currently the code is cautious about moving replicas with so many QPS that moving them would over-fill whichever store they get moved to. That's a good decision when there's only one such replica on a store, but when there are two it's less good:

I181009 02:36:43.678116 996 storage/store_rebalancer.go:686  [n27,s28,store-rebalancer] r216919's 2456.46 qps would push s13 over the mean (168.26) with 2531.33 qps afterwards
I181009 02:36:43.678126 996 storage/store_rebalancer.go:597  [n27,s28,store-rebalancer] couldn't find enough rebalance targets for r216919 (2/3)
I181009 02:36:43.678723 996 storage/store_rebalancer.go:680  [n27,s28,store-rebalancer] r216912's 2338.62 qps would push s49 over the max threshold (268.26) with 2353.85 qps afterwards
I181009 02:36:43.678730 996 storage/store_rebalancer.go:597  [n27,s28,store-rebalancer] couldn't find enough rebalance targets for r216912 (2/3)
I181009 02:36:43.678738 996 storage/store_rebalancer.go:307  [n27,s28,store-rebalancer] ran out of replicas worth transferring and qps (4953.34) is still above desired threshold (268.26); will check again soon

In these logs, s28 has way more qps than the mean, mostly accounted for by two hot replicas (IIRC they were hot due to a bunch of retried errors, but that's a separate concern). However, moving either of them fails the check in shouldNotMoveTo that attempts to avoid ping-ponging a lone hot lease around between stores on an otherwise lightly loaded cluster:

newCandidateQPS := storeDesc.Capacity.QueriesPerSecond + replWithStats.qps
if storeDesc.Capacity.QueriesPerSecond < minQPS {
	if newCandidateQPS > maxQPS {
		log.VEventf(ctx, 3,
			"r%d's %.2f qps would push s%d over the max threshold (%.2f) with %.2f qps afterwards",
			replWithStats.repl.RangeID, replWithStats.qps, candidateStore, maxQPS, newCandidateQPS)
		return true
	}
} else if newCandidateQPS > meanQPS {
	log.VEventf(ctx, 3,
		"r%d's %.2f qps would push s%d over the mean (%.2f) with %.2f qps afterwards",
		replWithStats.repl.RangeID, replWithStats.qps, candidateStore, meanQPS, newCandidateQPS)
	return true
}
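Plugging the numbers from the logs above into this check shows why both hot replicas get rejected. The following is a standalone sketch, not the actual shouldNotMoveTo signature; the minQPS value of 68.26 is an assumption inferred from the symmetric ±100 qps band implied by the mean (168.26) and max threshold (268.26) in the logs, and the destination stores' current QPS is back-computed from the "afterwards" values:

```go
package main

import "fmt"

// wouldOverfill mirrors the spirit of the check above: reject a move when the
// destination would end up over the mean, or over the max threshold if the
// destination is currently underfull (below minQPS).
func wouldOverfill(destQPS, replQPS, minQPS, meanQPS, maxQPS float64) bool {
	newCandidateQPS := destQPS + replQPS
	if destQPS < minQPS {
		return newCandidateQPS > maxQPS
	}
	return newCandidateQPS > meanQPS
}

func main() {
	// r216919 (2456.46 qps) -> s13, currently at 2531.33-2456.46 = 74.87 qps:
	// 74.87 >= minQPS, and 2531.33 > mean 168.26, so the move is rejected.
	fmt.Println(wouldOverfill(74.87, 2456.46, 68.26, 168.26, 268.26))
	// r216912 (2338.62 qps) -> s49, currently at 2353.85-2338.62 = 15.23 qps:
	// 15.23 < minQPS, and 2353.85 > max threshold 268.26, so also rejected.
	fmt.Println(wouldOverfill(15.23, 2338.62, 68.26, 168.26, 268.26))
}
```

Either branch fires, so s28 is left holding both hot replicas.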

In such cases it would be reasonable to make the move if, for example, the source store would still have more QPS on it than the destination store afterwards.
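One way that relaxation could be phrased, as a hypothetical helper rather than existing code: allow the move whenever the source would still be hotter than the destination afterwards, so the overall imbalance strictly decreases even if the destination ends up overfull. (The destination QPS values below are back-computed from the "afterwards" numbers in the logs.)

```go
package main

import "fmt"

// moveStillImproves is a hypothetical relaxation of the shouldNotMoveTo check:
// even if the destination ends up overfull, the move is a net improvement as
// long as the source still carries more qps than the destination afterwards.
func moveStillImproves(sourceQPS, destQPS, replQPS float64) bool {
	return sourceQPS-replQPS > destQPS+replQPS
}

func main() {
	// s28 is at 4953.34 qps. Moving r216919 (2456.46 qps) to s13 (74.87 qps)
	// would leave the source at 2496.88 and the destination at 2531.33, so
	// even the relaxed check says no.
	fmt.Println(moveStillImproves(4953.34, 74.87, 2456.46))
	// Moving r216912 (2338.62 qps) to s49 (15.23 qps) instead would leave the
	// source at 2614.72 and the destination at 2353.85 -- a strict improvement,
	// so the relaxed check allows it where the current code refuses.
	fmt.Println(moveStillImproves(4953.34, 15.23, 2338.62))
}
```

Under this rule, s28 would get to shed one of its two hot replicas instead of staying stuck at ~4953 qps.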

It's worth noting that it's better for 2.1 to be overly cautious than for it to be overly aggressive, so I'm not planning on pushing for any such changes to be backported.

gz#8871

Labels

A-kv-distribution (Relating to rebalancing and leasing.)
C-performance (Perf of queries or internals. Solution not expected to change functional behavior.)
