allocator: voter constraint never satisfied when there are a correct number of replicas and all existing replicas are necessary #106559

@kvoli

Describe the problem

When a range has the correct number of voters and non-voters, and every existing replica is necessary to satisfy some constraint, the allocator may fail to satisfy the remaining constraints, even though it should.

Consider the example from #98020.

Node Localities

# region -> node_id
eu-west-1
  └── [8 9 10]
us-central-1
  └── [5 6 7]
us-east-1
  └── [1 2]
us-west-1
  └── [3 4]

Span config:

num_replicas=6 
num_voters=5 
constraints={'+region=eu-west-1':1,'+region=us-central-1':1,'+region=us-east-1':1,'+region=us-west-1':1} 
voter_constraints={'+region=us-west-1':2,'+region=us-east-1':2}

Existing replicas:

voters=[n1,n2,n4,n5,n8]
non_voters=[n3]

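There are the correct number of voters and replicas. All existing replicas are necessary (from the allocator's perspective) to satisfy the existing constraints, including the non-voter on n3. The us-west-1 voter constraint is still unsatisfied, however: it requires two voters in that region, and n4 is the only one.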

n3 should be promoted to a voter, then either n5 (us-central-1) or n8 (eu-west-1) should be demoted to a non-voter. That is, the allocator should perform a rebalance consisting of a promotion and a demotion.
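The constraint arithmetic above can be checked with a small standalone sketch. This is not allocator code: `nodeRegion`, `unsatisfied`, and `report` are hypothetical helpers written for illustration, and the span config is reduced to plain region-to-count maps.

```go
package main

import (
	"fmt"
	"sort"
)

// nodeRegion maps node ID to region, copied from the locality listing above.
var nodeRegion = map[int]string{
	1: "us-east-1", 2: "us-east-1",
	3: "us-west-1", 4: "us-west-1",
	5: "us-central-1", 6: "us-central-1", 7: "us-central-1",
	8: "eu-west-1", 9: "eu-west-1", 10: "eu-west-1",
}

// Required replica counts per region, taken from the span config above.
var (
	constraints      = map[string]int{"eu-west-1": 1, "us-central-1": 1, "us-east-1": 1, "us-west-1": 1}
	voterConstraints = map[string]int{"us-west-1": 2, "us-east-1": 2}
)

// unsatisfied returns, in sorted order, the regions whose required count is
// not met by the given nodes. Hypothetical helper, not part of the allocator.
func unsatisfied(required map[string]int, nodes []int) []string {
	have := map[string]int{}
	for _, n := range nodes {
		have[nodeRegion[n]]++
	}
	var missing []string
	for region, want := range required {
		if have[region] < want {
			missing = append(missing, region)
		}
	}
	sort.Strings(missing)
	return missing
}

// report prints the unmet voter constraints and unmet overall constraints.
func report(label string, voters, nonVoters []int) {
	all := append(append([]int{}, voters...), nonVoters...)
	fmt.Println(label, unsatisfied(voterConstraints, voters), unsatisfied(constraints, all))
}

func main() {
	// Current placement: us-west-1 has only one voter (n4).
	report("before:", []int{1, 2, 4, 5, 8}, []int{3}) // before: [us-west-1] []
	// After promoting n3 and demoting n5: everything is satisfied.
	report("after:", []int{1, 2, 3, 4, 8}, []int{5}) // after: [] []
}
```

Demoting n8 instead of n5 gives the same result, since either region keeps one replica to satisfy the overall constraints.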

However, this never occurs, because the allocator gets stuck on this check:

if !cand.less(existing) {
	// If `cand` is not worse than `existing`, add it to the list.
	comparableCands = append(comparableCands, cand)
	if !needRebalanceFrom && !needRebalanceTo && existing.less(cand) {

This checks whether the existing replica (necessary=true) is less than the replacement (necessary=true). If the replacement candidate's scoring inputs (such as its diversity score) are not greater than the existing replica's, the allocator never rebalances to satisfy the constraint.
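A simplified model of that comparison shows why the check cannot pass. The `candidate` struct, its two fields, the `less` ordering, and the diversity values below are all assumptions made for illustration; the real candidate type compares more inputs than this.

```go
package main

import "fmt"

// candidate is a deliberately simplified, hypothetical stand-in for the
// allocator's scoring candidate. Only two inputs are modeled.
type candidate struct {
	node      int
	necessary bool
	diversity float64
}

// less reports whether c is strictly worse than o, comparing necessity first
// and diversity second. This ordering is an assumption for the sketch.
func (c candidate) less(o candidate) bool {
	if c.necessary != o.necessary {
		return !c.necessary
	}
	return c.diversity < o.diversity
}

// strictlyBetter mirrors the shape of the quoted condition: cand must be no
// worse than existing, and existing must be strictly worse than cand.
func strictlyBetter(existing, cand candidate) bool {
	return !cand.less(existing) && existing.less(cand)
}

func main() {
	// Both replicas are necessary; the equal diversity values are made up.
	existing := candidate{node: 5, necessary: true, diversity: 0.5}
	cand := candidate{node: 3, necessary: true, diversity: 0.5}
	fmt.Println(strictlyBetter(existing, cand)) // false: the scores tie

	// Only a strictly higher input would let the check pass.
	cand.diversity = 0.6
	fmt.Println(strictlyBetter(existing, cand)) // true
}
```

With both sides marked necessary, the outcome depends entirely on the remaining inputs, so a tie leaves the voter constraint unsatisfied indefinitely.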

To Reproduce

#106548 reproduces the missing rebalance target behavior. It reproduces both in the simulator and, more directly, when calling Allocator.Rebalance(Non)Voter.

Expected behavior

A rebalance action is returned with a promotion and a demotion.

Environment:
Reproduces on master. Assume all versions affected.

Jira issue: CRDB-29615

Labels

A-kv-distribution: Relating to rebalancing and leasing.
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-kv: KV Team
branch-master: Failures and bugs on the master branch.
db-cy-23
