kvserver: allocating replacement replicas needs to consider fully satisfying constraints #94809
Closed
Labels
A-kv-decom-rolling-restart: Decommission and Rolling Restarts
A-kv-distribution: Relating to rebalancing and leasing.
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-kv: KV Team
Description
In a recent investigation, we discovered that when constraints apply to some but not all of the replicas needed for a range, a replacement operation (such as during decommission) can overlook that the constraints would no longer all be satisfied. This occurs with configurations such as num_replicas = 3, constraints = '{<some constraint>: 1}'. Here we would expect 2 replicas that do not need to satisfy any constraint, known as "unconstrained replicas"; however, it should not be possible to replace the one replica that satisfies the constraint with one that does not.
This can be reproduced simply with the following:
roachprod create local -n4
roachprod stage local release v22.1.12
roachprod start local --racks=4
roachprod ssh local:1 -- './cockroach workload init kv --splits=100'
roachprod sql local:1 -- -e "alter database kv configure zone using num_replicas=3, constraints='{+rack=3: 1}';"
# wait for rebalancing
roachprod ssh local:1 -- './cockroach node decommission 4 --insecure'
# this should not succeed, but it does
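
The expected behavior can be illustrated with a small model. This is only a sketch and not the kvserver allocator: the function names, the node-to-rack layout, and the simplified constraint representation (a locality tag mapped to a required replica count) are all assumptions made for illustration. It assumes node 4 is the only node on rack=3, so no valid target exists when that replica has to be replaced.

```python
from collections import Counter


def constraints_satisfied(replica_racks, constraints):
    """Return True if every constraint has at least its required replica count."""
    counts = Counter(replica_racks)
    return all(counts[tag] >= need for tag, need in constraints.items())


def valid_replacement_targets(node_racks, replicas, leaving, constraints):
    """List nodes that could replace `leaving` while keeping all constraints satisfied."""
    remaining = [n for n in replicas if n != leaving]
    targets = []
    for node in node_racks:
        if node in replicas:
            continue  # already holds a replica, or is the node being replaced
        racks = [node_racks[n] for n in remaining + [node]]
        if constraints_satisfied(racks, constraints):
            targets.append(node)
    return targets


# Hypothetical layout: four nodes, one per rack; only node 4 is on rack=3.
node_racks = {1: "rack=0", 2: "rack=1", 3: "rack=2", 4: "rack=3"}
constraints = {"rack=3": 1}  # simplified form of constraints='{+rack=3: 1}'
replicas = [1, 2, 4]         # num_replicas=3, with node 4 satisfying the constraint

# Replacing the constrained replica: no candidate keeps the constraint satisfied.
blocked = valid_replacement_targets(node_racks, replicas, leaving=4, constraints=constraints)

# Replacing an unconstrained replica: node 3 is an acceptable target.
allowed = valid_replacement_targets(node_racks, replicas, leaving=1, constraints=constraints)
```

Under these assumptions, a replacement check that evaluates the full constraint set after the swap finds no target for the replica on node 4, which is why the decommission would be expected to stall rather than succeed.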
Jira issue: CRDB-23152