allocator/mmaprototype: implement MMA repair pipeline by tbg · Pull Request #164658 · cockroachdb/cockroach

tbg · 2026-03-02T14:45:39Z

Important

Start here: walkthrough.md — a linear, code-level walkthrough of the repair pipeline. Uses repairAddVoter as a representative example and traces the full data flow from mode gating through change application. Includes inline source snippets (verifiable via showboat verify).

Architecture & design brief: design-brief.md — how the repair pipeline works and how it fits into MMA.

Productionization plan: mma-repair-brief.md — proposed merge strategy (6 PRs) and what's in/out of scope.

Detailed PR split: how-to-productionize.md — per-PR breakdown with files, LOC estimates, and review focus areas.

To discuss:

how to implement count-based (lease and replica) rebalancing, which the lease and replicate queues perform today. In particular, count-based rebalancing in the SMA does not consider nodes "overfull" based on global means, which can be non-representative.
the quorum check should come at the top of computeRepairAction
metric gauges exposing the count of replicas by repair enum status
whether and how to add IO overload shedding
prioritization mechanism among repair actions: the current "greedy by enum order" may not be ideal
avoiding busy-looping between MMA iterations (consider enforcing some spacing); this may not be needed, but the replicate and lease queues do it
AdminScatter
TransferLease
metamorphic testing in unit tests

Summary

This PR implements the complete repair pipeline for the Multi-Metric
Allocator (MMA) prototype. Today, range repair (upreplication,
decommissioning, dead-node replacement, constraint enforcement) is
handled by the replicate queue. This PR teaches MMA to perform all of
these repairs itself, behind a new cluster setting mode
(LBRebalancingMultiMetricRepairAndRebalance). When that mode is active,
the replicate and lease queues become no-ops and MMA takes full
ownership of both rebalancing and repair.

Design & planning

The branch starts with design documents that map out the repair action
space and the constraint satisfaction logic, including how the legacy
allocator's scorer hierarchy maps onto MMA concepts.

Repair actions (12 total)

All twelve repair actions are implemented in
cluster_state_repair.go (~1,600 lines), organized into three groups:

Count-based: AddVoter (with non-voter promotion), RemoveVoter,
AddNonVoter, RemoveNonVoter, RemoveLearner,
FinalizeAtomicReplicationChange.
Dead/decommissioning replacement: ReplaceDeadVoter,
ReplaceDeadNonVoter, ReplaceDecommissioningVoter,
ReplaceDecommissioningNonVoter.
Constraint swaps: SwapVoterForConstraints,
SwapNonVoterForConstraints.

Each action is tested via the MMA DSL (repair, repair-needed
commands) with datadriven test files covering count-based, constraint,
and interaction scenarios.

Diversity picker

pickStoreByDiversity is generalized to accept a diversityScorer
function parameter, enabling reuse across the add, remove, replace, and
swap paths. Ties among equally good candidates are broken uniformly at
random, which avoids systematically biasing placement toward
low-numbered stores.

Integration

A new LBRebalancingMultiMetricRepairAndRebalance mode is added to the
LoadBasedRebalancingMode cluster setting. When active:

The replicate queue's shouldQueue/process return immediately.
The lease queue's shouldQueue/process return immediately.
IncludeRepair is passed to ComputeChanges, which calls repair()
before computing rebalancing changes.
Repair metrics are tracked separately from rebalance metrics.
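
The queue gating above can be sketched as follows. This is only an illustration: `rebalancingMode`, `legacyQueue`, and `mmaOwnsRepair` are hypothetical names, and the real queues have different signatures. Only the idea that shouldQueue and process return immediately under the new mode comes from the description.

```go
package main

import "fmt"

// rebalancingMode is a hypothetical stand-in for the LoadBasedRebalancingMode
// cluster setting.
type rebalancingMode int

const (
	// modeQueuesActive is a placeholder for any mode in which the legacy
	// queues still perform repair.
	modeQueuesActive rebalancingMode = iota
	// modeRepairAndRebalance stands in for
	// LBRebalancingMultiMetricRepairAndRebalance.
	modeRepairAndRebalance
)

// mmaOwnsRepair reports whether MMA has taken over repair and rebalancing.
func mmaOwnsRepair(m rebalancingMode) bool {
	return m == modeRepairAndRebalance
}

// legacyQueue models the replicate or lease queue (hypothetical type).
type legacyQueue struct {
	name string
	mode func() rebalancingMode // reads the current setting value
}

// shouldQueue returns immediately when MMA owns repair, so the queue never
// enqueues a range in that mode.
func (q *legacyQueue) shouldQueue(needsWork bool) bool {
	if mmaOwnsRepair(q.mode()) {
		return false
	}
	return needsWork
}

// process is likewise a no-op under the new mode; it reports whether the
// queue did any work.
func (q *legacyQueue) process(rangeID int) bool {
	if mmaOwnsRepair(q.mode()) {
		return false
	}
	fmt.Printf("%s queue processing r%d\n", q.name, rangeID)
	return true
}

func main() {
	mode := modeQueuesActive
	q := &legacyQueue{name: "replicate", mode: func() rebalancingMode { return mode }}
	fmt.Println(q.shouldQueue(true), q.process(1))

	mode = modeRepairAndRebalance
	fmt.Println(q.shouldQueue(true), q.process(1))
}
```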

Production fixes

Leave-joint handling: IsLeaveJoint() on ExternalRangeChange
routes leave-joint changes through maybeLeaveAtomicChangeReplicas
instead of changeReplicasImpl, which would fail.
ForceReplicationScanAndProcess: now delegates to
mmaStoreRebalancer.rebalanceUntilStable() when MMA
repair-and-rebalance mode is active, fixing WaitForFullReplication
and enabling ReplicationAuto in MMA tests.
TestMMAUpreplication: integration test verifying end-to-end
upreplication from 1 to 3 voters under MMA.

ASIM tests

The allocator simulator (ASIM) is extended to support MMA repair:
queue disabling under the new mode, same-store voter type transitions in
ReplicaChange.Apply, and golden-output repair test scenarios covering
upreplication, dead-node replacement, decommissioning, and constraint
enforcement.

Commits

Individual commits speak for themselves. They are grouped into phases
visible in the commit log: design docs, infrastructure, count-based
repairs, replacement repairs, constraint swaps, refactoring, integration,
ASIM tests, and production fixes.

Epic: CRDB-39508

blathers-crl · 2026-03-02T14:45:45Z

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Add a reference document mapping the responsibilities of the replicate queue, lease queue, store rebalancer, and MMA. This serves as the foundation for planning the absorption of the replicate and lease queues into MMA. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Document the lexicographic priority hierarchy used by the legacy allocator's candidate scoring (candidate.compare()), the different scorer variants, and how MMA's approach differs. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

MMA checks constraints but won't actively fix violations — it skips non-conformant ranges. Update the table to reflect this. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Document the approach for teaching MMA to detect and repair constraint violations: eager constraint analysis at StoreLeaseholderMsg processing time with caching, and a repair set processed with higher priority than load rebalancing. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Replace the TODOs in mma-constraint-repair.md with resolved design decisions:
- Repair action ordering: plain enum replacing legacy numerical priorities, with simplified Remove actions and explicit constraint swap actions.
- Repair operations: reuse existing MMA add/remove/replace primitives.
- Pending change interaction: skip ranges with in-flight changes.
- Store health/disposition interaction: targets require ReplicaDispositionOK; removal prefers Dead > Unknown > Unhealthy > Shedding > Refusing > healthy.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Three improvements to the cluster state test DSL to reduce boilerplate in upcoming repair tests:
1. `set-store quiet=true`: suppress the verbose node/store listing output.
2. Auto-assign replica IDs when `replica-id=` is omitted from replica lines in `store-leaseholder-msg`. A per-range counter starts at 1 and advances with each replica; explicit values update the counter to stay above them.
3. Relax the minimum field count for replica lines from 3 to 2 (store-id + type is sufficient when replica-id and leaseholder are omitted).
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
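
The replica ID auto-assignment rule can be sketched as a small per-range counter. The `replicaIDAssigner` type and the convention that a non-positive `explicit` value means "`replica-id=` omitted" are assumptions made for this illustration, not the test parser's actual code.

```go
package main

import "fmt"

type RangeID int

// replicaIDAssigner is a hypothetical helper that tracks, per range, the next
// replica ID to hand out.
type replicaIDAssigner struct {
	next map[RangeID]int
}

func newReplicaIDAssigner() *replicaIDAssigner {
	return &replicaIDAssigner{next: map[RangeID]int{}}
}

// assign returns the replica ID for the next replica line of range r.
// explicit <= 0 means no replica ID was given and one is auto-assigned.
func (a *replicaIDAssigner) assign(r RangeID, explicit int) int {
	next, ok := a.next[r]
	if !ok {
		next = 1 // the per-range counter starts at 1
	}
	id := explicit
	if explicit <= 0 {
		id = next
	}
	// Keep the counter strictly above every ID seen so far, whether it was
	// auto-assigned or given explicitly.
	if id >= next {
		next = id + 1
	}
	a.next[r] = next
	return id
}

func main() {
	a := newReplicaIDAssigner()
	fmt.Println(a.assign(1, 0), a.assign(1, 0), a.assign(1, 7), a.assign(1, 0))
}
```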

Add a `repair` method on `rebalanceEnv` that currently logs "not yet implemented" and returns nil. This will be filled in as the constraint repair logic is built out. Add a corresponding `repair` command to the TestClusterState datadriven DSL, following the same pattern as `rebalance-stores`: it creates a rebalanceEnv, calls repair, and outputs the trace plus pending changes. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add 10 datadriven test files exercising count-based repair scenarios. Each test sets up stores and ranges with specific replica configurations, then invokes `repair` and expects the stub "not yet implemented" output. These tests will be rewritten with `-rewrite` as the repair logic is implemented. Tests cover: finalizing atomic replication changes, removing learners, adding/removing voters and non-voters, and replacing dead or decommissioning replicas. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add 4 datadriven test files for constraint-based and interaction repair scenarios:
- repair_swap_voter: voter misplaced relative to zone constraints
- repair_swap_nonvoter: non-voter misplaced relative to zone constraints
- repair_pending_skip: range with existing pending change is skipped
- repair_range_unavailable: range without quorum (2 of 3 voters dead)
Like the count-based tests, these currently expect the stub output and will be rewritten as repair logic is implemented.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add the RepairAction enum and its core computation method. RepairAction represents the highest-priority repair needed for a range, ordered so that lower enum values have higher priority. The zero value is intentionally invalid (iota + 1) to catch uninitialized fields. computeRepairAction inspects a range's replicas, store statuses, and constraint satisfaction to determine the single highest-priority action: joint config finalization, learner removal, voter/non-voter count adjustments (add, remove, replace dead/decommissioning), and constraint swaps. Ranges that have lost quorum or have pending changes return NoRepairNeeded (can't repair or already being repaired). Also adds updateRepairAction and removeFromRepairRanges helpers that maintain the repairRanges index (wired in the next commit). Epic: none Release note: None Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add a repairAction field to rangeState and a repairRanges index to clusterState that maps RepairAction → set of range IDs. This allows repair() to iterate ranges needing repair by priority without scanning all ranges.
Wire updateRepairAction calls at all trigger points where a range's repair status may change:
- processRangeMsg (replicas or config changed)
- updateStoreStatuses (store health changed)
- addPendingRangeChange (pending change suppresses repair)
- undoPendingChange (pending change removed)
- pendingChangeEnacted (pending change completed)
- range GC (range removed from tracking)
The pendingChangeEnacted signature gains a context.Context parameter to support the updateRepairAction call, updated at all 3 call sites.
Epic: none
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
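
A minimal sketch of such an index, assuming a reduced enum and a simplified `updateRepairAction` that receives the already computed action as an argument (in the PR the action is computed from range state). The field and method names follow the commit message; everything else is illustrative.

```go
package main

import "fmt"

type RangeID int

type RepairAction int

const (
	// Illustrative subset of the enum; the zero value is intentionally invalid.
	AddVoter RepairAction = iota + 1
	RemoveVoter
	RepairPending
	NoRepairNeeded
)

type rangeState struct {
	repairAction RepairAction
}

type clusterState struct {
	ranges       map[RangeID]*rangeState
	repairRanges map[RepairAction]map[RangeID]struct{}
}

func newClusterState() *clusterState {
	return &clusterState{
		ranges:       map[RangeID]*rangeState{},
		repairRanges: map[RepairAction]map[RangeID]struct{}{},
	}
}

// removeFromRepairRanges drops the range from the set for its current action.
func (cs *clusterState) removeFromRepairRanges(id RangeID) {
	rs, ok := cs.ranges[id]
	if !ok {
		return
	}
	if set, ok := cs.repairRanges[rs.repairAction]; ok {
		delete(set, id)
		if len(set) == 0 {
			delete(cs.repairRanges, rs.repairAction)
		}
	}
}

// updateRepairAction records a newly computed action and keeps the index in
// sync. Ranges that need no repair, or whose repair is already in flight, are
// tracked on rangeState but kept out of the index.
func (cs *clusterState) updateRepairAction(id RangeID, action RepairAction) {
	rs, ok := cs.ranges[id]
	if !ok {
		rs = &rangeState{}
		cs.ranges[id] = rs
	}
	cs.removeFromRepairRanges(id)
	rs.repairAction = action
	if action == RepairPending || action == NoRepairNeeded {
		return
	}
	if cs.repairRanges[action] == nil {
		cs.repairRanges[action] = map[RangeID]struct{}{}
	}
	cs.repairRanges[action][id] = struct{}{}
}

func main() {
	cs := newClusterState()
	cs.updateRepairAction(1, AddVoter)
	cs.updateRepairAction(2, RemoveVoter)
	cs.updateRepairAction(1, RepairPending) // a pending change suppresses repair
	fmt.Println(len(cs.repairRanges[AddVoter]), len(cs.repairRanges[RemoveVoter]))
}
```

Because every trigger point funnels through one update function, the index stays consistent without a full scan of all ranges.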

…king tests
Add the `repair-needed` DSL command to TestClusterState, which dumps the repairRanges index in priority order, showing which ranges need which repair action.
Update all 14 existing repair test files to assert repair tracking after each state mutation.
Add 6 new datadriven tests exercising the tracking lifecycle:
- repair_tracking_status_change: store health transitions
- repair_tracking_pending_lifecycle: pending change add/reject/enact
- repair_tracking_config_change_with_pending: config change during pending
- repair_tracking_multi_range: multiple ranges with different actions
- repair_tracking_constraint_change: constraint satisfaction changes
- repair_tracking_action_priority: action priority transitions
Epic: none
Release note: None
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Move `candidatesToConvertFromNonVoterToVoter()` and `constraintsForAddingVoter()` from `constraint_unused_test.go` to `constraint.go`. These methods are needed for the upcoming AddVoter repair action implementation. Also add `originMMARepair` to `ChangeOrigin` for tracking repair-originated changes separately from rebalance-originated ones. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Implement the `repair()` dispatch loop in `rebalanceEnv` and the first concrete repair action: `AddVoter`. When a range has fewer voters than its span config requires, repair selects a target store based on constraint satisfaction and diversity scoring, then creates a pending change to add a voter there.
The repair loop iterates `repairRanges` in priority order (matching `RepairAction` enum ordering) and only repairs ranges where the local store is the leaseholder. Unimplemented actions log a message identifying the specific action.
The `repairAddVoter` flow:
1. Analyze constraints for the range
2. Check for non-voter promotion candidates (TODO: implement promotion)
3. Find constraint-satisfying candidate stores
4. Filter by disposition, existing replicas, and node-level diversity
5. Pick the target with the best voter diversity score
6. Create and register the pending change
The `repair_add_voter.txt` test is extended to verify the full lifecycle: repair creates a pending change, `repair-needed` confirms suppression, and after enactment via `store-leaseholder-msg` the range is healthy.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
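
The dispatch loop described in this commit can be sketched roughly as below. The `repairEnv` type, the handler map, the two-value enum, and the choice to visit range IDs in sorted order are all assumptions made for the illustration; only the priority-ordered iteration, the leaseholder filter, and the "not yet implemented" fallback come from the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

type (
	RangeID      int
	StoreID      int
	RepairAction int
)

const (
	// Illustrative subset; lower values have higher priority.
	AddVoter RepairAction = iota + 1
	RemoveVoter
	lastAction = RemoveVoter
)

// repairEnv is a hypothetical, stripped-down stand-in for rebalanceEnv.
type repairEnv struct {
	localStore   StoreID
	leaseholders map[RangeID]StoreID
	repairRanges map[RepairAction]map[RangeID]struct{}
	handlers     map[RepairAction]func(RangeID) string
}

// repair visits actions in enum order and dispatches each locally leased
// range to its per-action handler, returning a trace of what happened.
func (e *repairEnv) repair() []string {
	var trace []string
	for action := AddVoter; action <= lastAction; action++ {
		ids := make([]RangeID, 0, len(e.repairRanges[action]))
		for id := range e.repairRanges[action] {
			ids = append(ids, id)
		}
		// Sorting makes the trace deterministic (an assumption of this sketch).
		sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
		for _, id := range ids {
			if e.leaseholders[id] != e.localStore {
				continue // only the leaseholder repairs a range
			}
			handler, ok := e.handlers[action]
			if !ok {
				trace = append(trace,
					fmt.Sprintf("r%d: action %d not yet implemented", id, action))
				continue
			}
			trace = append(trace, handler(id))
		}
	}
	return trace
}

func main() {
	e := &repairEnv{
		localStore:   1,
		leaseholders: map[RangeID]StoreID{10: 1, 11: 2},
		repairRanges: map[RepairAction]map[RangeID]struct{}{
			AddVoter: {10: {}, 11: {}},
		},
		handlers: map[RepairAction]func(RangeID) string{
			AddVoter: func(id RangeID) string { return fmt.Sprintf("r%d: add voter", id) },
		},
	}
	fmt.Println(e.repair())
}
```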

Start with a single voter (config says 3) so repair must add voters in two successive rounds. This exercises the full cycle twice: repair picks the best-diversity candidate, creates a pending change, the pending change suppresses further repair, enactment re-enables repair for the next round, and the second addition completes the range. Round 1 picks s2 over s3 (equal diversity, lower StoreID wins). Round 2 picks s3 as the only remaining candidate. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

…Voter repair When an AddVoter repair is needed and there are existing non-voters that could satisfy the voter constraint, promote one instead of adding a new replica on a fresh store. The best promotion candidate is chosen by voter diversity score (highest wins, ties broken by lower StoreID). Extract `pickBestStoreByVoterDiversity` helper to avoid duplicating the diversity-scoring loop between the add-new-voter and promote paths. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

pickBestStoreByVoterDiversity When multiple candidate stores have equal voter diversity scores, use reservoir sampling (via the existing rebalanceEnv RNG) to choose uniformly at random instead of deterministically preferring the lowest StoreID. This avoids systematically biasing placement toward low-numbered stores in symmetric clusters. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Rename `pickBestStoreByVoterDiversity` to `pickStoreByDiversity` with a `diversityScorer` function parameter. This allows the same picker to be used with both `getScoreChangeForNewReplica` (for additions) and `getScoreChangeForReplicaRemoval` (for removals). Update existing call sites in `repairAddVoter` and `promoteNonVoterToVoter`. Pure refactor, no behavior change. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add `repairRemoveVoter()` which handles over-replicated ranges by removing a voter. Candidate selection uses a health-based priority ordering (dead > unknown > unhealthy > shedding > refusing > healthy), taking the worst-health bucket first. Within that bucket, diversity-based tiebreaking picks the most redundant voter (least diversity loss on removal). The leaseholder is never considered for removal. Wire `RemoveVoter` into the `repair()` dispatch loop. Update the `repair_remove_voter.txt` test with the full lifecycle (repair, pending suppression, confirm, healthy). Add a new `repair_remove_voter_healthy.txt` test that verifies diversity-based selection when all stores are healthy. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
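
A sketch of the bucket selection step, assuming the health and disposition states are flattened into a single `removalRank` ordering (a simplification; the PR tracks them separately). The diversity-based tie-break within the bucket is left out here. All type and function names are hypothetical.

```go
package main

import "fmt"

type StoreID int

// removalRank orders stores by how strongly they are preferred for removal;
// larger values are removed first.
type removalRank int

const (
	rankHealthy removalRank = iota
	rankRefusing
	rankShedding
	rankUnhealthy
	rankUnknown
	rankDead
)

type voter struct {
	store StoreID
	rank  removalRank
}

// worstHealthBucket returns the voters that share the worst rank, never
// including the leaseholder. A diversity scorer would then pick one store
// from the returned bucket.
func worstHealthBucket(voters []voter, leaseholder StoreID) []StoreID {
	var bucket []StoreID
	worst := rankHealthy
	for _, v := range voters {
		if v.store == leaseholder {
			continue // the leaseholder is never considered for removal
		}
		switch {
		case len(bucket) == 0 || v.rank > worst:
			worst = v.rank
			bucket = []StoreID{v.store}
		case v.rank == worst:
			bucket = append(bucket, v.store)
		}
	}
	return bucket
}

func main() {
	voters := []voter{
		{store: 1, rank: rankHealthy},
		{store: 2, rank: rankShedding},
		{store: 3, rank: rankDead},
		{store: 4, rank: rankHealthy},
	}
	fmt.Println(worstHealthBucket(voters, 1))
}
```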

…irs from healthy Previously, `computeRepairAction` returned `NoRepairNeeded` for ranges with pending changes in flight, conflating "range is healthy" with "range needs repair but a change is already in flight." Add a new `RepairPending` enum value so these states are distinguishable, making it possible to observe how many ranges have outstanding repair actions. `RepairPending` ranges are excluded from the `repairRanges` index (same as `NoRepairNeeded`) so they are not acted on during repair, but they are surfaced in the `repair-needed` test command output via a separate scan. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

…on code Move `candidatesToConvertFromVoterToNonVoter` and `constraintsForAddingNonVoter` from `constraint_unused_test.go` to `constraint.go`. These methods are needed by the upcoming `AddNonVoter` repair action: the first finds voters that could be demoted to non-voter, and the second returns the constraint disjunction for placing a new non-voter. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

… tiers Add a `replicasLocalityTiers` parameter to `pickStoreByDiversity` so that non-voter operations can pass `replicaLocalityTiers` (all replicas) instead of the previously hardcoded `voterLocalityTiers` (voters only). This is needed because non-voter diversity should be scored against all replicas, not just voters. The three existing call sites (repairAddVoter, promoteNonVoterToVoter, repairRemoveVoter) are updated to explicitly pass `voterLocalityTiers`, preserving their existing behavior. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add `repairRemoveNonVoter`, which removes an over-replicated non-voter. Candidate selection follows the same priority as `repairRemoveVoter` (dead > unknown > unhealthy > shedding > refusing > healthy), but does not need to exclude the leaseholder since non-voters cannot hold leases. Within the worst-health bucket, the non-voter whose removal hurts diversity the least is chosen using `replicaLocalityTiers`. The test exercises the full lifecycle: detect over-replication, remove one non-voter, confirm via leaseholder message, verify healthy. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add `repairAddNonVoter`, which adds a non-voter to an under-replicated range. Like `repairAddVoter`, it first checks for a type-change shortcut: if there are extra voters that could be demoted to non-voter (via `candidatesToConvertFromVoterToNonVoter`), it uses `demoteVoterToNonVoter` to change the type in place. Otherwise, it finds a new store using the constraint disjunction from `constraintsForAddingNonVoter`, filters candidates, and picks by replica diversity. The `demoteVoterToNonVoter` helper mirrors `promoteNonVoterToVoter` but excludes the leaseholder and creates a VOTER_FULL -> NON_VOTER type change. The test exercises the full two-round lifecycle: add first non-voter, confirm, add second, confirm, verify healthy. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Move 7 constraint analysis methods from `constraint_unused_test.go` to `constraint.go`:
- candidatesToConvertFromNonVoterToVoter
- constraintsForAddingVoter
- candidatesToConvertFromVoterToNonVoter
- constraintsForAddingNonVoter
- candidatesForRoleSwapForConstraints
- candidatesVoterConstraintsUnsatisfied
- candidatesNonVoterConstraintsUnsatisfied
Pure mechanical move with improved doc comments from the prototype. These methods are prerequisites for the per-action repair functions in later PRs (AddVoter, RemoveVoter, constraint swaps).
Informs cockroachdb#164658.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add the RepairAction enum and computeRepairAction() decision tree. These establish the action space and priority ordering for MMA repair. RepairAction has 15 values (12 actionable + 3 terminal states), ordered by priority via iota. computeRepairAction() maps range state to the highest-priority repair action needed, using a straightforward if/else cascade examining joint configs, quorum, replica counts, and constraint satisfaction. No callers yet — the wiring to clusterState comes in the next commit. Informs cockroachdb#164658. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Wire the repair action computation into clusterState so that each range's repair action is eagerly tracked and indexed.
Structural changes:
- Add `repairAction RepairAction` field to `rangeState`
- Add `repairRanges map[RepairAction]map[RangeID]struct{}` to `clusterState`
- Add `updateRepairAction()` and `removeFromRepairRanges()` to maintain the index
Trigger points (where updateRepairAction is called):
1. End of processRangeMsg (replicas/config may have changed)
2. pendingChangeEnacted when all pending changes complete
3. End of undoPendingChange
4. End of addPendingRangeChange (sets RepairPending)
5. updateStoreStatuses when health/disposition changes (recomputes for all ranges on the affected store)
Range GC calls removeFromRepairRanges before deleting the range.
Test infrastructure:
- `repair-needed` DSL command: iterates repairRanges by priority, prints action-to-ranges mapping; scans separately for RepairPending
- `repair` DSL command: stub (pending changes only, no execution yet)
- Parser: nextReplicaID auto-assignment, quiet=true on set-store, relaxed field count for replica lines, repair recomputation on update-store-status
6 new testdata files exercise the repair tracking across priority ordering, config changes, constraint changes, multi-range scenarios, pending change lifecycle, and store status transitions.
Informs cockroachdb#164658.
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>

Add the RepairAction enum and computeRepairAction() decision tree. These establish the action space and priority ordering for MMA repair.
RepairAction has 15 values (12 actionable + 3 terminal states), ordered by priority via iota. computeRepairAction() maps range state to the highest-priority repair action needed, using a straightforward if/else cascade examining joint configs, quorum, replica counts, and constraint satisfaction. No callers yet; the wiring to clusterState comes in the next commit.
Comparison with legacy Allocator.ComputeAction (allocatorimpl/allocator.go):
The legacy allocator has two separate orderings that sometimes disagree:
1. The Priority() ordering (used to rank ranges in the replicate queue):
   FinalizeAtomicReplicationChange   12002
   RemoveLearner                     12001
   ReplaceDeadVoter                  12000
   AddVoter                          10000
   ReplaceDecommissioningVoter        5000
   RemoveDeadVoter                    1000
   RemoveDecommissioningVoter          900
   RemoveVoter                         800
   ReplaceDeadNonVoter                 700
   AddNonVoter                         600
   ReplaceDecommissioningNonVoter      500
   RemoveDeadNonVoter                  400
   RemoveDecommissioningNonVoter       300
   RemoveNonVoter                      200
2. The computeAction() if/else cascade (used to pick which action to take for a single range):
   AddVoter                          ← checked before quorum!
   [quorum check → RangeUnavailable]
   ReplaceDeadVoter
   ReplaceDecommissioningVoter
   RemoveDeadVoter                   ← separate from ReplaceDeadVoter
   RemoveDecommissioningVoter        ← separate from ReplaceDecomVoter
   RemoveVoter
   AddNonVoter
   ReplaceDeadNonVoter
   ReplaceDecommissioningNonVoter
   RemoveDeadNonVoter                ← separate from ReplaceDeadNonVoter
   RemoveDecommissioningNonVoter     ← separate from ReplaceDecomNonVoter
   RemoveNonVoter
MMA's RepairAction unifies both orderings into a single iota sequence:
   FinalizeAtomicReplicationChange   (1)
   RemoveLearner                     (2)
   AddVoter                          (3)
   ReplaceDeadVoter                  (4)
   ReplaceDecommissioningVoter       (5)
   RemoveVoter                       (6)
   AddNonVoter                       (7)
   ReplaceDeadNonVoter               (8)
   ReplaceDecommissioningNonVoter    (9)
   RemoveNonVoter                    (10)
   SwapVoterForConstraints           (11) ← new, legacy has no equivalent
   SwapNonVoterForConstraints        (12) ← new, legacy has no equivalent
   RepairSkipped                     (13)
   RepairPending                     (14)
   NoRepairNeeded                    (15)
Key differences from legacy:
- Quorum check gates all actions: In the legacy code, AddVoter is checked before the quorum gate, meaning it can be attempted even without quorum (with a TODO noting this). MMA checks quorum first (step 4) and skips repair entirely if quorum is lost, since all replication changes require raft consensus.
- No separate Remove{Dead,Decommissioning}{Voter,NonVoter}: The legacy code distinguishes "replace dead voter" (count matches, add-then-remove) from "remove dead voter" (over-replicated, just remove). MMA collapses these: RemoveVoter handles all over-replication cases, with candidate selection preferring dead > decommissioning > healthy replicas.
- Constraint swaps are new: Legacy doesn't have repair actions for constraint violations; those are handled as rebalancing. MMA treats them as repair because a range with correct counts but wrong placement is not fully conformant.
Informs cockroachdb#164658.
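
A rough sketch of the enum and a cascade in its spirit, written as a Go `switch` (equivalent to an if/else chain). The enum values follow the sequence listed above. The `rangeSummary` struct, the exact placement of the pending, joint-config, and learner checks ahead of the quorum check, and the use of RepairSkipped for lost quorum are assumptions of this sketch rather than the PR's actual logic.

```go
package main

import "fmt"

type RepairAction int

const (
	FinalizeAtomicReplicationChange RepairAction = iota + 1 // zero is invalid
	RemoveLearner
	AddVoter
	ReplaceDeadVoter
	ReplaceDecommissioningVoter
	RemoveVoter
	AddNonVoter
	ReplaceDeadNonVoter
	ReplaceDecommissioningNonVoter
	RemoveNonVoter
	SwapVoterForConstraints
	SwapNonVoterForConstraints
	RepairSkipped
	RepairPending
	NoRepairNeeded
)

// rangeSummary is a hypothetical, flattened view of a range's state.
type rangeSummary struct {
	hasPendingChange, inJointConfig                         bool
	learners                                                int
	voters, wantVoters, deadVoters, decomVoters             int
	nonVoters, wantNonVoters, deadNonVoters, decomNonVoters int
	voterConstraintsViolated, nonVoterConstraintsViolated   bool
}

// computeRepairAction returns the first matching action in the cascade.
func computeRepairAction(r rangeSummary) RepairAction {
	switch {
	case r.hasPendingChange:
		return RepairPending
	case r.inJointConfig:
		return FinalizeAtomicReplicationChange
	case r.learners > 0:
		return RemoveLearner
	case r.voters-r.deadVoters < r.voters/2+1:
		return RepairSkipped // quorum lost: no replication change can commit
	case r.voters < r.wantVoters:
		return AddVoter
	case r.voters == r.wantVoters && r.deadVoters > 0:
		return ReplaceDeadVoter
	case r.voters == r.wantVoters && r.decomVoters > 0:
		return ReplaceDecommissioningVoter
	case r.voters > r.wantVoters:
		return RemoveVoter
	case r.nonVoters < r.wantNonVoters:
		return AddNonVoter
	case r.nonVoters == r.wantNonVoters && r.deadNonVoters > 0:
		return ReplaceDeadNonVoter
	case r.nonVoters == r.wantNonVoters && r.decomNonVoters > 0:
		return ReplaceDecommissioningNonVoter
	case r.nonVoters > r.wantNonVoters:
		return RemoveNonVoter
	case r.voterConstraintsViolated:
		return SwapVoterForConstraints
	case r.nonVoterConstraintsViolated:
		return SwapNonVoterForConstraints
	}
	return NoRepairNeeded
}

func main() {
	underReplicated := rangeSummary{voters: 1, wantVoters: 3}
	fmt.Println(computeRepairAction(underReplicated) == AddVoter)
}
```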

Add the repair() method on rebalanceEnv — the main entry point for MMA repair. It iterates repairRanges in priority order, filters to ranges where the local store is the leaseholder, and dispatches to per-action repair functions. No repair actions are implemented yet (the switch default logs "not yet implemented"); AddVoter comes in the next commit. Wire repair into ComputeChanges via the IncludeRepair field on ChangeOptions. When set, repair() runs before rebalanceStores(), and its pending changes prevent the rebalancer from touching the same ranges. Add originMMARepair to the ChangeOrigin enum so that repair-originated changes can be tracked through AdjustPendingChangeDisposition. For now repair changes share the rebalance metric counters; dedicated repair metrics come in a follow-up PR. Add the "repair" DSL command to the test harness. It creates a rebalanceEnv with a deterministic random seed and calls repair(). Informs cockroachdb#164658. Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
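
The ordering between repair and rebalancing can be sketched as follows, assuming a simplified `computeChanges` that takes the ranges needing repair and the rebalance candidates as plain slices. `ChangeOptions.IncludeRepair` and `originMMARepair` are named in the commit message; `originMMARebalance`, `pendingChange`, and the function signature are hypothetical.

```go
package main

import "fmt"

type RangeID int

type ChangeOrigin int

const (
	originMMARebalance ChangeOrigin = iota // hypothetical name
	originMMARepair
)

type ChangeOptions struct {
	IncludeRepair bool
}

type pendingChange struct {
	rangeID RangeID
	origin  ChangeOrigin
}

// computeChanges runs repair first (when enabled) and then rebalancing.
// A range that already has a pending change is not touched again.
func computeChanges(
	opts ChangeOptions, needRepair, rebalanceCandidates []RangeID,
) []pendingChange {
	var changes []pendingChange
	pending := map[RangeID]bool{}
	if opts.IncludeRepair {
		for _, id := range needRepair {
			changes = append(changes, pendingChange{rangeID: id, origin: originMMARepair})
			pending[id] = true
		}
	}
	for _, id := range rebalanceCandidates {
		if pending[id] {
			continue // repair already owns this range in this pass
		}
		changes = append(changes, pendingChange{rangeID: id, origin: originMMARebalance})
		pending[id] = true
	}
	return changes
}

func main() {
	changes := computeChanges(
		ChangeOptions{IncludeRepair: true},
		[]RangeID{1},
		[]RangeID{1, 2},
	)
	for _, c := range changes {
		fmt.Println(c.rangeID, c.origin == originMMARepair)
	}
}
```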