kvserver: allow quiescing with paused followers #84252
Description
Is your feature request related to a problem? Please describe.
PR #83851 introduces the CRDB-level concept of "paused" followers: followers that we are intentionally not replicating to. It mirrors the raft concept of the same name (followers that raft sends to only at a very low rate, for example during probing or when a follower doesn't respond to MsgApp).
At the time of writing, a leaseholder with a paused follower will not quiesce.
Describe the solution you'd like
Ranges should be able to quiesce, ignoring paused followers. When a store unpauses, every quiesced replica that had a follower on that store paused at the time it quiesced should unquiesce (to ensure the follower is promptly caught up).
This is very similar to what we already do for liveness, see
cockroach/pkg/kv/kvserver/replica_raft_quiesce.go
Lines 205 to 211 in 571bfa3
    // laggingReplicaSet is a set containing liveness information about replicas
    // that were dead when a Raft leader decided to quiesce its range and were
    // lagging behind the quiescence log index (meaning they would have prevented
    // quiescence had they been alive). If any replica (leader or follower) becomes
    // aware that a replica in this set has become live, it should unquiesce the
    // range so that the replica can be caught back up.
    type laggingReplicaSet []livenesspb.Liveness
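By analogy with laggingReplicaSet, the proposed behavior could be sketched roughly as follows. This is a toy model, not the kvserver implementation: `pausedFollowerSet`, `followerState`, `replica`, `maybeQuiesce`, and `onStoreUnpaused` are hypothetical names invented for illustration, and `StoreID`/`RangeID` are simplified stand-ins for the real types.

```go
package main

import "fmt"

// Simplified stand-ins for the real ID types (assumption).
type StoreID int
type RangeID int

// followerState is a hypothetical summary of one follower as seen by the leader.
type followerState struct {
	caughtUp bool // follower has reached the quiescence log index
	paused   bool // leader is intentionally not replicating to this follower
}

// pausedFollowerSet (hypothetical) records the stores whose followers were
// paused and lagging when the leader decided to quiesce the range.
type pausedFollowerSet map[StoreID]struct{}

// replica is a toy model of a leader replica.
type replica struct {
	rangeID         RangeID
	quiescent       bool
	pausedAtQuiesce pausedFollowerSet
}

// maybeQuiesce quiesces the range if every follower is either caught up or
// paused. Lagging paused followers are remembered so that the range can be
// woken up when their store unpauses.
func (r *replica) maybeQuiesce(followers map[StoreID]followerState) bool {
	paused := pausedFollowerSet{}
	for id, f := range followers {
		switch {
		case f.caughtUp:
			// Does not block quiescence.
		case f.paused:
			paused[id] = struct{}{}
		default:
			return false // a live, unpaused follower is still behind
		}
	}
	r.quiescent = true
	r.pausedAtQuiesce = paused
	return true
}

// onStoreUnpaused unquiesces every quiesced replica that recorded the given
// store as paused at quiescence time, and returns the affected range IDs.
func onStoreUnpaused(replicas []*replica, id StoreID) []RangeID {
	var woken []RangeID
	for _, r := range replicas {
		if !r.quiescent {
			continue
		}
		if _, ok := r.pausedAtQuiesce[id]; ok {
			r.quiescent = false
			r.pausedAtQuiesce = nil
			woken = append(woken, r.rangeID)
		}
	}
	return woken
}

func main() {
	r1 := &replica{rangeID: 1}
	r2 := &replica{rangeID: 2}
	// r1 has a lagging follower on paused store 2; r2's followers are caught up.
	r1.maybeQuiesce(map[StoreID]followerState{2: {paused: true}, 3: {caughtUp: true}})
	r2.maybeQuiesce(map[StoreID]followerState{2: {caughtUp: true}, 3: {caughtUp: true}})
	fmt.Println("unquiesced ranges:", onStoreUnpaused([]*replica{r1, r2}, 2))
}
```

In this sketch only ranges that actually recorded the store as paused are woken up, so ranges whose followers were already caught up stay quiesced when the store unpauses.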
Describe alternatives you've considered
Additional context
Jira issue: CRDB-18409