storage: make queues replica ID aware#40889
Conversation
bdarnell
left a comment
There was a problem hiding this comment.
Reviewed 6 of 6 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @nvanbenschoten)
pkg/storage/queue.go, line 94 at r1 (raw file):
// sameReplica returns true if the passed replicaID matches the item's replica // ID or the item's replica ID is zero.
This special treatment of replica ID zero (but only in one direction) seems like something that might bite us in the future. Is this always going to be what we want? I think it might be safer to treat the transition from preemptive snapshot to full replica as something that also invalidates any previous enqueuing of the replica.
pkg/storage/queue_test.go, line 1137 at r1 (raw file):
bq.maybeAdd(ctx, r, tc.store.Clock().Now()) r.replicaID = 2 require.Nil(t, bq.pop())
Also test that the mismatched replica is removed from the queue (so you don't get an endless stream of nils while the head of the queue has a wrong replica id)
a047c70 to
5c9ea3b
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 3 of 6 files at r1, 2 of 2 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/storage/queue_concurrency_test.go, line 148 at r2 (raw file):
type fakeReplica struct { id roachpb.RangeID
nit: s/id/rangeID/
pkg/storage/queue_test.go, line 1105 at r2 (raw file):
// TestBaseQueueReplicaChange ensures that if a replica is added to the queue // with a non-zero replica ID then it is only popped if the retrieved replica
s/popped/processed/
ba02a45 to
9b73cc3
Compare
This is the first PR in the series to fix the fallout from learner replicas due to replicas being added and removed without synchronously receiving snapshots. Prior to the upcoming changes a replica object may change its ID. Release Justification: Part of fixing a major release blocker. Release note: None
9b73cc3 to
301125e
Compare
ajwerner
left a comment
There was a problem hiding this comment.
TFTR!
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @bdarnell and @nvanbenschoten)
pkg/storage/queue.go, line 94 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
This special treatment of replica ID zero (but only in one direction) seems like something that might bite us in the future. Is this always going to be what we want? I think it might be safer to treat the transition from preemptive snapshot to full replica as something that also invalidates any previous enqueuing of the replica.
Done.
pkg/storage/queue_test.go, line 1137 at r1 (raw file):
Previously, bdarnell (Ben Darnell) wrote…
Also test that the mismatched replica is removed from the queue (so you don't get an endless stream of nils while the head of the queue has a wrong replica id)
Done.
40889: storage: make queues replica ID aware r=ajwerner a=ajwerner This is the first PR in the series to fix the fallout from learner replicas due to replicas being added and removed without synchronously receiving snapshots. Prior to the upcoming changes a replica object may change its ID. Release Justification: Part of fixing a major release blocker. Release note: None Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
Build succeeded |
40892: storage: ensure Replica objects never change replicaID r=ajwerner a=ajwerner We've seen instability recently due to invariants being violated as replicas catch up across periods of being removed and re-added to a range. Due to learner replicas and their rollback behavior this is now a relatively common case. Rather than handle all of these various scenarios this PR prevents them from occuring by actively removing replicas when we determine that they must have been removed. Here's a high level overview of the change: * Once a Replica object has a non-zero Replica.mu.replicaID it will not change. * If a raft message or snapshot addressed to a higher replica ID is received the current replica will be removed completely. * If a replica sees a ChangeReplicasTrigger which removes it then it completely removes itself while applying that command. * Replica.mu.destroyStatus is used to meaningfully signify the removal state of a Replica. Replicas about to be synchronously removed are in destroyReasonRemovalPending. This hopefully gives us some new invariants: * There is only ever at most 1 *Replica which IsAlive() for a range on a store at a time. * Once a *Replica has a non-zero ReplicaID is never changes. The change also introduces some new complexity. Namely we now allow removal of uninitialized replicas, including their hard state. This allows us to catch up across a split even when we know the RHS must have been removed. Fixes #40367. The first commit is #40889. Release justification: This commit is safe for 19.2 because it fixes release blockers. Release note (bug fix): Fix crashes by preventing replica ID change. Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
This is the first PR in the series to fix the fallout from learner replicas
due to replicas being added and removed without synchronously receiving
snapshots. Prior to the upcoming changes a replica object may change its ID.
Release Justification: Part of fixing a major release blocker.
Release note: None