kvstorage: complete RaftReplicaID migration#95513
kvstorage: complete RaftReplicaID migration#95513craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
6d30d5d to
16d1a33
Compare
d3e9028 to
d2c0515
Compare
This comment was marked as resolved.
This comment was marked as resolved.
26f4edc to
18a7ff7
Compare
|
Now it's good to review. |
sumeerbhola
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @pavelkalinnikov and @tbg)
-- commits line 36 at r4:
wasn't it v22.1?
There is also a similar comment in
cockroach/pkg/kv/kvserver/store_create_replica.go
Lines 249 to 283 in c9f87cf
tbg
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @pavelkalinnikov and @sumeerbhola)
wasn't it v22.1?
Yep, thanks.
Updated the comment.
18a7ff7 to
9731f35
Compare
5327822 to
749969b
Compare
|
Ok, RFAL. |
|
|
@RaduBerinde requested that we file these concurrency issues inside pebble as issues on the pebble repo. |
|
Filed cockroachdb/pebble#2301 |
See cockroachdb#93310. This is also the beginning of cockroachdb#93247. Epic: CRDB-220 Release note: None
749969b to
20685eb
Compare
pav-kv
left a comment
There was a problem hiding this comment.
Looking much clearer now, thanks for implementing the suggestion! LGTM in principle, but left some tips/nits.
|
Still have to push the updates, please ignore my comment pings until I request. |
20685eb to
b0cf469
Compare
pav-kv
left a comment
There was a problem hiding this comment.
Some final bits and bobs.
As of v22.1[^1], we always write the RaftReplicaID when creating a Replica or updating a snapshot. However, since this is persisted state that could've originated in older versions and not updated yet, we couldn't rely on a persisted ReplicaID yet. This commit adds code to the `(*Store).Start` boot sequence that - persists a RaftReplicaID for all initialized replicas (using the ReplicaID from the descriptor) - deletes all uninitialized replicas that don't have a RaftReplicaID (since we don't know their ReplicaID at this point). The second item in theory violates Raft invariants, as uninitialized Replicas are allowed to vote (though they then cannot accept log entries). So in theory: - an uninitialized replica casts a decisive vote for a leader - it restarts - code in this commit removes the uninited replica (and its vote) - delayed MsgVote from another leader arrives - it casts another vote for the same term for a dueling leader - now there are two leaders in the same term. The above in addition presupposes that the two leaders cannot communicate with each other. Also, even if that is the case, since the two leaders cannot append to the uninitialized replica (it doesn't accept entries), we also need additional voters to return at the exact right time. Since an uninitialized replica without RaftReplicaID in is necessarily at least one release old, this is exceedingly unlikely and we will live with this theoretical risk. This commit also introduces a few assertions that make sure that we don't have overlapping initialized replicas (which would be detected at Store.Start time otherwise while inserting in the btree, but it's nice to catch this earlier) or duplicate RangeIDs. [^1]: cockroachdb#75761 Epic: CRDB-220 Release note: None
402bc02 to
672e8b1
Compare
|
bors r=pavelkalinnikov |
|
Build succeeded: |
It now has to be there, so turn this into an assertion failure. See cockroachdb#95513. Epic: CRDB-220 Release note: None
It now has to be there, so turn this into an assertion failure. See cockroachdb#95513. Epic: CRDB-220 Release note: None
115884: kvserver: remove ReplicaID migration r=erikgrinaker a=erikgrinaker This migration ensured every replica had a persisted replica ID. It is no longer needed after `MinSupportedVersion` >= 23.1, since the migration has been applied on every finalized 23.1 node. Instead, we assert during startup that all replicas have a replica ID. Resolves #115869. Touches #95513. Epic: none Release note: None 116443: roachtest: port multitenant/shared-process/basic to new APIs r=srosenberg a=herkolategan Converts multitenant/shared-process/basic to use the new roachprod multitenant APIs. Fixes: #115868 Epic: CRDB-31933 Release Note: None Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com> Co-authored-by: Herko Lategan <herko@cockroachlabs.com>
150189: kvstorage: fix CreateUninitializedReplica comment r=arulajmani a=pav-kv The scenario described in the comment was possible before it became an invariant that all replicas have a `RaftReplicaID` in storage. Since we completed a migration [#95513] which established this invariant, we can remove this comment. Historical context: https://github.com/cockroachdb/cockroach/pull/75761/files#diff-208036c53fab5274d107185f7b974421738b6cd54b849000a60656c764ba2b08R224-R272 Epic: none Release note: none Co-authored-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
As of v22.11, we always write the RaftReplicaID when creating a
Replica or updating a snapshot. However, since this is
persisted state that could've originated in older versions and not
updated yet, we couldn't rely on a persisted ReplicaID yet.
This commit adds code to the
(*Store).Startboot sequence thatReplicaID from the descriptor)
ReplicaID at this point).
The second item in theory violates Raft invariants, as uninitialized
Replicas are allowed to vote (though they then cannot accept log
entries). So in theory:
The above in addition presupposes that the two leaders cannot
communicate with each other. Also, even if that is the case, since the
two leaders cannot append to the uninitialized replica (it doesn't
accept entries), we also need additional voters to return at the exact
right time.
Since an uninitialized replica without RaftReplicaID in is necessarily
at least one release old, this is exceedingly unlikely and we will
live with this theoretical risk.
This PR also adds a first stab at a datadriven test harness for
kvstoragewhich is likely to be of use for #93247.Epic: CRDB-220
Release note: None
Footnotes
https://github.com/cockroachdb/cockroach/pull/75761 ↩