eagerly write Raft state for new Ranges#7310
Conversation
|
Generally looks good when the tests are sorted out. Review status: 0 of 8 files reviewed at latest revision, 1 unresolved discussion, all commit checks successful. storage/replica_state.go, line 326 [r6] (raw file):
We can't just clobber the HardState: we could have already cast a vote that must be preserved. This can happen in the TestSplitSnapshotRace_SnapshotWins scenario: this replica is behind and the other nodes have already started an election (or several, so the term could be higher too). We won't know anything else about the range because we can't apply a snapshot in this state, but the HardState is still valid. We need to load the existing HardState and only set Term and Commit if the previous value was zero. Comments from Reviewable |
|
Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, 1 of 1 files at r4, 6 of 6 files at r5, 1 of 3 files at r6. storage/replica.go, line 825 [r6] (raw file):
does proto.Equal work? storage/replica_command.go, line 2188 [r5] (raw file):
hello storage/replica_state.go, line 66 [r3] (raw file):
nit: doesn't pass golint storage/replica_state.go, line 80 [r5] (raw file):
this looks like it's updating the stats in place, which is odd given that it is eventually returned. the comment on the function is not exactly clear about whether this is intended, but it seems sketchy to do both (update in place and return). storage/replica_state.go, line 203 [r5] (raw file):
fair amount of random code movement in this commit, hard to tell what's what Comments from Reviewable |
e7d375b to
a42f482
Compare
|
All green (at least should be), PTAL. Review status: 0 of 13 files reviewed at latest revision, 6 unresolved discussions, some commit checks pending. storage/replica.go, line 825 [r6] (raw file):
|
|
Reviewed 5 of 11 files at r11, 1 of 3 files at r15, 7 of 7 files at r16. storage/replica.go, line 1946 [r16] (raw file):
be an assertion storage/replica_raftstorage.go, line 59 [r16] (raw file):
this conflates empty on-disk HS with missing HS :( storage/replica_state.go, line 66 [r3] (raw file):
|
2cf0d9c to
f379bcf
Compare
|
Review status: 0 of 13 files reviewed at latest revision, 7 unresolved discussions, some commit checks pending. storage/replica.go, line 1946 [r16] (raw file):
|
|
Reviewed 5 of 11 files at r21, 1 of 3 files at r22, 7 of 7 files at r23. keys/keys.go, line 164 [r23] (raw file):
Why did you rename these? The applied indexes are per-Replica, not per-range (although they are oddly counted as "replicated" keys). RangeTruncatedStateKey makes a little more sense because it is range-level state, but I don't think it makes sense to rename the key without renaming the proto. keys/printer_test.go, line 55 [r23] (raw file):
The pretty-printed names of these keys should match the function names. storage/replica.go, line 1818 [r23] (raw file):
Add a TODO to remove this, or do you expect to leave it in indefinitely? storage/replica_raftstorage.go, line 59 [r16] (raw file):
|
|
Reviewed 5 of 11 files at r21, 1 of 3 files at r22, 7 of 7 files at r23. storage/replica_raftstorage.go, line 59 [r16] (raw file):
|
|
Review status: all files reviewed at latest revision, 10 unresolved discussions, all commit checks successful. storage/replica_raftstorage.go, line 59 [r16] (raw file):
|
|
Review status: all files reviewed at latest revision, 10 unresolved discussions, all commit checks successful. storage/replica_raftstorage.go, line 59 [r16] (raw file):
|
|
Review status: all files reviewed at latest revision, 10 unresolved discussions, all commit checks successful. storage/replica_raftstorage.go, line 59 [r16] (raw file):
|
|
TFTRs! Review status: all files reviewed at latest revision, 10 unresolved discussions, all commit checks successful. keys/keys.go, line 164 [r23] (raw file):
|
|
Review status: 0 of 10 files reviewed at latest revision, 4 unresolved discussions, some commit checks pending. storage/replica_state.go, line 333 [r23] (raw file):
|
This was not done previously. The handling of truncated state is still a bit sketchy but will cleaned up further.
Seems like we could phase it out if it was worth the trouble.
Change the initialization of Raft groups so that whenever we create a Replica for a "new" Raft group (as opposed to creating a Replica that will receive a snapshot), we write an initial ReplicaState and associated Raft state.
|
Review status: 0 of 10 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful. storage/replica_state.go, line 333 [r23] (raw file):
|
|
LGTM Review status: 0 of 10 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending. Comments from Reviewable |
|
Reviewed 3 of 13 files at r24, 7 of 10 files at r34, 3 of 3 files at r35. storage/replica_state.go, line 32 [r35] (raw file):
pedantic nit: disk? Comments from Reviewable |
|
Review status: all files reviewed at latest revision, 4 unresolved discussions, all commit checks successful. storage/replica_state.go, line 32 [r35] (raw file):
|
|
Review status: all files reviewed at latest revision, 4 unresolved discussions, some commit checks pending. storage/replica_raftstorage.go, line 59 [r16] (raw file):
|
|
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending. storage/replica_raftstorage.go, line 59 [r16] (raw file):
isn't that assertable? Comments from Reviewable |
|
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending. storage/replica_raftstorage.go, line 59 [r16] (raw file):
|
|
Review status: all files reviewed at latest revision, 3 unresolved discussions, some commit checks pending. storage/replica_raftstorage.go, line 59 [r16] (raw file):
|
In cockroachdb#7310, we added (and enforced) assumptions on what was persisted to disk, but did not ensure that older versions were migrated properly. This in turn lead to cockroachdb#7389. A migration is only necessary for ranges which had never been truncated, and so it should be fairly rare in practice. The migration simply writes the truncated state that would otherwise been synthesized by the old code, which is technically a consistency violation, but should work well in practice because all nodes will write that state on startup when they are rebooted.
In cockroachdb#7310, we added (and enforced) assumptions on what was persisted to disk, but did not ensure that older versions were migrated properly. This in turn lead to cockroachdb#7389. A migration is only necessary for ranges which had never been truncated, and so it should be fairly rare in practice. The migration simply writes the truncated state that would otherwise been synthesized by the old code, which is technically a consistency violation, but should work well in practice because all nodes will write that state on startup when they are rebooted.
Also straightened out some kinks with the existing reference tests, namely that they were creating two single-node clusters per test which were never used. Instead, we can now create zero-node clusters and run one-shot containers on it (one would argue that that's still weird, but it seems acceptable for the time being). The newly added reference test will crash when run for commits between cockroachdb#7310 and cockroachdb#7429. See cockroachdb/postgres-test#13 for the initial state used in the test.
Also straightened out some kinks with the existing reference tests, namely that they were creating two single-node clusters per test which were never used. Instead, we can now create zero-node clusters and run one-shot containers on it (one would argue that that's still weird, but it seems acceptable for the time being). The newly added reference test will crash when run for commits between cockroachdb#7310 and cockroachdb#7429. See cockroachdb/postgres-test#13 for the initial state used in the test.
Also straightened out some kinks with the existing reference tests, namely that they were creating two single-node clusters per test which were never used. Instead, we can now create zero-node clusters and run one-shot containers on it (one would argue that that's still weird, but it seems acceptable for the time being). The newly added reference test will crash when run for commits between cockroachdb#7310 and cockroachdb#7429. See cockroachdb/postgres-test#13 for the initial state used in the test.
Also straightened out some kinks with the existing reference tests, namely that they were creating two single-node clusters per test which were never used. Instead, we can now create zero-node clusters and run one-shot containers on it (one would argue that that's still weird, but it seems acceptable for the time being). The newly added reference test will crash when run for commits between cockroachdb#7310 and cockroachdb#7429. See cockroachdb/postgres-test#13 for the initial state used in the test.
In cockroachdb#7310, we added (and enforced) assumptions on what was persisted to disk, but did not ensure that older versions were migrated properly. This in turn lead to cockroachdb#7389. A migration is only necessary for ranges which had never been truncated, and so it should be fairly rare in practice. The migration simply writes the truncated state that would otherwise been synthesized by the old code, which is technically a consistency violation, but should work well in practice because all nodes will write that state on startup when they are rebooted.
Also straightened out some kinks with the existing reference tests, namely that they were creating two single-node clusters per test which were never used. Instead, we can now create zero-node clusters and run one-shot containers on it (one would argue that that's still weird, but it seems acceptable for the time being). The newly added reference test will crash when run for commits between cockroachdb#7310 and cockroachdb#7429. See cockroachdb/postgres-test#13 for the initial state used in the test.
In cockroachdb#7310, we added (and enforced) assumptions on what was persisted to disk, but did not ensure that older versions were migrated properly. This in turn lead to cockroachdb#7389. A migration is only necessary for ranges which had never been truncated, and so it should be fairly rare in practice. The migration simply writes the truncated state that would otherwise been synthesized by the old code, which is technically a consistency violation, but should work well in practice because all nodes will write that state on startup when they are rebooted.
Also straightened out some kinks with the existing reference tests, namely that they were creating two single-node clusters per test which were never used. Instead, we can now create zero-node clusters and run one-shot containers on it (one would argue that that's still weird, but it seems acceptable for the time being). The newly added reference test will crash when run for commits between cockroachdb#7310 and cockroachdb#7429. See cockroachdb/postgres-test#13 for the initial state used in the test.
Change the initialization of Raft groups so that whenever we create a Replica
for a "new" Raft group (as opposed to creating a Replica that will receive
a snapshot), we write an initial ReplicaState and associated Raft state.
The WIP part is in the last commit which disables some tests (all of which, I think,
create replicas out-of-band and thus don't properly initialize the state). Some of
them may actually pass, but they shouldn't (I flagged them while I still had an
assertion in there which I'm planning to put in again).
We're still updating in-mem and on-disk state kind of ad-hoc, but that's for a future
PR.
This change is