storage: write a tombstone with math.MaxInt32 when processing a merge#40814
Conversation
|
@benesch any interest in chiming in on this one? Am I missing anything? It seems like if we're at this critical phase of a merge we know the RHS has its terminal replica ID. |
nvb
left a comment
There was a problem hiding this comment.
but let's wait for @benesch to chime in.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
benesch
left a comment
There was a problem hiding this comment.
It's been a while (holy cow, 9 months!) since I've really thought about range merges, so I'm not 1000% sure that this is correct. But it certainly seems legit, and it's not like there was a particular reason that the code looked like this. Rather the GC queue using MaxInt was quick fix when I realized I didn't have NextReplicaID available there, and I didn't think to propagate that change back to the normal subsumption code path.
Reviewed 2 of 2 files at r1.
Reviewable status:complete! 2 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/storage/replica_application_state_machine.go, line 571 at r1 (raw file):
const mustClearRange = false if err := rhsRepl.preDestroyRaftMuLocked( ctx, b.batch, b.batch, math.MaxInt32, rangeIDLocalOnly, mustClearRange,
Not that it really matters much to me, but it seems like it might be worth leaving a comment here along these lines:
// Use math.MaxInt32 as the nextReplicaID as an extra safeguard against creating
// new replicas of the RHS. This isn't required for correctness, since the merge
// protocol should guarantee that no new replicas of the RHS can ever be
// created, but it doesn't hurt to be careful.
The comment in the GC queue (
cockroach/pkg/storage/replica_gc_queue.go
Lines 331 to 337 in ad4d6f5
Before this PR we would write such a tombstone when detecting that a range had been merged via a snapshot or via the replica gc queue but curiously not when merging the range by applying a merge. Release Justification: Came across this oddity while working on updating tests for cockroachdb#40751. Release note: None
bee452a to
264c71d
Compare
ajwerner
left a comment
There was a problem hiding this comment.
TFTR!
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @benesch)
pkg/storage/replica_application_state_machine.go, line 571 at r1 (raw file):
Previously, benesch (Nikhil Benesch) wrote…
Not that it really matters much to me, but it seems like it might be worth leaving a comment here along these lines:
// Use math.MaxInt32 as the nextReplicaID as an extra safeguard against creating // new replicas of the RHS. This isn't required for correctness, since the merge // protocol should guarantee that no new replicas of the RHS can ever be // created, but it doesn't hurt to be careful.The comment in the GC queue (
) should also probably be updated to read "a replica ID of max int is what we do on merges; see runPreApplyTriggers for details" rather than "oh shoot we don't have the NextReplicaID, lol, guess we can just stick maxint here."cockroach/pkg/storage/replica_gc_queue.go
Lines 331 to 337 in ad4d6f5
Done.
Build failed |
ajwerner
left a comment
There was a problem hiding this comment.
zerosum-splits flake.
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @benesch)
40814: storage: write a tombstone with math.MaxInt32 when processing a merge r=ajwerner a=ajwerner Before this PR we would write such a tombstone when detecting that a range had been merged via a snapshot or via the replica gc queue but curiously not when merging the range by applying a merge. Release Justification: Came across this oddity while working on updating tests for #40751. Maybe is not justified. Release note: None Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
Build succeeded |
Before this PR we would write such a tombstone when detecting that a range had
been merged via a snapshot or via the replica gc queue but curiously not when
merging the range by applying a merge.
Release Justification: Came across this oddity while working on updating tests
for #40751. Maybe is not justified.
Release note: None