Skip to content

storage: write a tombstone with math.MaxInt32 when processing a merge#40814

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
ajwerner:ajwerner/write-max-int-tombstone-when-applying-a-merge
Sep 20, 2019
Merged

storage: write a tombstone with math.MaxInt32 when processing a merge#40814
craig[bot] merged 1 commit intocockroachdb:masterfrom
ajwerner:ajwerner/write-max-int-tombstone-when-applying-a-merge

Conversation

@ajwerner
Copy link
Copy Markdown
Contributor

Before this PR we would write such a tombstone when detecting that a range had
been merged via a snapshot or via the replica gc queue but curiously not when
merging the range by applying a merge.

Release Justification: Came across this oddity while working on updating tests
for #40751. Maybe is not justified.

Release note: None

@ajwerner ajwerner requested a review from nvb September 16, 2019 22:03
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@ajwerner
Copy link
Copy Markdown
Contributor Author

@benesch any interest in chiming in on this one? Am I missing anything? It seems like if we're at this critical phase of a merge we know the RHS has its terminal replica ID.

Copy link
Copy Markdown
Contributor

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: but let's wait for @benesch to chime in.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)

Copy link
Copy Markdown
Contributor

@benesch benesch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

It's been a while (holy cow, 9 months!) since I've really thought about range merges, so I'm not 1000% sure that this is correct. But it certainly seems legit, and it's not like there was a particular reason that the code looked like this. Rather the GC queue using MaxInt was quick fix when I realized I didn't have NextReplicaID available there, and I didn't think to propagate that change back to the normal subsumption code path.

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/storage/replica_application_state_machine.go, line 571 at r1 (raw file):

		const mustClearRange = false
		if err := rhsRepl.preDestroyRaftMuLocked(
			ctx, b.batch, b.batch, math.MaxInt32, rangeIDLocalOnly, mustClearRange,

Not that it really matters much to me, but it seems like it might be worth leaving a comment here along these lines:

// Use math.MaxInt32 as the nextReplicaID as an extra safeguard against creating
// new replicas of the RHS. This isn't required for correctness, since the merge
// protocol should guarantee that no new replicas of the RHS can ever be
// created, but it doesn't hurt to be careful.

The comment in the GC queue (

// We don't have the last NextReplicaID for the subsumed range, nor can we
// obtain it, but that's OK: we can just be conservative and use the maximum
// possible replica ID. store.RemoveReplica will write a tombstone using
// this maximum possible replica ID, which would normally be problematic, as
// it would prevent this store from ever having a new replica of the removed
// range. In this case, however, it's copacetic, as subsumed ranges _can't_
// have new replicas.
) should also probably be updated to read "a replica ID of max int is what we do on merges; see runPreApplyTriggers for details" rather than "oh shoot we don't have the NextReplicaID, lol, guess we can just stick maxint here."

Before this PR we would write such a tombstone when detecting that a range had
been merged via a snapshot or via the replica gc queue but curiously not when
merging the range by applying a merge.

Release Justification: Came across this oddity while working on updating tests
for cockroachdb#40751.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/write-max-int-tombstone-when-applying-a-merge branch from bee452a to 264c71d Compare September 20, 2019 12:48
Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @benesch)


pkg/storage/replica_application_state_machine.go, line 571 at r1 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

Not that it really matters much to me, but it seems like it might be worth leaving a comment here along these lines:

// Use math.MaxInt32 as the nextReplicaID as an extra safeguard against creating
// new replicas of the RHS. This isn't required for correctness, since the merge
// protocol should guarantee that no new replicas of the RHS can ever be
// created, but it doesn't hurt to be careful.

The comment in the GC queue (

// We don't have the last NextReplicaID for the subsumed range, nor can we
// obtain it, but that's OK: we can just be conservative and use the maximum
// possible replica ID. store.RemoveReplica will write a tombstone using
// this maximum possible replica ID, which would normally be problematic, as
// it would prevent this store from ever having a new replica of the removed
// range. In this case, however, it's copacetic, as subsumed ranges _can't_
// have new replicas.
) should also probably be updated to read "a replica ID of max int is what we do on merges; see runPreApplyTriggers for details" rather than "oh shoot we don't have the NextReplicaID, lol, guess we can just stick maxint here."

Done.

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Sep 20, 2019

Build failed

Copy link
Copy Markdown
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

zerosum-splits flake.

bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @benesch)

craig bot pushed a commit that referenced this pull request Sep 20, 2019
40814: storage: write a tombstone with math.MaxInt32 when processing a merge r=ajwerner a=ajwerner

Before this PR we would write such a tombstone when detecting that a range had
been merged via a snapshot or via the replica gc queue but curiously not when
merging the range by applying a merge.

Release Justification: Came across this oddity while working on updating tests
for #40751. Maybe is not justified.

Release note: None

Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Sep 20, 2019

Build succeeded

@craig craig bot merged commit 264c71d into cockroachdb:master Sep 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants