kvserver: transfer lease in JOINT config directly to a VOTER_INCOMING…#77841
Conversation
e6e4b58 to
a29000a
Compare
a29000a to
2e098a8
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 9 of 9 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @shralex)
pkg/kv/kvserver/replica_command.go, line 1229 at r1 (raw file):
if voterIncomingTarget == (roachpb.ReplicaDescriptor{}) { err := errors.Errorf( "no VOTER_INCOMING to transfer lease to for r%d (should not happen)", desc.RangeID)
I'm hopeful that we never end up in this situation, but if we do, what's the best recourse? Killing the leaseholder node to force the lease to move might be one approach.
The other might be ALTER RANGE <rangeid> RELOCATE LEASE TO <another_replicas_store_id>. I think it would be worthwhile to intentionally create this situation manually (e.g. in a local roachprod cluster with broken code in the replicate queue) and then confirm that the ALTER TABLE command works to escape it. Better to do that now than during a P1 outage.
If that works, then let's put the exact command to run to escape in the error message. Also, we should probably be logging a much louder error. Either log.Warningf or log.Errorf.
pkg/kv/kvserver/replica_command.go, line 2883 at r1 (raw file):
} } else if len(ops) == 0 { if len(voterTargets) > 0 && transferLeaseToFirstVoter {
Is this dead code? Doesn't len(ops) == 0 imply !containsVoterIncoming and therefore !lhRemovalAllowed?
pkg/roachpb/metadata.go, line 331 at r1 (raw file):
that is different from the provided leaseHolderID
This part looks stale.
pkg/kv/kvserver/client_replica_test.go, line 1854 at r1 (raw file):
} func TestRemoveLeaseholder(t *testing.T) {
We have sufficient test coverage of the lateral transfer case even without this test, right?
2e098a8 to
16d27da
Compare
shralex
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @nvanbenschoten, and @shralex)
pkg/kv/kvserver/replica_command.go, line 1229 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm hopeful that we never end up in this situation, but if we do, what's the best recourse? Killing the leaseholder node to force the lease to move might be one approach.
The other might be
ALTER RANGE <rangeid> RELOCATE LEASE TO <another_replicas_store_id>. I think it would be worthwhile to intentionally create this situation manually (e.g. in a local roachprod cluster with broken code in the replicate queue) and then confirm that theALTER TABLEcommand works to escape it. Better to do that now than during a P1 outage.If that works, then let's put the exact command to run to escape in the error message. Also, we should probably be logging a much louder error. Either
log.Warningforlog.Errorf.
It would be nice not to rely on manual intervention. Added a log.fatal
Code quote:
"no VOTER_INCOMING to transfer lease to for r%d (should not happen)pkg/kv/kvserver/replica_command.go, line 2883 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is this dead code? Doesn't
len(ops) == 0imply!containsVoterIncomingand therefore!lhRemovalAllowed?
good catch, removed
pkg/kv/kvserver/client_replica_test.go, line 1854 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We have sufficient test coverage of the lateral transfer case even without this test, right?
I think so, for example TestLeaseholderRelocate, TestAdminRelocateRange
nvb
left a comment
There was a problem hiding this comment.
once the tests are stabilized and CI is green.
Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @shralex)
pkg/kv/kvserver/replica_command.go, line 1229 at r1 (raw file):
Previously, shralex wrote…
It would be nice not to rely on manual intervention. Added a log.fatal
Is there anything more we'd like in this error message to help us diagnose any bugs in this area? Maybe the full range descriptor?
16d27da to
f901d4c
Compare
aayushshah15
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten and @shralex)
pkg/kv/kvserver/replica_command.go, line 1231 at r2 (raw file):
// JOINT config if the leaseholder is being removed, when there is a VOTER_INCOMING replica. // Killing the leaseholder to force lease transfer. log.Fatalf(ctx, "no VOTER_INCOMING to transfer lease to for r%d (should not happen)", desc.RangeID)
nit: since it's a Fatal now, let's remove the "should not happen" bit from the log message.
pkg/kv/kvserver/replica_command.go, line 2865 at r2 (raw file):
r.store.cfg.Settings.Version.IsActive(ctx, clusterversion.EnableLeaseHolderRemoval) if !lhRemovalAllowed {
Isn't this diff/chunk redundant since relocateOne already checks the cluster setting?
pkg/kv/kvserver/replica_command.go, line 2888 at r2 (raw file):
return rangeDesc, ctx.Err() } } else if len(ops) == 0 {
I'm surprised that we don't need this anymore. Don't we need this to uphold AdminRelocateRange's contract?
pkg/kv/kvserver/replica_raft.go, line 373 at r2 (raw file):
clusterversion.EnableLeaseHolderRemoval) // Previously, we were not allowed to enter a joint config where the // leaseholder is being removed (i.e., not a voter). In the new version
Does the "(not a voter)" bit seem correct here?
pkg/kv/kvserver/client_replica_test.go, line 1854 at r2 (raw file):
} func TestRemoveLeaseholder(t *testing.T) {
Instead of removing the test, could we just change it to assert that removing the leaseholder (without a replacement) fails the test in the way that we expect (i.e. perhaps we could revert the test to what it was before #74077?)
f901d4c to
ed5326d
Compare
shralex
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15 and @nvanbenschoten)
pkg/kv/kvserver/replica_command.go, line 1229 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Is there anything more we'd like in this error message to help us diagnose any bugs in this area? Maybe the full range descriptor?
Done
pkg/kv/kvserver/replica_command.go, line 1231 at r2 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
nit: since it's a
Fatalnow, let's remove the "should not happen" bit from the log message.
Done
pkg/kv/kvserver/replica_command.go, line 2865 at r2 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
Isn't this diff/chunk redundant since
relocateOnealready checks the cluster setting?
Right, I now changed this so that lhRemovalAllowed is returned by relocateOne
pkg/kv/kvserver/replica_command.go, line 2888 at r2 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
I'm surprised that we don't need this anymore. Don't we need this to uphold
AdminRelocateRange's contract?
Nathan observed that len(ops) == 0 will never be true if lhRemovalAllowed = true, because in this case there's an incoming voter being added, so len(ops) > 0
pkg/kv/kvserver/replica_raft.go, line 373 at r2 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
Does the "(not a voter)" bit seem correct here?
Changed to "full voter", but otherwise seems ok...
Code quote:
/ Previously, we were not allowed to enter a joint config where thepkg/kv/kvserver/client_replica_test.go, line 1854 at r2 (raw file):
Previously, aayushshah15 (Aayush Shah) wrote…
Instead of removing the test, could we just change it to assert that removing the leaseholder (without a replacement) fails the test in the way that we expect (i.e. perhaps we could revert the test to what it was before #74077?)
I believe it's a new test added in #74077. In any case, I added it back -- RemoveLeaseHolderOrFatal removes the lease prior to removing the leaseholder and the test passes. Another test (TestLeaseHolderRemoveSelf) tests the error condition.
ed5326d to
3b4cbf2
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 2 of 3 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15)
pkg/kv/kvserver/replica_command.go, line 1229 at r1 (raw file):
Previously, shralex wrote…
Done
nit: I don't think we need the for r%d part. This should already be printed in the log tags (passed through the annotated context) and it will be part of the descriptor.
pkg/kv/kvserver/replica_command.go, line 2865 at r2 (raw file):
Previously, shralex wrote…
Right, I now changed this so that lhRemovalAllowed is returned by relocateOne
Now that this has been pushed into relocateOne, it's surprising while reading this that lhRemovalAllowed imples len(ops) > 0. Do we even need this lhRemovalAllowed flag? Can relocateOne use it internally to determine whether to even return a leaseTarget?
3b4cbf2 to
7599f2a
Compare
7599f2a to
588f76e
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 4 of 4 files at r7, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aayushshah15 and @shralex)
pkg/kv/kvserver/replica_command.go, line 3151 at r7 (raw file):
} if len(ops) == 0 && transferLeaseToFirstVoter && !lhRemovalAllowed {
If len(ops) == 0, do we need to check lhRemovalAllowed?
588f76e to
2db6134
Compare
nvb
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r8, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15)
aayushshah15
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @aayushshah15)
… replica In cockroachdb#74077 we allowed entering a JOINT config when the leaseholder is being demoted, in which case we attempted to transfer the lease away before leaving the JOINT config. Looking for a lease transfer target at that stage created flakiness, since in some situations this target isn't found and we need to retry. This PR takes the approach that when the leaseholder is being removed, we should enter the JOINT config only if there is a VOTER_INCOMING replica. In that case, the lease is transferred directly to that replica, without further checks. And otherwise, the lease must be transferred before attempting to remove the leaseholder. Release justification: fixes flakiness caused by cockroachdb#74077 Release note: None
2db6134 to
40fd5d3
Compare
|
bors r+ |
|
Build failed (retrying...): |
|
Build failed (retrying...): |
|
bors r+ |
|
Already running a review |
|
Build failed (retrying...): |
|
Build succeeded: |
…when it isn't In cockroachdb#77841, we only remove the leaseholder if there is a VOTER_INCOMING replica in the same JOINT config. But it is also possible that a node A was the lesaeholder, and then the lease gets transferred to node B, who tries to remove A. Since B isn't getting removed, it is fine to enter a JOINT config without a VOTER_INCOMING node. But A might still believe its the leaseholder, and this might cause it to crash when it finds out that its in a JOINT config and there is no INCOMING node. In this case, we can just continue, since an error will be thrown in replica_raft.go Release note: None Release justification: fixes flakiness caused by cockroachdb#77841
…when it isn't In cockroachdb#77841, we only remove the leaseholder if there is a VOTER_INCOMING replica in the same JOINT config. But it is also possible that a node A was the lesaeholder, and then the lease gets transferred to node B, who tries to remove A. Since B isn't getting removed, it is fine to enter a JOINT config without a VOTER_INCOMING node. But A might still believe its the leaseholder, and this might cause it to crash when it finds out that its in a JOINT config and there is no INCOMING node. In this case, we can just continue, since an error will be thrown in replica_raft.go Release note: None Release justification: fixes flakiness caused by cockroachdb#77841
78438: kvserver: replica might crash evaluating a condition as leaseholder, … r=shralex a=shralex …when it isn't In #77841, we only remove the leaseholder if there is a VOTER_INCOMING replica in the same JOINT config. But it is also possible that a node A was the lesaeholder, and then the lease gets transferred to node B, who tries to remove A. Since B isn't getting removed, it is fine to enter a JOINT config without a VOTER_INCOMING node. But A might still believe its the leaseholder, and this might cause it to crash when it finds out that its in a JOINT config and there is no INCOMING node. In this case, we can just continue, since an error will be thrown in replica_raft.go This resolves #77937 Release note: None Release justification: fixes flakiness caused by #77841 Co-authored-by: shralex <shralex@gmail.com>
|
We intend on backporting this PR and #78438 to |
…when it isn't In cockroachdb#77841, we only remove the leaseholder if there is a VOTER_INCOMING replica in the same JOINT config. But it is also possible that a node A was the lesaeholder, and then the lease gets transferred to node B, who tries to remove A. Since B isn't getting removed, it is fine to enter a JOINT config without a VOTER_INCOMING node. But A might still believe its the leaseholder, and this might cause it to crash when it finds out that its in a JOINT config and there is no INCOMING node. In this case, we can just continue, since an error will be thrown in replica_raft.go Release note: None Release justification: fixes flakiness caused by cockroachdb#77841
… replica
In #74077 we allowed entering a JOINT
config when the leaseholder is being demoted, in which case we attempted to transfer the lease
away before leaving the JOINT config. Looking for a lease transfer target at that stage created
flakiness, since in some situations this target isn't found and we need to retry. This PR
takes the approach that when the leaseholder is being removed, we should enter the JOINT config
only if there is a VOTER_INCOMING replica. In that case, the lease is transferred directly to
that replica, without further checks. And otherwise, the lease must be transferred before attempting
to remove the leaseholder.
Release justification: fixes flakiness caused by #74077
Release note: None
Experiments to demonstrate that the benefits of #74077
remain with this PR. Please refer to the #74077 for details.