kvserver: allow lease extensions from followers in propBuf#61982
kvserver: allow lease extensions from followers in propBuf#61982craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
91f0e3a to
1239745
Compare
erikgrinaker
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @nvanbenschoten, and @tbg)
pkg/roachpb/api.go, line 281 at r1 (raw file):
func (rlr *RequestLeaseRequest) IsExtension() bool { return rlr.PrevLease.Replica.StoreID == rlr.Lease.Replica.StoreID && rlr.PrevLease.Equivalent(rlr.Lease)
I had to add Equivalent() as a condition here, such that a lease request in a new epoch is not considered an extension of a lease in a previous epoch. Not sure if this makes sense or what the implications are, would appreciate some extra eyes on it.
This was added to make TestRequestsOnLaggingReplica pass -- otherwise, when a partitioned leader rejoined the cluster in a later epoch it would consider its lease request to be an extension of its previous lease (in the previous epoch), causing a request to the old leader to block rather than redirect to the new leader. I believe this scenario was the entire motivation for rejecting lease acquisition proposals in the first place.
Equivalent() does not check whether expiration-based leases have expired (only epoch-based leases), which I think could cause the same problem for system ranges. We could e.g. check it against the current time, or even just compare with the current local lease in propBuf.leaseStatus to check whether the lease extension proposal is equivalent and still valid.
tbg
left a comment
There was a problem hiding this comment.
Reviewed 3 of 3 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)
pkg/roachpb/api.go, line 281 at r1 (raw file):
Previously, erikgrinaker (Erik Grinaker) wrote…
I had to add
Equivalent()as a condition here, such that a lease request in a new epoch is not considered an extension of a lease in a previous epoch. Not sure if this makes sense or what the implications are, would appreciate some extra eyes on it.This was added to make
TestRequestsOnLaggingReplicapass -- otherwise, when a partitioned leader rejoined the cluster in a later epoch it would consider its lease request to be an extension of its previous lease (in the previous epoch), causing a request to the old leader to block rather than redirect to the new leader. I believe this scenario was the entire motivation for rejecting lease acquisition proposals in the first place.
Equivalent()does not check whether expiration-based leases have expired (only epoch-based leases), which I think could cause the same problem for system ranges. We could e.g. check it against the current time, or even just compare with the current local lease inpropBuf.leaseStatusto check whether the lease extension proposal is equivalent and still valid.
This makes sense, and you have a good point regarding expiration-based leases. It almost seems as though we should include an epoch even in expiration-based leases, unfortunately this isn't going to work since a zero epoch is what encodes a lease as expiration-based:
Lines 1902 to 1904 in e924d91
Comparing against anything from the local replica state isn't going to help us with the problem of requesting a lease based on a stale lease (& leaving requests waiting for our replica to catch up), so it would have to be a timestamp-based route that we're taking. This is mildly annoying as well, as there's no obvious way to distinguish a lease extension (after the range has been dormant for a while) from the kind of extension we want to block. @nvanbenschoten do you have an idea? Maybe adding an auxiliary epoch field that's only used for expiration-based leases? I don't like the look of that but it seems better than trying to use timestamps.
andreimatei
left a comment
There was a problem hiding this comment.
Please make the commit message explicit about the fact that the change is only about expiration-based leases (right?)
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @nvanbenschoten)
pkg/kv/kvserver/replica_proposal_buf.go, line 581 at r1 (raw file):
// also send lease extensions for an existing leaseholder. if isExtension { log.VEventf(ctx, 4, "proposing lease extension even though we're not the leader")
nit: 4 is to high, generally only used for extremely verbose messages in loops. Use the more usual 2.
pkg/kv/kvserver/replica_proposal_buf_test.go, line 269 at r1 (raw file):
var data []byte if leaseReq { pd, data = pc.newLeaseProposal(roachpb.Lease{}, false)
false /* newLeaseExtension */ to be in compliance with the style guide
But much better to give them names:
type leaseOpt bool
const (
brandNewLease leaseOpt = false
extension = true
)
pkg/roachpb/api.go, line 278 at r1 (raw file):
// IsExtension returns whether the lease request is an extension of an existing // lease or whether it is claiming the lease away from a previous owner.
nit: s/or whether it is claiming the lease away from a previous owner//
The disjunction is confusing because it's not clear if it refers to the cases where IsExtension returns true or if it refers to the distinction between true or false. Consider clarifying, or better yet just remove the latter part because it's obvious enough.
pkg/roachpb/api.go, line 281 at r1 (raw file):
Equivalent() does not check whether expiration-based leases have expired (only epoch-based leases), which I think could cause the same problem for system ranges. We could e.g. check it against the current time, or even just compare with the current local lease in propBuf.leaseStatus to check whether the lease extension proposal is equivalent and still valid.
Equivalentdoesn't check whether either type of lease has expired - it doesn't even have access to a clock.
Line 1927 in e924d91
But otherwise I think you're right - I think we should check whether the current lease is valid. That's what it should mean for a "lease" to be an extension. One can't extend a lease it no longer has.
nvb
left a comment
There was a problem hiding this comment.
Reviewed 1 of 3 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @erikgrinaker)
pkg/kv/kvserver/replica_proposal_buf_test.go, line 269 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
false /* newLeaseExtension */to be in compliance with the style guide
But much better to give them names:type leaseOpt bool const ( brandNewLease leaseOpt = false extension = true )
+1 for false /* newLeaseExtension */. Please don't introduce an enum for a test helper function that's called 4 times, all in the same file.
pkg/kv/kvserver/replica_proposal_buf_test.go, line 591 at r1 (raw file):
}, { name: "follower, lease extension to known eligible leader",
This name seems a little off. The follower is the one who holds the lease and is performing an extension, right? So should it be follower, lease extension despite known eligible leader or something along those lines?
pkg/roachpb/api.go, line 281 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
Equivalent() does not check whether expiration-based leases have expired (only epoch-based leases), which I think could cause the same problem for system ranges. We could e.g. check it against the current time, or even just compare with the current local lease in propBuf.leaseStatus to check whether the lease extension proposal is equivalent and still valid.
Equivalentdoesn't check whether either type of lease has expired - it doesn't even have access to a clock.
Line 1927 in e924d91
But otherwise I think you're right - I think we should check whether the current lease is valid. That's what it should mean for a "lease" to be an extension. One can't extend a lease it no longer has.
This is an interesting dilemma, as it strikes at the question of what constitutes a lease transfer and what lease requests are safe from getting wedged. I don't think the two sets are equivalent.
We consider a lease "extension" at different parts of the code to be one that does not change the lease's owner. We make no attempt to say that the existing lease is still valid - if it is not and has already been replaced, the lease extension will fail.
An argument could be made that a lease that is replaced by a new lease with the same owner but a new sequence is not a true "extension", because writes proposed under the old lease will be rejected if reordered and applying after the new lease. This is the case during a node restart, where the node's minLeaseProposedTS is bumped, triggering this logic to force a new lease sequence number - the increased start time leads to !Equivalent() here. If we wanted to detect this case specifically from the lease request, we could compare PrevLease.Start to MinProposedTS.
But neither of these definitions help classify when a lease request is safe from getting wedged or not. In either case, it would be possible for the leaseholder to have been partitioned, replaced non-cooperatively, and fallen far behind on its log. What it sounds like you're getting at is that to feel sufficiently comfortable ignoring the lease-request-on-follower check, we'd want to make sure that the leaseholder couldn't have lost its lease non-cooperatively to some other replica. This gives us a reasonable assurance that the proposing replica doesn't have a wall of log entries waiting for it behind a network partition that it will need to wait on before hearing back about its lease request.
If that's the case, then maybe we shouldn't be worrying about the request and whether it looks like an extension at all. Maybe we should just be checking (*Replica).ownsValidLeaseRLocked in FlushLockedWithRaftGroup to determine whether we should allow a lease request to proceed on a follower.
erikgrinaker
left a comment
There was a problem hiding this comment.
Please make the commit message explicit about the fact that the change is only about expiration-based leases
It's about epoch-based leases too. The initial suggestion was to forward all lease extensions (as defined by the previous and current lease having the same store ID in the lease request). However, this broke TestRequestsOnLaggingReplica since an epoch-based lease was being reclaimed by an old leaseholder in a new epoch, which was considered an extension, blocking requests to the old leaseholder.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @erikgrinaker, @nvanbenschoten, and @tbg)
pkg/kv/kvserver/replica_proposal_buf.go, line 581 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit:
4is to high, generally only used for extremely verbose messages in loops. Use the more usual2.
Sure. I figured this was a fairly benign scenario compared to the others, and might be unnecessarily noisy since it'd be output on every lease extension, but will bump it to 2.
pkg/kv/kvserver/replica_proposal_buf_test.go, line 269 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
+1 for
false /* newLeaseExtension */. Please don't introduce an enum for a test helper function that's called 4 times, all in the same file.
Will do.
pkg/kv/kvserver/replica_proposal_buf_test.go, line 591 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This name seems a little off. The follower is the one who holds the lease and is performing an extension, right? So should it be
follower, lease extension despite known eligible leaderor something along those lines?
Makes sense, thanks.
pkg/roachpb/api.go, line 281 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This is an interesting dilemma, as it strikes at the question of what constitutes a lease transfer and what lease requests are safe from getting wedged. I don't think the two sets are equivalent.
We consider a lease "extension" at different parts of the code to be one that does not change the lease's owner. We make no attempt to say that the existing lease is still valid - if it is not and has already been replaced, the lease extension will fail.
An argument could be made that a lease that is replaced by a new lease with the same owner but a new sequence is not a true "extension", because writes proposed under the old lease will be rejected if reordered and applying after the new lease. This is the case during a node restart, where the node's
minLeaseProposedTSis bumped, triggering this logic to force a new lease sequence number - the increased start time leads to!Equivalent()here. If we wanted to detect this case specifically from the lease request, we could comparePrevLease.StarttoMinProposedTS.But neither of these definitions help classify when a lease request is safe from getting wedged or not. In either case, it would be possible for the leaseholder to have been partitioned, replaced non-cooperatively, and fallen far behind on its log. What it sounds like you're getting at is that to feel sufficiently comfortable ignoring the lease-request-on-follower check, we'd want to make sure that the leaseholder couldn't have lost its lease non-cooperatively to some other replica. This gives us a reasonable assurance that the proposing replica doesn't have a wall of log entries waiting for it behind a network partition that it will need to wait on before hearing back about its lease request.
If that's the case, then maybe we shouldn't be worrying about the request and whether it looks like an extension at all. Maybe we should just be checking
(*Replica).ownsValidLeaseRLockedinFlushLockedWithRaftGroupto determine whether we should allow a lease request to proceed on a follower.
Yep, great summary. I think ownsValidLeaseRLocked() might be the right solution here, but I wasn't sure if we could sometimes be proxying lease requests from other replicas or if the proposal buffer was exclusively for proposals the node generated itself. I'll try implementing this.
1239745 to
26584e4
Compare
erikgrinaker
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @erikgrinaker, @nvanbenschoten, and @tbg)
pkg/roachpb/api.go, line 281 at r1 (raw file):
Previously, erikgrinaker (Erik Grinaker) wrote…
Yep, great summary. I think
ownsValidLeaseRLocked()might be the right solution here, but I wasn't sure if we could sometimes be proxying lease requests from other replicas or if the proposal buffer was exclusively for proposals the node generated itself. I'll try implementing this.
Updated the PR, let me know what you think.
26584e4 to
1beb8c6
Compare
andreimatei
left a comment
There was a problem hiding this comment.
Please make the commit message explicit about the fact that the change is only about expiration-based leases
It's about epoch-based leases too. The initial suggestion was to forward all lease extensions (as defined by the previous and current lease having the same store ID in the lease request). However, this broke TestRequestsOnLaggingReplica since an epoch-based lease was being reclaimed by an old leaseholder in a new epoch, which was considered an extension, blocking requests to the old leaseholder.
But we've now established that "as defined by the previous and current lease having the same store ID" is the wrong definition for an extension, right? So what remains only affects expiration-based leases, I think? Like, there's no such thing as "an extension" for epoch-based leases, is there? For epoch-based ranges, a replica with a valid lease never proposes a new lease (does it?).
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @erikgrinaker, @nvanbenschoten, and @tbg)
pkg/kv/kvserver/replica_proposal_buf.go, line 567 at r2 (raw file):
// lease. In that case, we have no choice but to continue with the proposal. // // Lease extensions for a currently held lease always go through.
please explain why they always go through in this comment
pkg/kv/kvserver/replica_proposal_buf.go, line 571 at r2 (raw file):
req := p.Request.Requests[0].GetRequestLease() leaderKnownAndEligible := leaderInfo.leaderKnown && leaderInfo.leaderEligibleForLease isValidExtension := req.PrevLease.Replica.StoreID == req.Lease.Replica.StoreID &&
nit: I'd strike the "valid" from isValidExtension. What's an "invalid extension"?
pkg/kv/kvserver/replica_proposal_buf.go, line 571 at r2 (raw file):
req := p.Request.Requests[0].GetRequestLease() leaderKnownAndEligible := leaderInfo.leaderKnown && leaderInfo.leaderEligibleForLease isValidExtension := req.PrevLease.Replica.StoreID == req.Lease.Replica.StoreID &&
is the req.PrevLease.Replica.StoreID == req.Lease.Replica.StoreID part necessary? What would go wrong if we just removed it?
RequestLeaseRequest always asks for a lease for the sender of the respective request, not for some other rando (unless you know otherwise?).
nvb
left a comment
There was a problem hiding this comment.
Thanks @erikgrinaker!
I think we can also address a TODO in setupLeaseTransferTest that references this issue. But when doing so, please stress TestLeaseExpirationBelowFutureTimeRequest a bit to make sure we aren't introducing any flakiness.
make stress PKG=./pkg/kv/kvserver TESTS= TestLeaseExpirationBelowFutureTimeRequest
Reviewed 2 of 3 files at r2.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @erikgrinaker)
pkg/kv/kvserver/replica_proposal_buf.go, line 581 at r1 (raw file):
Previously, erikgrinaker (Erik Grinaker) wrote…
Sure. I figured this was a fairly benign scenario compared to the others, and might be unnecessarily noisy since it'd be output on every lease extension, but will bump it to 2.
Should we explain why we're proposing like we do in other cases? ; we hold the current lease
pkg/kv/kvserver/replica_proposal_buf.go, line 571 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
is the
req.PrevLease.Replica.StoreID == req.Lease.Replica.StoreIDpart necessary? What would go wrong if we just removed it?
RequestLeaseRequestalways asks for a lease for the sender of the respective request, not for some other rando (unless you know otherwise?).
I was thinking the same thing. I don't see why we still need to look at the request at all.
pkg/kv/kvserver/replica_proposal_buf.go, line 571 at r2 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
nit: I'd strike the "valid" from
isValidExtension. What's an "invalid extension"?
I'd also rename this to ownsCurrentLease or something along those lines to reflect the salient point here.
pkg/kv/kvserver/replica_proposal_buf.go, line 1029 at r2 (raw file):
} func (rp *replicaProposer) ownsValidLeaseRLocked(ctx context.Context, now hlc.ClockTimestamp) bool {
nit: let's keep this below leaderStatusRLocked so that these method impls remain ordered.
pkg/kv/kvserver/replica_proposal_buf_test.go, line 269 at r1 (raw file):
Previously, erikgrinaker (Erik Grinaker) wrote…
Will do.
Should we remove this now?
pkg/roachpb/api.go, line 281 at r1 (raw file):
Previously, erikgrinaker (Erik Grinaker) wrote…
Updated the PR, let me know what you think.
Looks great.
1beb8c6 to
1311f0c
Compare
|
TFTRs, I've addressed the comments. I tried removing the I find this pretty odd, because if I call |
nvb
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r3.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @erikgrinaker)
andreimatei
left a comment
There was a problem hiding this comment.
but consider clarifying the commit message to say that this patch is only about expiration-based leases (if you buy what I said above).
I find this pretty odd, because if I call replica1.GetLease() it does indeed say that n2 is the leaseholder, and yet the request fails saying we're not. However, the current lease in the request above has seq=0 start=0,0 exp= -- could this be the initial lease, such that n2 somehow hasn't fully picked up the lease transfer yet?
The empty lease in the error makes me think that the replica that generated the error did not actually know the lease. Sometimes we generate NotLeaseholderErrors without having a lease on hand; I don't remember the exact cases.
Nit: it's convention to always respond with a "done" on the review comments after you've addressed them. Otherwise I don't know if they're still pending or not.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @andreimatei and @erikgrinaker)
The Raft proposal buffer currently rejects lease acquisition proposals from followers if there is an eligible leader, to avoid interfering with it trying to claim the lease. However, this logic unintentionally rejected lease extension proposals from followers as well, which could cause a leaseholder to lose its lease. This patch changes the proposal buffer to allow lease extension proposals from followers for extension-based leases as long as they still hold a valid lease. Release note: None
1311f0c to
a59fd79
Compare
erikgrinaker
left a comment
There was a problem hiding this comment.
The empty lease in the error makes me think that the replica that generated the error did not actually know the lease.
Ah, of course -- the error was from the other replica. Removed the test knob and fixed TestLeaseExpirationBelowFutureTimeRequest by waiting for both replicas to pick up on the lease transfer.
it's convention to always respond with a "done" on the review comments after you've addressed them.
Got it, will do! Still getting used to Reviewable.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @andreimatei and @erikgrinaker)
pkg/kv/kvserver/replica_proposal_buf.go, line 571 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I was thinking the same thing. I don't see why we still need to look at the request at all.
Done.
pkg/kv/kvserver/replica_proposal_buf_test.go, line 269 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we remove this now?
Done.
|
bors r=nvanbenschoten,andreimatei,tbg |
|
Build failed (retrying...): |
|
Build failed (retrying...): |
|
Build succeeded: |
The Raft proposal buffer currently rejects lease acquisition proposals
from followers if there is an eligible leader, to avoid interfering with
it trying to claim the lease. However, this logic unintentionally
rejected lease extension proposals from followers as well, which could
cause a leaseholder to lose its lease.
This patch changes the proposal buffer to allow lease extension
proposals from followers for extension-based leases as long as they
still hold a valid lease.
Resolves #59179. Follow-up to #55148.
Release note: None