kvserver: actually renew expiration-based leases#65653
kvserver: actually renew expiration-based leases#65653craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
nvb
left a comment
There was a problem hiding this comment.
I wonder why no existing tests caught this. We surely have something the exercises this logic, right? If we disable this entirely, does any test fail? Were the tests passing because we did renew the expiration-based lease, but only once?
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/kv/kvserver/store.go, line 1730 at r1 (raw file):
renewalDuration := s.cfg.RangeLeaseActiveDuration() / 5 for { var n int
nit: the purpose of n wasn't immediately clear to me. Consider numRenewableLeases. Then it's easier to understand that we only reset the timer if we have at least one renewable lease, without having to read the logic in the loop.
pkg/kv/kvserver/store.go, line 1732 at r1 (raw file):
var n int s.renewableLeases.Range(func(k int64, v unsafe.Pointer) bool { n++
Do we want to want to add to the counter if we hit the s.renewableLeases.Delete(k) path?
tbg
left a comment
There was a problem hiding this comment.
I looked into this and the kvserver... tests do pass with the loop deactivated. This makes sense when I look at the original PR, which did not add a test: https://github.com/cockroachdb/cockroach/pull/25322/files
I added a test, PTAL.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/kv/kvserver/store.go, line 1732 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do we want to want to add to the counter if we hit the
s.renewableLeases.Delete(k)path?
I think that's fine - it'll give us an extra loop but that's basically free and keeps the code simpler.
nvb
left a comment
There was a problem hiding this comment.
Reviewed 4 of 4 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/kv/kvserver/testing_knobs.go, line 340 at r2 (raw file):
// LeaseRenewalSignalChan populates `Store.renewableLeasesSignal`. LeaseRenewalSignalChan chan struct{} // LeaseRenewalOnPostCycle is invoked after each lease renewal cycle.
We should mention what the meaning is of the provided map.
We were never looping around more than once as a result of never resetting the timer. This was noticed by contributor @llllash in cockroachdb#65393. Thank you! Closes cockroachdb#65393. Co-authored-by: linyimin <linyimin@baidu.com> Release note: None
tbg
left a comment
There was a problem hiding this comment.
bors r=nvanbenschoten
TFTR!
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
|
Build succeeded: |
We were never looping around more than once as a result of
never resetting the timer. This was noticed by contributor
@llllash in #65393. Thank you!
Closes #65393.
It seems as though the mechanism to prevent ping-ponging of the r1 lease never quite worked, i.e. this issue wasn't actually closed: #24753
It's unclear why this hasn't caused more trouble over the last three years. However, looking at the original issue, it seems like you need very precise timing to get the problem to appear. Maybe that is a good enough explanation.
Release note: None