kvserver: lease maintenance scheduler #98433
Description
We need a scheduler that eagerly maintains range leases. This should:
- Acquire leases for ranges that don't have one.
- Extend expiration-based leases before they expire.
- Switch the lease type when appropriate (see kvserver: explore setting to only use expiration-based leases #93903).
There are two motivations for this:
- We don't want lease acquisition to be lazily driven by client requests:
  - It can prevent ranges from ever acquiring a lease under certain failure modes, e.g. because the client request times out before the lease acquisition succeeds under disk stalls (see kvserver: disk stall prevents lease transfer #81100 and kvserver: persistent outage when liveness leaseholder deadlocks #80713).
  - It adds unnecessary latency, e.g. when a scan across many ranges has to sequentially acquire leases for each range, especially under failure modes such as network outages or disk stalls where the lease acquisition has to wait for network timeouts.
- When only using expiration leases, we want to ensure ranges have a lease even when there is no traffic on them, to avoid lease acquisition latencies on the next request to the range.
A few important aspects to consider:
- Some ranges are more important than others. In particular, the meta and liveness range leases must get priority.
- Expiration lease extensions are currently expensive (one Raft write per extension per range). We may want to allow very cold ranges to let their leases expire after some period of inactivity (e.g. minutes).
- There may not be any point in quiescing ranges where we eagerly extend expiration leases (see kvserver: explore quiescence removal #94592 and kvserver: don't quiesce ranges with expiration-based leases #94454).
- We already have other scheduler infrastructure that we could piggyback on: the Raft scheduler, and the queue infrastructure (e.g. a new lease queue).
- We should try to honour lease preferences.
We have a few similar mechanisms already that should mostly be replaced by this scheduler:
- The replicate queue acquires leases for ranges that don't have one.

  `cockroach/pkg/kv/kvserver/replicate_queue.go`, lines 989 to 996 in 736a67e:

  ```go
  // TODO(kvoli): This check should fail if not the leaseholder. In the case
  // where we want to use the replicate queue to acquire leases, this should
  // occur before planning or as a result. In order to return this in
  // planning, it is necessary to simulate the prior change having succeeded
  // to then plan this lease transfer.
  if _, pErr := repl.redirectOnOrAcquireLease(ctx); pErr != nil {
  	return change, pErr.GoError()
  }
  ```
- `Store.startLeaseRenewer()` eagerly renews the expiration leases on the meta and liveness ranges to avoid high tail latencies.

  `cockroach/pkg/kv/kvserver/store.go`, lines 2089 to 2098 in 8add7c5:

  ```go
  // startLeaseRenewer runs an infinite loop in a goroutine which regularly
  // checks whether the store has any expiration-based leases that should be
  // proactively renewed and attempts to continue renewing them.
  //
  // This reduces user-visible latency when range lookups are needed to serve a
  // request and reduces ping-ponging of r1's lease to different replicas as
  // maybeGossipFirstRange is called on each (e.g. #24753).
  func (s *Store) startLeaseRenewer(ctx context.Context) {
  	// Start a goroutine that watches and proactively renews certain
  	// expiration-based leases.
  ```
- `Replica.maybeExtendLeaseAsyncLocked()` will extend an expiration lease when processing a request in the last half of the lease interval.

  `cockroach/pkg/kv/kvserver/replica_range_lease.go`, lines 1437 to 1454 in 736a67e:

  ```go
  func (r *Replica) maybeExtendLeaseAsyncLocked(ctx context.Context, st kvserverpb.LeaseStatus) {
  	// Check shouldExtendLeaseRLocked again, because others may have raced to
  	// extend the lease and beaten us here after we made the determination
  	// (under a shared lock) that the extension was needed.
  	if !r.shouldExtendLeaseRLocked(st) {
  		return
  	}
  	if log.ExpensiveLogEnabled(ctx, 2) {
  		log.Infof(ctx, "extending lease %s at %s", st.Lease, st.Now)
  	}
  	// We explicitly ignore the returned handle as we won't block on it.
  	//
  	// TODO(tbg): this ctx is likely cancelled very soon, which will in turn
  	// cancel the lease acquisition (unless joined by another more long-lived
  	// ctx). So this possibly isn't working as advertised (which only plays a role
  	// for expiration-based leases, at least).
  	_ = r.requestLeaseLocked(ctx, st)
  }
  ```
Jira issue: CRDB-25265
Epic: CRDB-25207