kvserver: log warning when lease transfer gets visible with a delay #95991
Description
Describe the problem
Transferring leases to unavailable or lagging replicas causes outages. In many of the cases we investigate, the new leaseholder is waiting for a raft snapshot and will not apply the log up to its lease until it receives that snapshot. Simply being behind on a long stretch of raft log could have essentially the same effect (the quota pool should prevent this, but probably doesn't always).
When the recipient of a lease applies the lease transfer and the lease's proposed timestamp is (say) more than 500ms behind "now", the recipient should log a warning. This would make it much easier to determine when a lease transfer might have caused a latency spike or a period of unavailability.
I believe we could do all of this in this code:
cockroach/pkg/kv/kvserver/replica_proposal.go
Lines 223 to 228 in 7ece76a
    func (r *Replica) leasePostApplyLocked(
        ctx context.Context,
        prevLease, newLease *roachpb.Lease,
        priorReadSum *rspb.ReadSummary,
        jumpOpt leaseJumpOption,
    ) {
Jira issue: CRDB-23884