-
Notifications
You must be signed in to change notification settings - Fork 4.1k
kvserver: incorrect DistSender leaseholder processing for non-VOTERs #85060
Description
ReplicaDescriptor.Type is a pointer rather than a scalar value:
cockroach/pkg/roachpb/metadata.proto
Line 151 in aee773a
| optional ReplicaType type = 4 [(gogoproto.moretags) = "yaml:\"ReplicaType,omitempty\""]; |
We often use regular equality comparisons on replica descriptors, notably in the DistSender and gRPC transport:
cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go
Line 2209 in f77e393
| if *lh != curReplica || sameReplicaRetries < sameReplicaRetryLimit { |
| if gt.replicas[i] == replica { |
But recall that equality for pointers will check that the memory address is the same, rather than that the value is the same. Thus two identical replica descriptors with a type different from VOTER (the zero value) will be considered unequal if they are stored in different memory locations.
One immediate effect of this is that the DistSender will not prioritize new leaseholders with a VOTER_INCOMING type. In #74546, VOTER_INCOMING was allowed to receive the lease and thus included in the DistSender's replica list with the RoutingPolicy_LEASEHOLDER policy:
cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go
Lines 1932 to 1943 in f77e393
| // Filter the replicas to only those that are relevant to the routing policy. | |
| var replicaFilter ReplicaSliceFilter | |
| switch ba.RoutingPolicy { | |
| case roachpb.RoutingPolicy_LEASEHOLDER: | |
| replicaFilter = OnlyPotentialLeaseholders | |
| case roachpb.RoutingPolicy_NEAREST: | |
| replicaFilter = AllExtantReplicas | |
| default: | |
| log.Fatalf(ctx, "unknown routing policy: %s", ba.RoutingPolicy) | |
| } | |
| leaseholder := routing.Leaseholder() | |
| replicas, err := NewReplicaSlice(ctx, ds.nodeDescs, desc, leaseholder, replicaFilter) |
However, when it receives the NLHE it attempts to move the new leaseholder to the front of the transport list:
cockroach/pkg/kv/kvclient/kvcoord/dist_sender.go
Lines 2201 to 2212 in f77e393
| if lh := routing.Leaseholder(); lh != nil { | |
| // If the leaseholder is the replica that we've just tried, and | |
| // we've tried this replica a bunch of times already, let's move on | |
| // and not try it again. This prevents us getting stuck on a replica | |
| // that we think has the lease but keeps returning redirects to us | |
| // (possibly because it hasn't applied its lease yet). Perhaps that | |
| // lease expires and someone else gets a new one, so by moving on we | |
| // get out of possibly infinite loops. | |
| if *lh != curReplica || sameReplicaRetries < sameReplicaRetryLimit { | |
| transport.MoveToFront(*lh) | |
| } | |
| } |
But this won't happen because the non-nil Type doesn't satisfy the equality check in MoveToFront:
| if gt.replicas[i] == replica { |
Jira issue: CRDB-18022