Skip to content

kvserver: incorrect DistSender leaseholder processing for non-VOTERs #85060

@erikgrinaker

Description

@erikgrinaker

ReplicaDescriptor.Type is a pointer rather than a scalar value:

optional ReplicaType type = 4 [(gogoproto.moretags) = "yaml:\"ReplicaType,omitempty\""];

We often use regular equality comparisons on replica descriptors, notably in the DistSender and gRPC transport:

if *lh != curReplica || sameReplicaRetries < sameReplicaRetryLimit {

if gt.replicas[i] == replica {

But recall that equality for pointers will check that the memory address is the same, rather than that the value is the same. Thus two identical replica descriptors with a type different from VOTER (the zero value) will be considered unequal if they are stored in different memory locations.

One immediate effect of this is that the DistSender will not prioritize new leaseholders with a VOTER_INCOMING type. In #74546, VOTER_INCOMING was allowed to receive the lease and thus included in the DistSender's replica list with the RoutingPolicy_LEASEHOLDER policy:

// Filter the replicas to only those that are relevant to the routing policy.
var replicaFilter ReplicaSliceFilter
switch ba.RoutingPolicy {
case roachpb.RoutingPolicy_LEASEHOLDER:
replicaFilter = OnlyPotentialLeaseholders
case roachpb.RoutingPolicy_NEAREST:
replicaFilter = AllExtantReplicas
default:
log.Fatalf(ctx, "unknown routing policy: %s", ba.RoutingPolicy)
}
leaseholder := routing.Leaseholder()
replicas, err := NewReplicaSlice(ctx, ds.nodeDescs, desc, leaseholder, replicaFilter)

However, when it receives the NLHE it attempts to move the new leaseholder to the front of the transport list:

if lh := routing.Leaseholder(); lh != nil {
// If the leaseholder is the replica that we've just tried, and
// we've tried this replica a bunch of times already, let's move on
// and not try it again. This prevents us getting stuck on a replica
// that we think has the lease but keeps returning redirects to us
// (possibly because it hasn't applied its lease yet). Perhaps that
// lease expires and someone else gets a new one, so by moving on we
// get out of possibly infinite loops.
if *lh != curReplica || sameReplicaRetries < sameReplicaRetryLimit {
transport.MoveToFront(*lh)
}
}

But this won't happen because the non-nil Type doesn't satisfy the equality check in MoveToFront:

if gt.replicas[i] == replica {

Jira issue: CRDB-18022

Metadata

Metadata

Assignees

Labels

A-kv-replicationRelating to Raft, consensus, and coordination.C-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions