
kv: paused followers interacts poorly with leader-not-leaseholder state #84884

@nvb


In a cluster that was overloaded due to an index backfill, I saw a large number of ranges serving foreground traffic begin hitting circuit breaker errors. Digging deeper, it became clear that every range in this state had its Raft leader and leaseholder on different nodes, with the leaseholder on an overloaded node. Earlier in the test, I had set admission.kv.pause_replication_io_threshold to 0.8 to avoid replicating to overloaded followers.

The combination of paused replicas and the leader-not-leaseholder split effectively caused unavailability. The leaseholder could still propose writes, but because it was itself a paused follower, it would never hear about their result, so it would never acknowledge those writes to clients.

Should we allow the leaseholder to be paused?
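
To make the question concrete, here is a toy Go model of the interaction described above. It is a sketch only, not CockroachDB's implementation: the `Replica` type, the `IOOverload` score, and both functions are hypothetical names, and the model ignores quorum and every other safeguard the real pausing logic may apply. The `threshold` parameter stands in for admission.kv.pause_replication_io_threshold.

```go
package main

// Replica is a hypothetical, simplified view of one replica of a range
// as seen from the Raft leader. It is not a CockroachDB type.
type Replica struct {
	StoreID       int
	IOOverload    float64 // assumed overload score of the replica's store
	IsLeader      bool
	IsLeaseholder bool
}

// pausedFollowers returns the store IDs the leader would stop replicating to
// in this toy model: any non-leader replica whose overload score exceeds
// threshold. If exemptLeaseholder is true, the leaseholder is never paused.
func pausedFollowers(replicas []Replica, threshold float64, exemptLeaseholder bool) []int {
	var paused []int
	for _, r := range replicas {
		if r.IsLeader {
			continue // the leader does not pause itself
		}
		if exemptLeaseholder && r.IsLeaseholder {
			continue // possible mitigation: always keep replicating to the leaseholder
		}
		if r.IOOverload > threshold {
			paused = append(paused, r.StoreID)
		}
	}
	return paused
}

// leaseholderLearnsResults reports whether the leaseholder still receives
// replication traffic, i.e. whether it is the leader or an unpaused follower.
// If it returns false, the leaseholder can propose writes but never sees them apply.
func leaseholderLearnsResults(replicas []Replica, paused []int) bool {
	for _, r := range replicas {
		if !r.IsLeaseholder {
			continue
		}
		if r.IsLeader {
			return true
		}
		for _, id := range paused {
			if id == r.StoreID {
				return false
			}
		}
		return true
	}
	return false // no leaseholder among the replicas
}
```

With a leader on store 1, an overloaded leaseholder on store 2, and a threshold of 0.8, `pausedFollowers(replicas, 0.8, false)` pauses store 2 and `leaseholderLearnsResults` returns false, which matches the stall above. Passing `exemptLeaseholder = true` leaves store 2 unpaused in this model.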


On a related note, I noticed that while all nodes had a non-zero value for the admission.raft.paused_replicas metric, the overloaded node itself (purple) had a few spikes where it reported thousands of paused replicas. My understanding is that this metric is reported from the leader side, not the follower side, so I don't understand this. Is there any reason why an overloaded node would start pausing other replicas?

[Image: Screen Shot 2022-07-21 at 5 51 46 PM]

Also, the description of this metric says: "The count is emitted by the leaseholder of each range."

Should this instead say: "The count is emitted by the leader of each range."?

Jira issue: CRDB-17924

Epic CRDB-15069


Labels

A-kv-replication: Relating to Raft, consensus, and coordination.
C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
