kvserver: don't send snapshots to overloaded learners #85479
Closed
Labels
A-kv-replication: Relating to Raft, consensus, and coordination.
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
Description
In #83851, we prevent sending Raft snapshots to followers that have an overloaded engine:
cockroach/pkg/kv/kvserver/raft_snapshot_queue.go, lines 142 to 153 in acdf42a:

    repl.mu.RLock()
    _, destPaused := repl.mu.pausedFollowers[id]
    repl.mu.RUnlock()
    if ioThresh := repl.store.ioOverloadedStores.Load()[repDesc.StoreID]; ioThresh != nil && destPaused {
        // If the destination is paused, be more hesitant to send snapshots. The destination being
        // paused implies that we have recently checked that it's not required for quorum, and that
        // we wish to conserve I/O on that store, which sending a snapshot counteracts. So hold back on
        // the snapshot as well.
        err := errors.Errorf("skipping snapshot; %s is overloaded: %s", repDesc, ioThresh)
        repl.reportSnapshotStatus(ctx, repDesc.ReplicaID, err)
        return false, err
    }

However, we don't do this when sending snapshots to new learners:
cockroach/pkg/kv/kvserver/replica_command.go, lines 1801 to 1803 in acdf42a:

    if err := r.sendSnapshot(ctx, rDesc, kvserverpb.SnapshotRequest_INITIAL, priority); err != nil {
        return nil, err
    }

We probably should apply the same check on the learner path. Returning an error at this call site may be sufficient.
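A minimal, standalone sketch of what "erroring out" before the `INITIAL` snapshot could look like. This is not the actual kvserver code: `StoreID`, `overloadInfo`, `checkLearnerSnapshotAllowed`, and `sendInitialSnapshot` are hypothetical stand-ins, and the overload map only approximates what `ioOverloadedStores.Load()` returns in the snippet above. It also assumes that a new learner has no entry in `pausedFollowers`, so the overload signal alone decides.

```go
package main

import (
	"errors"
	"fmt"
)

// StoreID is a hypothetical stand-in for the real store identifier type.
type StoreID int32

// overloadInfo is a hypothetical stand-in for the per-store IO threshold
// that the raft snapshot queue consults; its fields are assumptions.
type overloadInfo struct {
	Score float64
}

// errStoreOverloaded lets callers distinguish "skipped due to overload"
// from other snapshot failures.
var errStoreOverloaded = errors.New("store is overloaded")

// checkLearnerSnapshotAllowed returns an error if the destination store
// appears in the map of overloaded stores, and nil otherwise.
func checkLearnerSnapshotAllowed(overloaded map[StoreID]overloadInfo, dest StoreID) error {
	if info, ok := overloaded[dest]; ok {
		return fmt.Errorf("skipping INITIAL snapshot; s%d is overloaded (score %.2f): %w",
			dest, info.Score, errStoreOverloaded)
	}
	return nil
}

// sendInitialSnapshot gates the (hypothetical) send callback on the
// overload check, so no snapshot work starts for an overloaded store.
func sendInitialSnapshot(overloaded map[StoreID]overloadInfo, dest StoreID, send func() error) error {
	if err := checkLearnerSnapshotAllowed(overloaded, dest); err != nil {
		return err
	}
	return send()
}

func main() {
	overloaded := map[StoreID]overloadInfo{2: {Score: 1.4}}
	send := func() error { return nil }

	fmt.Println("s1:", sendInitialSnapshot(overloaded, 1, send))
	fmt.Println("s2:", sendInitialSnapshot(overloaded, 2, send))
}
```

Wrapping a sentinel error with `%w` is one possible design choice here: it would let the caller adding the learner tell an overload-related skip apart from a genuine snapshot failure and decide whether to retry later.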
Jira issue: CRDB-18266
Epic CRDB-15069