kvserver: don't send snapshots to overloaded learners #85479

@erikgrinaker

In #83851, we prevent sending Raft snapshots to followers that have an overloaded engine:

repl.mu.RLock()
_, destPaused := repl.mu.pausedFollowers[id]
repl.mu.RUnlock()
if ioThresh := repl.store.ioOverloadedStores.Load()[repDesc.StoreID]; ioThresh != nil && destPaused {
	// If the destination is paused, be more hesitant to send snapshots. The destination being
	// paused implies that we have recently checked that it's not required for quorum, and that
	// we wish to conserve I/O on that store, which sending a snapshot counteracts. So hold back on
	// the snapshot as well.
	err := errors.Errorf("skipping snapshot; %s is overloaded: %s", repDesc, ioThresh)
	repl.reportSnapshotStatus(ctx, repDesc.ReplicaID, err)
	return false, err
}

However, we don't do this when sending snapshots to new learners:

if err := r.sendSnapshot(ctx, rDesc, kvserverpb.SnapshotRequest_INITIAL, priority); err != nil {
	return nil, err
}

We probably should apply the same check there. Returning an error at this call site, before the snapshot is sent, may be sufficient.

Jira issue: CRDB-18266

Epic CRDB-15069

Metadata

Labels

A-kv-replication: Relating to Raft, consensus, and coordination.
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
