*: implement replication admission control #95563

@irfansharif

This is the tracking issue to merge the prototype for replication admission control: #93102. Internal experiments (see #admission-control) demonstrate its ability to provide throughput isolation in the face of large index backfills, where none exists today. The motivating issues are #82556 and #85641. The design doc for this work can be found internally. We expect the work here to break down into the following PRs (-ish, and in no particular order):

  • raftlog: introduce EntryEncoding{Standard,Sideloaded}WithAC #95748. This includes changes to raft encodings, needed for the protocol changes described next.
  • kvflowcontrol,raftlog: interfaces for replication control #95637. Protocol changes for data sent back and forth over the raft transport, tying flow token deductions to specific raft log positions (term+index). Includes various interfaces.
    • See kvflowcontrol/doc.go in PR above. Tech-note/large overview comment for replication admission control.
  • kvflowcontroller: implement kvflowcontrol.Controller #95905. Implement kvflowcontrol.Controller: Per-node flow token bucket, internally segmented by work class, tenants and stores it's controlling write traffic to.
    • See pkg/util/asciitsdb and pkg/kv/kvserver/kvflowcontrol/kvflowcontroller:simulator in PR above. Write a simulator for the flow tokens component, showing how throughput varies across each unit as tokens flow back, etc. Do it for multiple tenants.
  • kvflowhandle: implement kvflowcontrol.Handle #96642. Implement kvflowcontrol.Handle: Per-replica tracker for flow token deductions. The lifecycle of this handle is tied to a leaseholder replica also being the raft leader.
  • kvflowcontrol: implement kvflowcontrol.Dispatch #97766. Implement kvflowcontrol.Dispatch: Message box used to dispatch information about admitted raft log entries to specific nodes.
  • admission: support non-blocking {Store,}WorkQueue.Admit() #97599. Implement kvadmission.AdmitRaftEntry: Change admission.{Store,}WorkQueue to support logical admission/enqueuing of virtual work items, an "async admit" interface.
  • kvserver,kvflowcontrol: integrate flow control #98308. Integrate various components end-to-end, and add cluster settings to disable replication admission control either entirely or just for regular requests.
    • [ ] Also support a mode where we do end-to-end flow control token tracking but don't actually block at admit time due to lack of requisite flow tokens. It'll let us look at production systems and understand that we are losing performance isolation due to a lack of flow control.
    • Support and add tests to make sure we don't leak flow tokens, or return them repeatedly, in the face of node failures, gRPC streams breaking (including intermittently), reproposals, snapshots, log truncations, splits, merges, lease transfers, leadership transfers, raft membership changing, follower pausing, prolonged leaseholder != leader, etc.
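
To make the token accounting described in the list above concrete, here is a minimal sketch of a flow token bucket segmented by work class, with deductions tied to raft log positions (term+index) and returned once entries up to a given position are admitted. Every name in it (Bucket, Tracker, Deduct, ReturnUpTo, Outstanding) is hypothetical and chosen for illustration; it is not the kvflowcontrol API from the PRs above, and it ignores tenants, stores, and concurrency.

```go
package main

import "fmt"

// WorkClass is a hypothetical two-way split of write traffic.
type WorkClass int

const (
	Regular WorkClass = iota
	Elastic
)

// LogPosition identifies a raft log entry by term and index.
type LogPosition struct {
	Term, Index uint64
}

// Less orders positions by term first, then by index.
func (p LogPosition) Less(o LogPosition) bool {
	if p.Term != o.Term {
		return p.Term < o.Term
	}
	return p.Index < o.Index
}

// Bucket holds the available flow tokens (say, bytes) per work class.
type Bucket struct {
	tokens map[WorkClass]int64
}

// NewBucket gives every work class the same initial token count.
func NewBucket(perClass int64) *Bucket {
	return &Bucket{tokens: map[WorkClass]int64{Regular: perClass, Elastic: perClass}}
}

// CanAdmit reports whether the class still has a positive token balance.
func (b *Bucket) CanAdmit(wc WorkClass) bool { return b.tokens[wc] > 0 }

// deduction remembers how many tokens were taken for which log entry.
type deduction struct {
	pos    LogPosition
	wc     WorkClass
	tokens int64
}

// Tracker records deductions so each one is returned exactly once.
type Tracker struct {
	bucket  *Bucket
	tracked []deduction
}

// Deduct takes tokens from the bucket and ties them to a log position.
func (t *Tracker) Deduct(wc WorkClass, pos LogPosition, tokens int64) {
	t.bucket.tokens[wc] -= tokens
	t.tracked = append(t.tracked, deduction{pos: pos, wc: wc, tokens: tokens})
}

// ReturnUpTo returns the tokens of every tracked deduction of class wc
// at or below pos, and forgets those deductions so that a repeated
// call cannot return the same tokens twice.
func (t *Tracker) ReturnUpTo(wc WorkClass, pos LogPosition) int64 {
	var returned int64
	kept := t.tracked[:0]
	for _, d := range t.tracked {
		if d.wc == wc && !pos.Less(d.pos) {
			returned += d.tokens
			continue
		}
		kept = append(kept, d)
	}
	t.tracked = kept
	t.bucket.tokens[wc] += returned
	return returned
}

// Outstanding is the number of deductions not yet returned; a non-zero
// value after all entries are admitted would indicate leaked tokens.
func (t *Tracker) Outstanding() int { return len(t.tracked) }

func main() {
	b := NewBucket(16)
	t := &Tracker{bucket: b}
	t.Deduct(Elastic, LogPosition{Term: 1, Index: 10}, 10)
	t.Deduct(Elastic, LogPosition{Term: 1, Index: 11}, 10)
	fmt.Println(b.CanAdmit(Elastic), b.CanAdmit(Regular))
	fmt.Println(t.ReturnUpTo(Elastic, LogPosition{Term: 1, Index: 10}))
	fmt.Println(t.ReturnUpTo(Elastic, LogPosition{Term: 1, Index: 10}))
	fmt.Println(b.CanAdmit(Elastic), t.Outstanding())
}
```

In this sketch the elastic class is exhausted after two deductions while regular work remains admissible, which is the isolation property the issue is after. Dropping a deduction from the tracker as soon as it is returned is one simple way to guard against double returns; the failure scenarios listed above (broken streams, reproposals, leadership changes, and so on) would need considerably more care than this.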

The remaining "rollout" steps (and the validation needed) are being tracked in #98703.

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-16542

Jira issue: CRDB-23589

Epic CRDB-25348

Metadata

Labels

A-admission-control, C-enhancement (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception), T-kv (KV Team)