*: implement replication admission control #95563
Closed
Labels
A-admission-control, C-enhancement (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception), T-kv (KV Team)
Description
This is the tracking issue to merge the prototype for replication admission control: #93102. Internal experiments (see #admission-control) demonstrate its ability to provide throughput isolation in the face of large index backfills, where none exists today. The motivating issues are #82556 and #85641. The design doc for this work can be found internally. We expect the work here to break down into the following PRs (-ish, and in no particular order):
- raftlog: introduce EntryEncoding{Standard,Sideloaded}WithAC #95748. This includes changes to raft encodings, needed for the protocol changes described next.
- kvflowcontrol,raftlog: interfaces for replication control #95637. Protocol changes for data sent back and forth over the raft transport, tying flow token deductions to specific raft log positions (term+index). Include various interfaces.
  - See kvflowcontrol/doc.go in PR above. Tech-note/large overview comment for replication admission control.
- kvflowcontroller: implement kvflowcontrol.Controller #95905. Implement kvflowcontrol.Controller: Per-node flow token bucket, internally segmented by work class, tenants and stores it's controlling write traffic to.
  - See pkg/util/asciitsdb and pkg/kv/kvserver/kvflowcontrol/kvflowcontroller:simulator in PR above. Write a simulator for the flow tokens component, showing how throughput varies across each unit as tokens flow back, etc. Do it for multiple tenants.
- kvflowhandle: implement kvflowcontrol.Handle #96642. Implement kvflowcontrol.Handle: Per-replica tracker for flow token deductions. The lifecycle of this handle is tied to a leaseholder replica also being the raft leader.
- kvflowcontrol: implement kvflowcontrol.Dispatch #97766. Implement kvflowcontrol.Dispatch: Message box used to dispatch information about admitted raft log entries to specific nodes.
- admission: support non-blocking {Store,}WorkQueue.Admit() #97599. Implement kvadmission.AdmitRaftEntry: Change admission.{Store,}WorkQueue to support logical admission/enqueuing of virtual work items, an "async admit" interface.
- kvserver,kvflowcontrol: integrate flow control #98308. Integrate various components end-to-end, and add cluster settings to disable replication admission control entirely, or just for regular requests.
  - [ ] Also support a mode where we do end-to-end flow control token tracking but don't actually block at admit time due to lack of requisite flow tokens. It'll let us look at production systems and understand that we are losing performance isolation due to a lack of flow control.
- Support and add tests to make sure we don't leak flow tokens, or return them repeatedly, in the face of node failures, gRPC streams breaking (including intermittently), reproposals, snapshots, log truncations, splits, merges, lease transfers, leadership transfers, raft membership changing, follower pausing, prolonged leaseholder != leader, etc.
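
To make the token accounting in the list above concrete, here is a minimal Go sketch of flow token buckets segmented by work class, tenant, and store, plus a per-replica tracker that ties each deduction to a raft log position (term+index) and returns tokens at most once. All names (Controller, Tracker, Stream, LogPosition, ReturnUpTo) and the 16 MiB limit are assumptions made for illustration; they are not the kvflowcontrol interfaces from the PRs above, and the sketch leaves out blocking, concurrency, and per-work-class admission.

```go
package main

import "fmt"

// All types below are hypothetical names for this sketch, not the real
// kvflowcontrol API.
type WorkClass int

const (
	Regular WorkClass = iota
	Elastic
)

// Stream identifies where write traffic is headed: one tenant on one store.
type Stream struct {
	TenantID uint64
	StoreID  uint64
}

// LogPosition is a raft log position (term+index).
type LogPosition struct {
	Term, Index uint64
}

// Less orders positions by term first, then by index.
func (p LogPosition) Less(o LogPosition) bool {
	return p.Term < o.Term || (p.Term == o.Term && p.Index < o.Index)
}

type bucketKey struct {
	stream Stream
	class  WorkClass
}

// Controller holds one token bucket per (stream, work class) pair.
type Controller struct {
	limit  int64
	tokens map[bucketKey]int64
}

func NewController(limit int64) *Controller {
	return &Controller{limit: limit, tokens: map[bucketKey]int64{}}
}

// Available reports the tokens left in a bucket; buckets start out full.
func (c *Controller) Available(s Stream, wc WorkClass) int64 {
	k := bucketKey{s, wc}
	if _, ok := c.tokens[k]; !ok {
		c.tokens[k] = c.limit
	}
	return c.tokens[k]
}

func (c *Controller) adjust(s Stream, wc WorkClass, delta int64) {
	c.tokens[bucketKey{s, wc}] = c.Available(s, wc) + delta
}

type deduction struct {
	pos    LogPosition
	class  WorkClass
	tokens int64
}

// Tracker remembers deductions made for one stream on behalf of one replica.
type Tracker struct {
	ctrl    *Controller
	stream  Stream
	pending []deduction // assumed to be appended in increasing log position order
}

// Deduct takes tokens out of the bucket and records the log position they
// were deducted for. A real admit path would wait while Available <= 0.
func (t *Tracker) Deduct(pos LogPosition, wc WorkClass, tokens int64) {
	t.ctrl.adjust(t.stream, wc, -tokens)
	t.pending = append(t.pending, deduction{pos, wc, tokens})
}

// ReturnUpTo returns tokens for every tracked entry at or below pos and
// forgets those entries, so a repeated call cannot return them twice.
// It reports how many tokens were returned.
func (t *Tracker) ReturnUpTo(pos LogPosition) int64 {
	var returned int64
	i := 0
	for ; i < len(t.pending) && !pos.Less(t.pending[i].pos); i++ {
		d := t.pending[i]
		t.ctrl.adjust(t.stream, d.class, d.tokens)
		returned += d.tokens
	}
	t.pending = t.pending[i:]
	return returned
}

func main() {
	ctrl := NewController(16 << 20) // arbitrary per-bucket limit for the demo
	s := Stream{TenantID: 1, StoreID: 3}
	tr := &Tracker{ctrl: ctrl, stream: s}

	tr.Deduct(LogPosition{Term: 5, Index: 10}, Elastic, 1<<20)
	tr.Deduct(LogPosition{Term: 5, Index: 11}, Elastic, 2<<20)
	fmt.Println("elastic tokens after deductions:", ctrl.Available(s, Elastic))

	fmt.Println("returned:", tr.ReturnUpTo(LogPosition{Term: 5, Index: 10}))
	fmt.Println("returned on repeat:", tr.ReturnUpTo(LogPosition{Term: 5, Index: 10}))
}
```

Dropping returned entries from the tracker is what guards against double returns in this sketch; guarding against leaks (for example, when a stream breaks before admission is reported) would need an explicit "return everything still pending" path, which is left out here.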
The remaining "rollout" steps (and the validation needed) are being tracked in #98703.
Epic: https://cockroachlabs.atlassian.net/browse/CRDB-16542
Jira issue: CRDB-23589
Epic CRDB-25348