admission: graceful degradation #82114

@tbg

Is your feature request related to a problem? Please describe.

Today, the admission control subsystem protects a node "abruptly and severely". For example, for IO overload, no throttling occurs while the L0 file count or sublevel count slowly creeps towards its threshold (1000 and 20, respectively). Once the threshold is crossed, admission control becomes active and throttles, and it does so in a way that doesn't necessarily minimize variance across requests. For example, IO tokens are handed out at 1s intervals, so some requests may not be throttled at all while others are delayed for seconds at a time.
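To make the 1s granularity concrete, here is a minimal Go sketch of one way the same per-second token budget could be spread over finer ticks. It is an illustration only, not CockroachDB's implementation: `splitTokens` and the tick counts are hypothetical names and values chosen for this example.

```go
package main

import "fmt"

// splitTokens divides a per-second token budget into ticksPerSecond grants.
// Hypothetical helper for illustration; it assumes perSecond >= 0.
// The grants sum to perSecond, and the remainder is spread over the
// earliest ticks so that no two grants differ by more than one token.
func splitTokens(perSecond int64, ticksPerSecond int) []int64 {
	if ticksPerSecond <= 0 {
		return nil
	}
	grants := make([]int64, ticksPerSecond)
	base := perSecond / int64(ticksPerSecond)
	rem := perSecond % int64(ticksPerSecond)
	for i := range grants {
		grants[i] = base
		if int64(i) < rem {
			grants[i]++
		}
	}
	return grants
}

func main() {
	// One grant per second: the whole budget arrives at once, so a request
	// that finds the bucket empty may wait up to a full second.
	fmt.Println(splitTokens(10, 1)) // prints [10]

	// Four grants per second: the same budget, but a request that finds
	// the bucket empty waits at most one 250ms tick for the next grant.
	fmt.Println(splitTokens(10, 4)) // prints [3 3 2 2]
}
```

With smaller ticks the total budget is unchanged, but the worst-case wait for the next grant shrinks with the tick length, which is one way to reduce the variance described above.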

Describe the solution you'd like

CockroachDB should degrade gracefully: throttling should be introduced gently, as the thresholds are reached, and should throttle requests uniformly.
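One possible reading of "introduced gently" is a throttle that ramps linearly as the L0 sublevel count approaches the threshold, rather than switching on all at once. The Go sketch below illustrates that idea under stated assumptions: `admissionFraction`, the soft threshold of 10, and the floor `minFraction` are hypothetical and not taken from CockroachDB; only the sublevel threshold of 20 comes from the description above.

```go
package main

import "fmt"

// minFraction is a hypothetical floor: the smallest fraction of work that
// is still admitted once the hard threshold is reached or exceeded.
const minFraction = 0.25

// admissionFraction returns the fraction of requested work to admit for a
// given L0 sublevel count. Below softThreshold nothing is throttled; between
// softThreshold and hardThreshold the fraction falls linearly; at or above
// hardThreshold it stays at minFraction.
func admissionFraction(subLevels, softThreshold, hardThreshold int) float64 {
	switch {
	case subLevels <= softThreshold:
		return 1.0
	case subLevels >= hardThreshold:
		return minFraction
	default:
		// Reached only when softThreshold < subLevels < hardThreshold,
		// so the denominator is always positive.
		progress := float64(subLevels-softThreshold) /
			float64(hardThreshold-softThreshold)
		return 1.0 - progress*(1.0-minFraction)
	}
}

func main() {
	const soft, hard = 10, 20 // soft is an assumption; hard is from the issue text
	for _, n := range []int{0, 10, 12, 15, 18, 20, 25} {
		fmt.Printf("sublevels=%2d admit=%.3f\n", n, admissionFraction(n, soft, hard))
	}
}
```

A linear ramp is only the simplest choice; any monotone curve between the two thresholds would avoid the step at the threshold.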

For example, for a kv0 workload in which the concurrency is ramped up over time, we would hope that the p99s (of admission throttling over an interval), plotted as a function of the concurrency (which is in 1:1 correspondence with the targeted throughput), form a smooth upward curve as opposed to a choppy step function.
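As a rough way to quantify "smooth" versus "choppy", the Go sketch below computes a nearest-rank percentile over throttling delays and then reports the largest increase in p99 between adjacent concurrency levels. The helper names (`percentile`, `largestJump`) and the sample numbers are made up for illustration; they are not measurements and not part of any existing tooling.

```go
package main

import (
	"fmt"
	"sort"
)

// percentile returns the p-th percentile (1..100) of samples using the
// nearest-rank method. Samples are throttling delays in milliseconds.
// It returns 0 for empty input or an out-of-range p.
func percentile(samples []float64, p int) float64 {
	if len(samples) == 0 || p < 1 || p > 100 {
		return 0
	}
	sorted := append([]float64(nil), samples...) // copy; leave input untouched
	sort.Float64s(sorted)
	rank := (p*len(sorted) + 99) / 100 // ceil(p*n/100) in integer arithmetic
	return sorted[rank-1]
}

// largestJump returns the largest increase between consecutive p99 values,
// where p99s[i] is the p99 observed at the i-th concurrency level.
func largestJump(p99s []float64) float64 {
	maxJump := 0.0
	for i := 1; i < len(p99s); i++ {
		if jump := p99s[i] - p99s[i-1]; jump > maxJump {
			maxJump = jump
		}
	}
	return maxJump
}

func main() {
	// Illustrative per-concurrency p99s in milliseconds (invented values).
	smooth := []float64{1, 2, 4, 7}
	choppy := []float64{1, 1, 1, 50}
	fmt.Println(largestJump(smooth), largestJump(choppy))
}
```

A small largest jump relative to the overall range suggests a smooth curve, while a single jump that accounts for most of the range indicates the step-function behavior this issue wants to avoid.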

Describe alternatives you've considered

Additional context

Related to #79215
Related to #81834

Jira issue: CRDB-16217

Metadata

    Labels

    A-admission-control
    C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
    O-postmortem: Originated from a Postmortem action item.
    T-kv: KV Team
