admission control: consider write stalls due to memtable count #77357

@sumeerbhola

This came up in the investigation of #72375 (comment)

The write overload protection for a store looks at the number of files and sub-levels in L0. In the aforementioned roachtest we have seen write overload resulting in write stalls due to high memtable counts, which L0-based protection does not catch. This results in high latency after a request has already been admitted.

Addressing this is tricky: the dynamic token computation would need to know how fast the system can flush. Using tokens that are based on bytes, rather than 1 token per request, would also help here. Our current compaction-based token calculation is easier because it allows more margin for error: there is no sudden hiccup in writing if the calculation admits too many requests, and we can correct it in the next interval.
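
To make the idea concrete, here is a minimal Go sketch of byte-based tokens derived from observed flush throughput. It is an illustration only, not the existing admission control code: `flushTokenEstimator`, `byteTokenBucket`, the smoothing weight, the safety factor, and the 15-second interval in `main` are all hypothetical names and values chosen for the example.

```go
package main

import "fmt"

// flushTokenEstimator is a hypothetical helper (not a CockroachDB or Pebble
// type) that tracks a smoothed estimate of memtable flush throughput.
type flushTokenEstimator struct {
	smoothedFlushBytesPerSec float64 // exponentially smoothed flush rate
	alpha                    float64 // weight of the newest sample, in (0, 1]
	safetyFactor             float64 // fraction of estimated capacity to hand out
}

// observe folds in one interval's measurements: the bytes flushed and the
// time the flush path was actually busy. Dividing by busy time rather than
// wall time is an assumption meant to estimate capacity, not recent load.
func (e *flushTokenEstimator) observe(flushedBytes int64, flushBusySeconds float64) {
	if flushBusySeconds <= 0 {
		return // nothing flushed; keep the previous estimate
	}
	sample := float64(flushedBytes) / flushBusySeconds
	if e.smoothedFlushBytesPerSec == 0 {
		e.smoothedFlushBytesPerSec = sample
		return
	}
	e.smoothedFlushBytesPerSec = e.alpha*sample + (1-e.alpha)*e.smoothedFlushBytesPerSec
}

// byteTokens returns the number of byte tokens to grant for the next interval.
func (e *flushTokenEstimator) byteTokens(intervalSeconds float64) int64 {
	return int64(e.safetyFactor * e.smoothedFlushBytesPerSec * intervalSeconds)
}

// byteTokenBucket charges each request by its size instead of 1 token.
type byteTokenBucket struct {
	available int64
}

// tryAdmit admits a request while any tokens remain and deducts its size.
// The balance may go negative, which delays later requests until a refill.
func (b *byteTokenBucket) tryAdmit(requestBytes int64) bool {
	if b.available <= 0 {
		return false
	}
	b.available -= requestBytes
	return true
}

func main() {
	e := &flushTokenEstimator{alpha: 0.5, safetyFactor: 0.5}
	e.observe(100<<20, 1.0) // 100 MiB flushed in 1s of busy time
	e.observe(50<<20, 1.0)  // 50 MiB flushed in 1s of busy time

	bucket := &byteTokenBucket{available: e.byteTokens(15)}
	fmt.Println("tokens for next interval:", bucket.available)
	fmt.Println("admit 1 MiB write:", bucket.tryAdmit(1<<20))
}
```

The safety factor reflects the concern above: unlike the compaction-based calculation, over-admitting here leads directly to a write stall, so a sketch like this would want to hand out less than the estimated flush capacity.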

Jira issue: CRDB-13548

Epic CRDB-14607

Labels

A-admission-control
A-storage: Relating to our storage engine (Pebble) on-disk storage.
C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
O-sre: For issues SRE opened or otherwise cares about tracking.
T-storage: Storage Team
