
storage: backpressure writes when ranges get too large #21357

@nvb

We've seen in privately tracked issues that under certain workloads the splitQueue cannot keep up with the stream of incoming writes, so ranges grow without bound. In the referenced issue, 400 concurrent writers performing 80KB writes on a single range were enough to overwhelm the system to the point where ranges grew far past the 64MB split size, with some exceeding 3GB. We have issues open to better handle ranges of this size so that we don't lose availability in such cases (see #16954), but we should be more proactive about preventing these range sizes in the first place.

We should introduce a form of backpressure into the replica write path such that writes will synchronously block on pending splits once a range gets too large. This backpressure needs to come before the writes reach the CommandQueue to avoid creating contention or possible deadlocks, and we'll need to be careful to avoid throttling the split itself or any other requests that are needed to maintain replica health.
