Filesystem buffering filter extension

The idea of this extension is to add more [flow control filters](https://github.com/envoyproxy/envoy/blob/main/source/docs/flow_control.md) to support buffering to disk.

The filesystem access will be asynchronous (using a small thread pool) so as not to block the Envoy worker threads.
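As a rough illustration of the threading model (not Envoy code), the Python sketch below hands blocking writes to a small pool and returns to the caller immediately. The names `io_pool`, `write_chunk` and `async_write` are hypothetical. A real implementation would presumably post the completion callback back to the worker's dispatcher, whereas this sketch simply runs it on the I/O thread.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the async file manager: a small pool of I/O threads.
io_pool = ThreadPoolExecutor(max_workers=2)

def write_chunk(path, data):
    """Blocking append; runs on an I/O thread, never on the caller's thread."""
    with open(path, "ab") as f:
        return f.write(data)

def async_write(path, data, on_done):
    """Queue a write and return immediately; on_done(bytes_written) fires later.

    Assumption: the callback runs on the I/O thread here. A proxy worker would
    instead want it posted back to its own event loop.
    """
    future = io_pool.submit(write_chunk, path, data)
    future.add_done_callback(lambda fut: on_done(fut.result()))
    return future

# Usage: the caller is free to do other work while the write is in flight.
fd, path = tempfile.mkstemp()
os.close(fd)
completed = []
async_write(path, b"abcd", completed.append)
io_pool.shutdown(wait=True)  # demo only: wait for outstanding writes to finish
with open(path, "rb") as f:
    stored = f.read()
os.remove(path)
```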

This filter could be placed before a [buffer_filter](https://github.com/envoyproxy/envoy/blob/9dd8ae3c69a9917d97a886ed03e1c010dcd9b098/source/extensions/filters/http/buffer/buffer_filter.cc), which waits for its predecessor to have buffered an entire request and then potentially injects `content-length` into the headers.

To avoid degrading performance, the proposed `storage_buffer_filter` should not place anything into storage until a configured memory cache limit is reached (i.e. it should act as a simple pass-through filter if it is not needed for a given stream).

An additional concern is that if disk writes are blocked (e.g. by other IO) for an extended period, the behavior could degrade into an unintentional all-memory cache. An additional configurable value might therefore be required to decide when the buffer itself needs to send a HighWatermark alert.
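One way to picture that extra limit is a counter of bytes that are queued for disk but not yet written. The sketch below is a toy model under that assumption; `PendingWriteQueue` and its callbacks are hypothetical names, and clearing the alert once the queue falls back to the limit is an illustrative choice rather than something specified here.

```python
class PendingWriteQueue:
    """Tracks bytes held in memory while waiting to be written to disk."""

    def __init__(self, high_watermark_bytes, on_high_watermark, on_drained):
        self.high_watermark_bytes = high_watermark_bytes
        self.on_high_watermark = on_high_watermark
        self.on_drained = on_drained
        self.pending_bytes = 0
        self.above_high_watermark = False

    def enqueue(self, nbytes):
        """Called when a chunk is queued for an asynchronous disk write."""
        self.pending_bytes += nbytes
        if (not self.above_high_watermark
                and self.pending_bytes > self.high_watermark_bytes):
            # Disk is not keeping up: ask upstream to stop sending, once.
            self.above_high_watermark = True
            self.on_high_watermark()

    def write_completed(self, nbytes):
        """Called when the disk write for a chunk has finished."""
        self.pending_bytes -= nbytes
        if (self.above_high_watermark
                and self.pending_bytes <= self.high_watermark_bytes):
            self.above_high_watermark = False
            self.on_drained()

# Usage: a stalled disk lets the queue grow past 4 bytes, which raises the alert.
events = []
queue = PendingWriteQueue(4, lambda: events.append("high"),
                          lambda: events.append("drained"))
queue.enqueue(3)
queue.enqueue(3)          # 6 > 4: alert fires
queue.enqueue(1)          # still above: no repeat alert
queue.write_completed(3)  # back to 4: alert clears
```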

Suggested configuration options for this filter:
* memory_buffer_bytes_limit
* storage_buffer_bytes_limit
* storage_buffer_queue_high_watermark_bytes (defaults to memory_buffer_bytes_limit; specifies how many bytes *over* memory_buffer_bytes_limit can be queued in memory for writing to disk)
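To make the defaulting rule concrete, here is a small Python model of these three options. It is only a sketch: the class name `StorageBufferConfig` and the derived `max_in_memory_bytes` property are hypothetical, and the real filter would define its options in Envoy's own configuration format.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class StorageBufferConfig:
    memory_buffer_bytes_limit: int
    storage_buffer_bytes_limit: int
    # None means "not set"; it is resolved to memory_buffer_bytes_limit below.
    storage_buffer_queue_high_watermark_bytes: Optional[int] = None

    def __post_init__(self):
        if self.storage_buffer_queue_high_watermark_bytes is None:
            self.storage_buffer_queue_high_watermark_bytes = (
                self.memory_buffer_bytes_limit)

    @property
    def max_in_memory_bytes(self):
        """Memory buffer plus the allowance for data still queued for disk."""
        return (self.memory_buffer_bytes_limit
                + self.storage_buffer_queue_high_watermark_bytes)

# Usage: with the default, up to twice the memory limit may be held in memory.
cfg = StorageBufferConfig(memory_buffer_bytes_limit=4,
                          storage_buffer_bytes_limit=64)
```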

It should be possible for memory and storage buffers in the same filter to be interleaved, to account for a sequence such as:
1. configuration is 4 units of memory, 64 units of storage
2. source provides 8 units. 4 units are cached in memory, 4 units are cached to storage
3. recipient accepts 2 units. Now 2 units are cached in memory, 4 are cached in storage.
4. source provides 4 more units. 2 can be cached in the freed up memory, 2 in storage.
5. Now the desired output order is the first 2 from memory, 4 from storage, the other 2 from memory, the other 2 from storage.

Without allowing interleaving, the additional 2 units that only went in memory would have had to pass through storage unnecessarily.
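The sequence above can be reproduced with a toy model that keeps one FIFO of segments, each tagged with where it lives. This is a sketch only: `InterleavedBuffer` is a hypothetical name, units are abstract counts, and no real disk or asynchronous IO is involved.

```python
from collections import deque

class InterleavedBuffer:
    """FIFO of [location, units] segments; new data prefers free memory."""

    def __init__(self, memory_limit, storage_limit):
        self.memory_limit = memory_limit
        self.storage_limit = storage_limit
        self.memory_used = 0
        self.storage_used = 0
        self.segments = deque()

    def push(self, units):
        """Accept data from the source, spilling to storage once memory is full."""
        to_memory = min(units, self.memory_limit - self.memory_used)
        to_storage = units - to_memory
        if self.storage_used + to_storage > self.storage_limit:
            raise OverflowError("both buffers full; the source must be paused")
        if to_memory:
            self.segments.append(["memory", to_memory])
            self.memory_used += to_memory
        if to_storage:
            self.segments.append(["storage", to_storage])
            self.storage_used += to_storage

    def pop(self, units):
        """Release up to `units` in arrival order as (location, units) pairs."""
        released = []
        while units and self.segments:
            segment = self.segments[0]
            take = min(units, segment[1])
            segment[1] -= take
            units -= take
            if segment[0] == "memory":
                self.memory_used -= take
            else:
                self.storage_used -= take
            if segment[1] == 0:
                self.segments.popleft()
            released.append((segment[0], take))
        return released

    def layout(self):
        return [tuple(segment) for segment in self.segments]

# Usage: steps 1 to 4 of the sequence above.
buf = InterleavedBuffer(memory_limit=4, storage_limit=64)
buf.push(8)           # 4 in memory, 4 in storage
first = buf.pop(2)    # recipient accepts 2 units from memory
buf.push(4)           # 2 fit in freed memory, 2 go to storage
order = buf.layout()  # output order described in step 5
```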

Further, ideally it would be possible to abort queued file system writes, to account for a sequence such as:
1. configuration is 4 units of memory, 64 units of storage, 4 units of storage-queue
2. source provides 8 units. 4 units are cached in memory, 4 are queued to storage (but not yet written)
3. recipient accepts 4 units from memory. If we can't abort the queued writes we're now blocked until someone else's IO completes, then we write, then we read back (also extending the block on anyone else's IO-bound buffers).
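The scenario can be imitated in Python, whose `concurrent.futures.Future.cancel()` succeeds only while a task is still waiting in the queue. This is an analogy rather than a design for the async file manager; the single I/O thread and the names `slow_other_io` and `write_to_storage` are assumptions made for the demonstration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One I/O thread, so that "someone else's IO" can block our queued write.
single_io_thread = ThreadPoolExecutor(max_workers=1)
other_io_started = threading.Event()
other_io_may_finish = threading.Event()
written = []

def slow_other_io():
    """Stands in for an unrelated, long-running disk operation."""
    other_io_started.set()
    other_io_may_finish.wait()

def write_to_storage(chunk):
    """Stands in for the write of our 4 queued units."""
    written.append(chunk)

single_io_thread.submit(slow_other_io)
other_io_started.wait()  # the I/O thread is now busy
queued_write = single_io_thread.submit(write_to_storage, b"4 units")

# The recipient has drained memory, so the queued chunk can be served straight
# from the in-memory queue. Cancel the write instead of waiting for it.
aborted = queued_write.cancel()  # True because the task has not started

other_io_may_finish.set()
single_io_thread.shutdown(wait=True)
```

Because the write was cancelled before it started, nothing reaches storage and nothing needs to be read back.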

The ability to abort queued writes might not belong in a first-pass implementation, but I'll bear it in mind while implementing the async file manager.

This initial issue is functioning mostly as an announcement that I am starting work on this. A more complete design for comment will follow once I have an initial prototype working. (But early comments are welcome too!)

Filesystem buffering filter extension #19026
