Size-capped tables #37548

@alexey-milovidov

Use case

Logs and events.

Describe the solution you'd like

Four new table-level settings for MergeTree tables:

  • min_bytes_to_keep;
  • max_bytes_to_keep;
  • min_rows_to_keep;
  • max_rows_to_keep.

The oldest parts (by insertion time, i.e. by min block number) will be deleted if the total table size is more than max_..._to_keep, provided the size will not be less than min_..._to_keep after the deletion.
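
The rule above can be sketched in a few lines of Python. This is only an illustration, not ClickHouse code: the `Part` class and the `parts_to_drop` function are hypothetical names, and the way the four limits combine (keep dropping while any max limit is exceeded, stop as soon as a drop would violate any min limit) is an assumption, since the proposal does not spell it out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Part:
    """Hypothetical stand-in for a MergeTree data part."""
    min_block: int  # smaller value means the data was inserted earlier
    rows: int
    bytes: int


def parts_to_drop(
    parts: List[Part],
    min_rows: Optional[int] = None,
    max_rows: Optional[int] = None,
    min_bytes: Optional[int] = None,
    max_bytes: Optional[int] = None,
) -> List[Part]:
    """Return the oldest parts that may be dropped whole under the limits."""
    total_rows = sum(p.rows for p in parts)
    total_bytes = sum(p.bytes for p in parts)
    dropped: List[Part] = []

    # Walk the parts from oldest to newest.
    for part in sorted(parts, key=lambda p: p.min_block):
        over_limit = (
            (max_rows is not None and total_rows > max_rows)
            or (max_bytes is not None and total_bytes > max_bytes)
        )
        if not over_limit:
            break  # the table already fits, nothing more to drop

        rows_after = total_rows - part.rows
        bytes_after = total_bytes - part.bytes
        # Assumption: a drop is refused if it would go below any min limit.
        if min_rows is not None and rows_after < min_rows:
            break
        if min_bytes is not None and bytes_after < min_bytes:
            break

        dropped.append(part)
        total_rows, total_bytes = rows_after, bytes_after

    return dropped
```

For example, with three parts of 100 rows each, `max_rows=150` and `min_rows=150`, only the oldest part is dropped: the table stays at 200 rows, above the max limit, because dropping a second whole part would leave fewer than `min_rows`. This is the imprecision discussed under Caveats.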

Caveats

We do not cut the table precisely by size; we only drop the oldest whole parts, as that is cheap.
For large tables this may lead to a variation of up to 150 GB in size in the worst case.

This can be solved with some tuning (maybe even automatic) of the max part size to merge.

The excess records can also be cut off while merging the oldest parts. This can be done precisely for the number of rows, and pro rata (in proportion to the part's size) for the number of bytes.
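
A minimal sketch of that arithmetic, assuming the oldest rows of the merged part are the ones to cut. The function name `rows_to_cut` and its parameters are hypothetical, and the min_..._to_keep limits are ignored here for brevity. The row excess is applied exactly; the byte excess is converted to rows by the part's average bytes per row, so it is only an estimate.

```python
def rows_to_cut(
    part_rows: int,
    part_bytes: int,
    excess_rows: int = 0,
    excess_bytes: int = 0,
) -> int:
    """How many of the oldest rows to skip while merging the oldest part.

    excess_rows / excess_bytes: how far the table is above
    max_rows_to_keep / max_bytes_to_keep (0 if not above).
    """
    if part_rows <= 0:
        return 0

    # Row limit: exact, one row of excess means one row to cut.
    cut = max(0, excess_rows)

    # Byte limit: pro rata, i.e. ceil(excess_bytes * part_rows / part_bytes),
    # computed in integers to avoid floating-point rounding.
    if excess_bytes > 0 and part_bytes > 0:
        cut_by_bytes = (excess_bytes * part_rows + part_bytes - 1) // part_bytes
        cut = max(cut, cut_by_bytes)

    # Never cut more rows than the part contains.
    return min(cut, part_rows)
```

For a part of 1000 rows and 10000 bytes, a 250-row excess cuts exactly 250 rows, while a 2500-byte excess also maps to 250 rows, but only approximately, since rows are not guaranteed to take equal space on disk.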

Additional Context

This is for the simple case of removing the oldest inserted data.
Note that removing data by some values (e.g. a datetime field), even for out-of-order data, is already possible with TTL.

Metadata

Labels

feature, warmup task (The task for new ClickHouse team members. Low risk, moderate complexity, no urgency.)
