Epic: Last N Value Cache #25091

@hiltontj

Overview

A Last N Value Cache will let users retrieve the most recent values of many series (selected either by identifier or by group) very quickly, with a target latency under 10 ms.

Users should be able to specify, for a given table and set of columns, the last N values they want to keep cached in RAM. This feature will be available in both open source and Pro, with limitations in the former.

For a given table, the user would specify the lookup key (i.e. the columns to look up by), the number of values to cache, and the columns (either by name or *) to include in the cache. The time of each value will always be included.

Cache Creation

To create a cache, users specify:

  1. Name of the table and database
  2. A name for the cache (defaults to <table_name>_<key_columns>_last_cache)
  3. Key columns (default to the series key, the tag set sort order, or the tags in lexicographic order)
  4. Number of values to store (defaults to 1, maximum 10)
  5. Columns to cache (default to all non-key columns)
  6. A Time to Live (TTL), which specifies how long values live in the cache before they are evicted (defaults to 4 hours)
  7. (Pro only) How far back to query to load cache on boot-up
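
The creation parameters and defaults listed above could be captured in a small configuration object. The following is a minimal sketch, not the actual catalog representation: `LastCacheConfig`, `make_config`, and the choice to join key column names with underscores in the default cache name are all assumptions made for illustration. The Pro-only boot-up lookback is left out.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional, Sequence, Tuple

MAX_COUNT = 10                    # upper limit on values stored per key
DEFAULT_TTL = timedelta(hours=4)  # default TTL from the list above


@dataclass(frozen=True)
class LastCacheConfig:
    """Hypothetical container for the cache creation parameters."""
    database: str
    table: str
    key_columns: Tuple[str, ...]
    name: str
    count: int = 1
    # None stands for "all non-key columns" (the default).
    value_columns: Optional[Tuple[str, ...]] = None
    ttl: timedelta = DEFAULT_TTL


def make_config(
    database: str,
    table: str,
    key_columns: Sequence[str],
    name: Optional[str] = None,
    count: int = 1,
    value_columns: Optional[Sequence[str]] = None,
    ttl: timedelta = DEFAULT_TTL,
) -> LastCacheConfig:
    """Apply the defaults and the count limit, then build the config."""
    if not 1 <= count <= MAX_COUNT:
        raise ValueError(f"count must be between 1 and {MAX_COUNT}, got {count}")
    if name is None:
        # Assumption: key column names are joined with underscores.
        name = f"{table}_{'_'.join(key_columns)}_last_cache"
    return LastCacheConfig(
        database=database,
        table=table,
        key_columns=tuple(key_columns),
        name=name,
        count=count,
        value_columns=None if value_columns is None else tuple(value_columns),
        ttl=ttl,
    )
```

For example, `make_config("mydb", "cpu", ["host", "region"])` would produce a cache named `cpu_host_region_last_cache` that stores one value per key with a four-hour TTL.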

We would like the front-end for this to be available via a REST API.

The configuration of each cache will be stored in the catalog.

Populating the Cache

In open source, the cache should be populated on a write-through basis while the server is running. Pro will do the same, and will additionally be able to fill the cache from historical data on boot-up.
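
To make the write-through behaviour concrete, here is a minimal in-memory sketch. It is not the proposed implementation: the `LastNCache` class and its methods are hypothetical, rows are assumed to arrive in time order, and the TTL is assumed to be measured from the moment a row enters the cache rather than from the row's own timestamp.

```python
import time
from collections import deque


class LastNCache:
    """Hypothetical write-through cache holding the last N rows per key."""

    def __init__(self, key_columns, count=1, ttl_seconds=4 * 3600, clock=time.time):
        self.key_columns = tuple(key_columns)
        self.count = count
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        # key tuple -> deque of (inserted_at, values), newest first
        self._entries = {}

    def write(self, row):
        """Called on every incoming write; row is a dict of column -> value."""
        key = tuple(row[c] for c in self.key_columns)
        values = {c: v for c, v in row.items() if c not in self.key_columns}
        bucket = self._entries.setdefault(key, deque(maxlen=self.count))
        # With maxlen set, appendleft drops the oldest entry from the right.
        bucket.appendleft((self.clock(), values))

    def last(self, key):
        """Return the unexpired cached values for a key, newest first."""
        now = self.clock()
        bucket = self._entries.get(tuple(key), ())
        return [values for inserted_at, values in bucket
                if now - inserted_at < self.ttl_seconds]
```

This sketch filters expired entries at read time only. A real cache would also need to evict them in the background so that memory is reclaimed for keys that are never read again.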

Cache Queries

Querying the cache will require a specialized query. The query syntax could look like so:

SELECT foo, time FROM last_cache('some_table');
SELECT foo, time FROM last_cache('some_table') WHERE cola IN ('pepsi', 'coke');
SELECT foo, time FROM last_cache('some_table') WHERE key_col = 'someval';

This is a use-case for DataFusion's User-Defined Table Functions (UDTF).

In some cases, query predicates may be handled directly by the cache's TableProvider/TableFunctionImpl, while more complicated predicates could just be passed back up to the query engine, but where we draw that line remains TBD.
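
One possible way to draw that line is sketched below, purely as an illustration since the actual split remains TBD. It assumes the cache handles only `=` and `IN` predicates on key columns and returns everything else to the query engine. The `(column, operator, value)` tuple representation and the `split_predicates` function are hypothetical and are not DataFusion APIs.

```python
def split_predicates(predicates, key_columns):
    """Partition predicates into those the cache could evaluate itself
    and those left for the query engine.

    Assumption: only "=" and "in" on key columns are handled by the cache.
    """
    key_columns = set(key_columns)
    handled, residual = [], []
    for column, op, value in predicates:
        if column in key_columns and op in ("=", "in"):
            handled.append((column, op, value))
        else:
            residual.append((column, op, value))
    return handled, residual
```

Under this assumption, a predicate such as `key_col = 'someval'` would be answered by a direct key lookup in the cache, while a range predicate on `time` would be applied by the query engine to the rows the cache returns.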

Other Requirements

  • The key columns, if not specified on creation of the cache, will default to the series key (if present), then the user-defined sort order (if present), and otherwise the tags in lexicographical order.
  • Only string, int, or bool columns can be used as key columns.
  • Value columns in the cache, if not specified on creation, will default to all non-key columns in the table, and fields added to the table later should be included in the cache as they appear.
  • (Pro) Populating the last N value cache on boot-up can be disabled at server start-up for recovery purposes.
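
The key column rules above can be sketched as two small helpers. This is an illustration under stated assumptions: the function names, the way table metadata is passed in, and the type labels `"string"`, `"int"`, and `"bool"` are hypothetical stand-ins for whatever the catalog actually exposes.

```python
ALLOWED_KEY_TYPES = {"string", "int", "bool"}


def default_key_columns(series_key=None, sort_order=None, tags=()):
    """Pick default key columns: the series key if present, then the
    user-defined sort order if present, else tags in lexicographic order."""
    if series_key:
        return list(series_key)
    if sort_order:
        return list(sort_order)
    return sorted(tags)


def validate_key_columns(key_columns, column_types):
    """Reject key columns that are unknown or not string, int, or bool.

    column_types maps column name -> type label (hypothetical labels).
    """
    bad = [c for c in key_columns if column_types.get(c) not in ALLOWED_KEY_TYPES]
    if bad:
        raise ValueError(f"unsupported key columns: {bad}")
    return list(key_columns)
```

For a table with tags `region` and `host`, no series key, and no user-defined sort order, `default_key_columns(tags=["region", "host"])` returns `["host", "region"]`.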
