refactor: improvements to last cache implementation by hiltontj · Pull Request #25133 · influxdata/influxdb

hiltontj · 2024-07-09T19:09:51Z

Three commits in this PR make improvements that the last cache needed:

e08c67c - makes cache creation more idempotent, so if a cache with the exact same parameters as an existing cache is created, it will not throw an error, but instead do nothing
9ae4a4f - stores the Instants used for cache value expiration in a single VecDeque for the entire store, instead of alongside every value in every column buffer
b0ccf9e - drops empty branches of the cache hierarchy so that derelict keys do not cause the cache to bloat.
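A minimal sketch of the pruning idea in b0ccf9e, assuming a simplified hierarchy where each leaf holds plain write times in seconds. `Branch`, `evict`, and `store_count` are hypothetical names for illustration and are not taken from last_cache.rs.

```rust
use std::collections::HashMap;

/// Hypothetical cache hierarchy: key levels that end in a store of write times (seconds).
#[derive(Debug)]
pub enum Branch {
    Key(HashMap<String, Branch>),
    Store(Vec<u64>),
}

impl Branch {
    /// Drops entries older than `cutoff`, then removes any child left with nothing beneath it.
    /// Returns true when this node itself is now empty, so the parent can drop it too.
    pub fn evict(&mut self, cutoff: u64) -> bool {
        match self {
            Branch::Store(times) => {
                times.retain(|t| *t >= cutoff);
                times.is_empty()
            }
            Branch::Key(children) => {
                // `retain` keeps a child only if it still holds data after its own eviction.
                children.retain(|_, child| !child.evict(cutoff));
                children.is_empty()
            }
        }
    }

    /// Number of stores still reachable from this node.
    pub fn store_count(&self) -> usize {
        match self {
            Branch::Store(_) => 1,
            Branch::Key(children) => children.values().map(Branch::store_count).sum(),
        }
    }
}
```

Because each level reports whether it became empty, a key value whose store has fully expired disappears along with any parent levels that held nothing else.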

Note: this PR was originally stacked on #25125, so it inherited all of that PR's commits when it was squashed/merged. This one will also get squashed/merged, so all commits other than those listed above can be ignored and have no effect on the PR diff.

Each last cache holds a ring buffer for each column in an index map, which preserves the insertion order for faster record batch production. The ring buffer uses a custom type to handle the different supported data types that we can have in the system.
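To make the per-column ring buffer concrete, here is a rough sketch that uses the standard library's `VecDeque` as the ring and an enum to cover a few data types. `Value`, `ColumnData`, `ColumnBuffer`, and `push_bounded` are hypothetical names, and the real custom type in the PR may differ.

```rust
use std::collections::VecDeque;

/// A single incoming value (hypothetical; only three data types are shown).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    I64(i64),
    F64(f64),
    Str(String),
}

/// One typed ring buffer per column; `None` marks a null.
#[derive(Debug, PartialEq)]
pub enum ColumnData {
    I64(VecDeque<Option<i64>>),
    F64(VecDeque<Option<f64>>),
    Str(VecDeque<Option<String>>),
}

#[derive(Debug)]
pub struct ColumnBuffer {
    pub data: ColumnData,
    pub capacity: usize,
}

/// Push to the front and drop the oldest entries beyond `capacity`.
fn push_bounded<T>(buf: &mut VecDeque<T>, capacity: usize, value: T) {
    buf.push_front(value);
    buf.truncate(capacity);
}

impl ColumnBuffer {
    /// Returns false (and stores nothing) when the value's type does not match the column.
    pub fn push(&mut self, value: Option<Value>) -> bool {
        let cap = self.capacity;
        match (&mut self.data, value) {
            (ColumnData::I64(b), Some(Value::I64(v))) => push_bounded(b, cap, Some(v)),
            (ColumnData::I64(b), None) => push_bounded(b, cap, None),
            (ColumnData::F64(b), Some(Value::F64(v))) => push_bounded(b, cap, Some(v)),
            (ColumnData::F64(b), None) => push_bounded(b, cap, None),
            (ColumnData::Str(b), Some(Value::Str(v))) => push_bounded(b, cap, Some(v)),
            (ColumnData::Str(b), None) => push_bounded(b, cap, None),
            _ => return false,
        }
        true
    }
}
```

The newest value sits at the front, so producing the last N values is a front-to-back read of each column.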

LastCacheProvider is the API used to create last caches and write table batches to them. It uses a two-layer RwLock/HashMap: the first for the database, and the second layer for the table within the database. This allows for table-level locks when writing in buffered data, and only gets a database-level lock when creating a cache (and in future, when removing them as well).
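The two-layer locking can be sketched roughly as follows, assuming the inner lock guards the map of tables for one database. `ToyProvider`, `ToyCache`, and the method names are hypothetical stand-ins, not the actual LastCacheProvider API.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical stand-in for a real cache; it only counts pushed rows.
#[derive(Debug, Default)]
pub struct ToyCache {
    pub rows_seen: usize,
}

type TableMap = HashMap<String, ToyCache>;

/// Database name -> table name -> cache, with a lock at each layer.
#[derive(Debug, Default)]
pub struct ToyProvider {
    dbs: RwLock<HashMap<String, RwLock<TableMap>>>,
}

impl ToyProvider {
    /// Creating a cache may add a database entry, so it takes the outer write lock.
    pub fn create_cache(&self, db: &str, table: &str) {
        let mut dbs = self.dbs.write().unwrap();
        let tables = dbs.entry(db.to_string()).or_default();
        // `get_mut` needs no locking: the outer write guard already gives exclusive access.
        tables.get_mut().unwrap().entry(table.to_string()).or_default();
    }

    /// Writing takes only a read lock on the outer map, then write-locks the inner map.
    /// Returns false when no cache exists for the table.
    pub fn write_row(&self, db: &str, table: &str) -> bool {
        let dbs = self.dbs.read().unwrap();
        let Some(tables) = dbs.get(db) else { return false };
        let mut tables = tables.write().unwrap();
        if let Some(cache) = tables.get_mut(table) {
            cache.rows_seen += 1;
            return true;
        }
        false
    }

    /// Rows seen by the cache for `table`, if one exists.
    pub fn rows_seen(&self, db: &str, table: &str) -> Option<usize> {
        let dbs = self.dbs.read().unwrap();
        let tables = dbs.get(db)?.read().unwrap();
        let seen = tables.get(table).map(|cache| cache.rows_seen);
        seen
    }
}
```

Writers to different databases only share the outer read lock in this sketch, so they do not block one another; only cache creation needs exclusive access to the outer map.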

Added basic APIs on the write buffer to access the last cache, and then a test to the last_cache module to see that it works with a simple example.

Addressed three parts of PR feedback:
1. Remove double-lock on cache map
2. Re-order the get when writing to the cache to be outside the loop
3. Move the time check into the cache itself

This refactors the last cache to use a nested caching structure, where the key columns for a given cache are used to create a hierarchy of nested maps, terminating in the actual store for the values in the cache. Access to the cache is done via a set of predicates which can optionally specify the key column values at any level in the cache hierarchy to only gather record batches from children of that node in the cache.

Some todos:
- Need to handle the TTL
- Need to move the TableProvider impl up to the LastCache type
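A small sketch of the nested structure and predicate lookup, assuming string key values and a plain `Vec<f64>` in place of the real value store. `CacheNode`, `insert`, and `collect` are hypothetical names; an absent predicate at a level is modelled as `None`, meaning every child at that level is visited.

```rust
use std::collections::HashMap;

/// Either another level of key column values, or the store at the bottom of the hierarchy.
#[derive(Debug)]
pub enum CacheNode {
    Key(HashMap<String, CacheNode>),
    Store(Vec<f64>),
}

impl CacheNode {
    /// Insert a value under the given key column values, creating levels as needed.
    pub fn insert(&mut self, keys: &[&str], value: f64) {
        match self {
            CacheNode::Store(values) => values.push(value),
            CacheNode::Key(children) => {
                // A row with too few key values is ignored in this sketch.
                let Some((first, rest)) = keys.split_first() else { return };
                let child = children.entry(first.to_string()).or_insert_with(|| {
                    if rest.is_empty() {
                        CacheNode::Store(Vec::new())
                    } else {
                        CacheNode::Key(HashMap::new())
                    }
                });
                child.insert(rest, value);
            }
        }
    }

    /// Gather values from every store that matches the per-level predicates.
    /// `None` (or a missing predicate) at a level matches all children of that level.
    pub fn collect(&self, predicates: &[Option<&str>], out: &mut Vec<f64>) {
        match self {
            CacheNode::Store(values) => out.extend(values),
            CacheNode::Key(children) => {
                let (pred, rest) = match predicates.split_first() {
                    Some((p, rest)) => (*p, rest),
                    None => (None, predicates),
                };
                match pred {
                    Some(value) => {
                        if let Some(child) = children.get(value) {
                            child.collect(rest, out);
                        }
                    }
                    None => {
                        for child in children.values() {
                            child.collect(rest, out);
                        }
                    }
                }
            }
        }
    }
}
```

With key columns such as region and host, a predicate on region alone narrows the walk to one subtree and then fans out over every host beneath it.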

This re-writes the datafusion TableProvider implementation on the correct type, i.e., the LastCache, and adds conversion from the filter Expr's to the Predicate type for the cache.

Last caches walk their entries and evict expired ones when writes come in.

Changed key columns so that they do not accept null values, i.e., rows that are pushed that are missing key column values will be ignored. When producing record batches for a cache, if not all key columns are used in the predicate, then this change makes it so that the non-predicate key columns are produced as columns in the outputted record batches. A test with a few cases showing this was added.

Ensure key columns in the last cache that are not included in the predicate are emitted in the RecordBatches as a column. Cleaned up and added comments to the new test.

Added two tests, as per commit title. Also moved the eviction process to a separate function so that it was not being done on every write to the cache, which could be expensive, and this ensures that entries are evicted regardless of whether writes are coming in or not.

CacheAlreadyExists errors were based only on the database and table names and did not include the cache name, which was not correct.

This also adds explicit support for series key columns to distinguish them from normal tags in terms of nullability. A test was added to check that nulls work.

Support the addition of new fields to the last cache, for caches that do not have a specified set of value columns. A test was added along with the changes.

Enabling addition of new fields to a last cache made the insertion order guarantee of the IndexMap break down. It could not be relied upon anymore so this commit removes reference to that fact, despite still using the IndexMap type, and strips out the schema from the inner LastCacheStore type of the LastCache. Now, the outer LastCache schema is relied on for producing RecordBatches, which requires a lookup to the inner LastCacheStore's HashMap for each field in the schema. This may not be as convenient as iterating over the map as before, but trying to manage the disparate schema, and maintaining the map ordering was making the code too complicated. This seems like a reasonable compromise for now, until we see the need to optimize. The IndexMap is still used for its fast iteration and lookup characteristics. The test that checks for new field ordering behaviour was modified to be correct.
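The schema-driven lookup described above can be illustrated with a short sketch, assuming a store that maps field names to a single `f64` value. The function name `row_in_schema_order` is hypothetical and the real store holds column buffers rather than single values.

```rust
use std::collections::HashMap;

/// Produce one output row in schema order by looking each field up in the store.
/// Fields the store has not seen yet come back as `None` (null).
pub fn row_in_schema_order(schema: &[&str], store: &HashMap<String, f64>) -> Vec<Option<f64>> {
    schema
        .iter()
        .map(|field| store.get(*field).copied())
        .collect()
}
```

The output order comes entirely from the schema slice, so the map's own iteration order no longer matters, at the cost of one lookup per field.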

Some renaming of variables was done to clarify meaning as well.

Last cache creation is more idempotent: if a cache is created, and a later attempt is made to create it again with the same parameters, it will not result in an error.
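One way to picture the idempotent creation path is the sketch below, which assumes a registry keyed by cache name and a configuration that can be compared for equality. `CacheConfig`, `Registry`, `Created`, and `CreateError` are hypothetical names; the PR's own `compare_config` returns more descriptive errors than this.

```rust
use std::collections::HashMap;

/// Hypothetical cache parameters; the real configuration has more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CacheConfig {
    pub key_columns: Vec<String>,
    pub count: usize,
    pub ttl_secs: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Created {
    New,
    AlreadyExisted,
}

#[derive(Debug, PartialEq, Eq)]
pub enum CreateError {
    ConfigMismatch { name: String },
}

#[derive(Debug, Default)]
pub struct Registry {
    caches: HashMap<String, CacheConfig>,
}

impl Registry {
    /// Creating a cache twice with identical parameters is a no-op rather than an error.
    pub fn create(&mut self, name: &str, config: CacheConfig) -> Result<Created, CreateError> {
        if let Some(existing) = self.caches.get(name) {
            return if *existing == config {
                Ok(Created::AlreadyExisted)
            } else {
                Err(CreateError::ConfigMismatch { name: name.to_string() })
            };
        }
        self.caches.insert(name.to_string(), config);
        Ok(Created::New)
    }
}
```

Only a request that reuses a name with different parameters is rejected, which is the case where silently doing nothing would hide a real conflict.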

The last cache column buffers were storing an instant next to each buffered value, which is unnecessary and not space efficient. This makes it so the LastCacheStore holds a single buffer of Instants and manages TTLs using that.
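A rough sketch of the single-buffer approach, assuming every push adds one entry (possibly null) to every column so that all buffers stay aligned with the shared buffer of Instants. `ToyStore` and its methods are hypothetical and greatly simplified compared with LastCacheStore.

```rust
use std::collections::{HashMap, VecDeque};
use std::time::{Duration, Instant};

/// Hypothetical store: one shared buffer of Instants instead of one Instant per value.
pub struct ToyStore {
    columns: HashMap<String, VecDeque<Option<f64>>>,
    instants: VecDeque<Instant>,
    ttl: Duration,
}

impl ToyStore {
    pub fn new(column_names: &[&str], ttl: Duration) -> Self {
        let columns = column_names
            .iter()
            .map(|n| (n.to_string(), VecDeque::new()))
            .collect();
        Self { columns, instants: VecDeque::new(), ttl }
    }

    /// Push one row, newest at the front. Columns missing from the row get a null
    /// so that index i in every column corresponds to index i in `instants`.
    pub fn push(&mut self, now: Instant, row: &[(&str, f64)]) {
        for (name, buf) in self.columns.iter_mut() {
            let value = row
                .iter()
                .find(|(n, _)| *n == name.as_str())
                .map(|(_, v)| *v);
            buf.push_front(value);
        }
        self.instants.push_front(now);
    }

    /// Pop rows from the back while the oldest Instant has outlived the TTL.
    pub fn evict_expired(&mut self, now: Instant) {
        while let Some(&oldest) = self.instants.back() {
            if now.duration_since(oldest) < self.ttl {
                break;
            }
            self.instants.pop_back();
            for buf in self.columns.values_mut() {
                buf.pop_back();
            }
        }
    }

    /// Number of rows currently held.
    pub fn len(&self) -> usize {
        self.instants.len()
    }

    /// Number of entries held for one column, if the column exists.
    pub fn column_len(&self, name: &str) -> Option<usize> {
        self.columns.get(name).map(|buf| buf.len())
    }
}
```

Because rows are pushed in time order, the oldest Instant is always at the back, so eviction can stop at the first entry that is still within the TTL.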

pauldix

great stuff

pauldix · 2024-07-10T17:25:21Z

influxdb3_write/src/last_cache.rs

    }

+    /// Compare this cache's configuration with that of another
+    fn compare_config(&self, other: &Self) -> Result<(), Error> {


nice descriptive errors 😄

hiltontj added 30 commits June 27, 2024 12:01

test: APIs on write buffer and test for last cache

a602521

Added basic APIs on the write buffer to access the last cache, and then a test to the last_cache module to see that it works with a simple example.

docs: add some doc comments to last_cache

effc6dc

chore: clippy

5021777

chore: one small comment on IndexMap

978c501

chore: clean up some stale comments

7f68745

refactor: part of PR feedback

28caa5f

Addressed three parts of PR feedback:
1. Remove double-lock on cache map
2. Re-order the get when writing to the cache to be outside the loop
3. Move the time check into the cache itself

refactor: TableProvider impl to LastCache

5607951

This re-writes the datafusion TableProvider implementation on the correct type, i.e., the LastCache, and adds conversion from the filter Expr's to the Predicate type for the cache.

feat: support TTL in last cache

c6deb3b

Last caches walk their entries and evict expired ones when writes come in.

refactor: add panic when unexpected predicate used

fd6ea48

refactor: small naming convention change

0865573

fix: last cache key column query output

ee54544

Ensure key columns in the last cache that are not included in the predicate are emitted in the RecordBatches as a column. Cleaned up and added comments to the new test.

chore: clippy and some un-needed code

979f717

fix: clean up some logic errors in last_cache

678bbd0

test: add invalid predicate test cases to last_cache

eb6ed24

test: last_cache with field key columns

543c608

test: last_cache uses series key for default keys

fabb637

test: last_cache uses tag set as default keys

a3ef036

docs: add doc comments to last_cache

1103026

fix: logic error in last cache creation

8182a4c

CacheAlreadyExists errors were based only on the database and table names and did not include the cache name, which was not correct.

docs: add some comments to last cache create fn

a30a19e

Merge branch 'main' into hiltontj/lastcache-impl

e869049

feat: support null values in last cache

24c0a8b

This also adds explicit support for series key columns to distinguish them from normal tags in terms of nullability. A test was added to check that nulls work.

fix: reset last cache last time when ttl evicts all data

89b5e7c

feat: support last caches that can add new fields

89410d0

feat: support new values in last cache

b0fdd2a

Support the addition of new fields to the last cache, for caches that do not have a specified set of value columns. A test was added along with the changes.

hiltontj added 8 commits July 5, 2024 16:35

chore: clippy

d6118d4

docs: add comments throughout new last cache code

8ad878b

fix: last cache schema merging when new fields added

a129d00

refactor: use has set instead of scanning entire row on each push

b397615

Some renaming of variables was done to clarify meaning as well.

refactor: make cache creation more idempotent

e08c67c

Last cache creation is more idempotent: if a cache is created, and a later attempt is made to create it again with the same parameters, it will not result in an error.

refactor: only store a single buffer of Instants

9ae4a4f

The last cache column buffers were storing an instant next to each buffered value, which is unnecessary and not space efficient. This makes it so the LastCacheStore holds a single buffer of Instants and manages TTLs using that.

refactor: clean up derelict cache members on eviction

b0ccf9e

hiltontj added the v3 label Jul 9, 2024

hiltontj requested review from mgattozzi and pauldix July 9, 2024 19:09

hiltontj self-assigned this Jul 9, 2024

Base automatically changed from hiltontj/lastcache-new-vals to main July 9, 2024 20:35

Merge branch 'main' into hiltontj/lastcache-impl-pt-2

7df91c0

pauldix approved these changes Jul 10, 2024

hiltontj merged commit 0b8fbf4 into main Jul 10, 2024

hiltontj deleted the hiltontj/lastcache-impl-pt-2 branch July 10, 2024 17:45

hiltontj merged 39 commits into main from hiltontj/lastcache-impl-pt-2
