re_datastore: introduce clustering keys/components #559

@teh-cmc

The idea is to have a notion of "primary" component not only when querying, but also when inserting.
When inserting, this "primary" component (or rather "clustering key", in more Cassandra-like parlance) controls how the data is sorted within a row (e.g. by instance id) before insertion.

This paves the way for optimizations when joining results in the query processor.

It's also our opportunity to return an error if:

  • the different columns have different lengths
  • the payload being inserted doesn't contain the sorting key
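The two checks above, followed by the sort on the clustering key, could be sketched roughly as follows. This is a minimal illustration and not re_datastore's actual API: `Payload`, `InsertError` and `prepare_insert` are hypothetical names, and component values are simplified to `u64`.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum InsertError {
    /// The payload does not contain the clustering key component.
    MissingClusteringKey { key: String },
    /// A column's length differs from the clustering key's length.
    MismatchedLengths { component: String, expected: usize, got: usize },
}

/// One row's payload: component name -> one value per instance.
/// Values are simplified to `u64` for the sake of the sketch.
pub type Payload = BTreeMap<String, Vec<u64>>;

/// Validates the payload, then returns a copy in which every column is
/// reordered so that the clustering key column is sorted ascending.
pub fn prepare_insert(payload: &Payload, clustering_key: &str) -> Result<Payload, InsertError> {
    // Reject payloads that do not contain the clustering key.
    let keys = payload
        .get(clustering_key)
        .ok_or_else(|| InsertError::MissingClusteringKey { key: clustering_key.to_owned() })?;

    // Reject payloads whose columns have different lengths.
    let expected = keys.len();
    for (name, values) in payload {
        if values.len() != expected {
            return Err(InsertError::MismatchedLengths {
                component: name.clone(),
                expected,
                got: values.len(),
            });
        }
    }

    // Compute the permutation that sorts the clustering key (stable sort).
    let mut order: Vec<usize> = (0..expected).collect();
    order.sort_by_key(|&i| keys[i]);

    // Apply that same permutation to every column, the key included.
    let sorted: Payload = payload
        .iter()
        .map(|(name, values)| {
            let column: Vec<u64> = order.iter().map(|&i| values[i]).collect();
            (name.clone(), column)
        })
        .collect();

    Ok(sorted)
}
```

Sorting through a shared permutation keeps all columns aligned with one another, so position `i` in every column still refers to the same instance after the sort.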

This drastically reduces how permissive the datastore is, which in turn makes it much easier to reason about and test. For example, you can always convert query results into dataframes and join them on the clustering key, since it is guaranteed to be there.
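As an illustration of the kind of join this enables, here is a minimal sketch of a merge join over two columns that are already sorted by clustering key. The `merge_join` function and the `(key, value)` column representation are assumptions made for the example, not part of the datastore or the query processor. It also assumes keys are unique within a column.

```rust
use std::cmp::Ordering;

/// Inner-joins two columns given as `(clustering key, value)` pairs.
/// Both inputs must already be sorted by key, with no duplicate keys.
pub fn merge_join<A: Copy, B: Copy>(left: &[(u64, A)], right: &[(u64, B)]) -> Vec<(u64, A, B)> {
    let mut out = Vec::new();
    let (mut i, mut j) = (0, 0);

    // Walk both columns in lockstep; each element is visited at most once.
    while i < left.len() && j < right.len() {
        match left[i].0.cmp(&right[j].0) {
            // Key only present on the left so far: advance the left cursor.
            Ordering::Less => i += 1,
            // Key only present on the right so far: advance the right cursor.
            Ordering::Greater => j += 1,
            // Same key on both sides: emit the joined row.
            Ordering::Equal => {
                out.push((left[i].0, left[i].1, right[j].1));
                i += 1;
                j += 1;
            }
        }
    }

    out
}
```

Because both inputs arrive sorted, the join is a single linear pass with no hashing or re-sorting. Joining `[(0, 10), (1, 11), (3, 13)]` with `[(1, 'r'), (2, 'g'), (3, 'b')]` keeps only keys 1 and 3.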