[RFC] Use global `LowCardinality` dictionary for optimizations if it is small enough #72717

**Use case**

Optimization of aggregation and JOINs over `LowCardinality` columns that have a small number of unique values. In these cases a `LowCardinality` column can usually be replaced with `Enum`, but that is less convenient because it requires changing the schema every time the set of possible values changes.

**Describe the solution you'd like**

- Build a global dictionary for `LowCardinality` columns that are suitable for the optimization (those in the `GROUP BY` key or in the `ON` section of a `JOIN`), up to a certain size (abandon the optimization if the dictionary becomes too large). This requires reading the dictionaries in a new stage of query execution: after filtering parts by primary key and before pipeline execution starts. Dictionaries can be pushed down and reused. The global dictionary can also be cached in `MergeTreeData`.
- Push the global dictionary down to the `LowCardinality` serializations in data parts. Re-encode the positions of `LowCardinality` columns against the new dictionary and set it as their shared dictionary.
- Use positions in the global dictionary as keys for the hash table in aggregation or in JOIN. This allows choosing a better-suited hash method:
     - a method with a single numeric key (often `UInt8`, which has its own aggregation optimization) instead of the specialized `LowCardinality` method, in the case of one column
     - a method with fixed numeric keys, in the case of aggregation by `LowCardinality` and numeric columns
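
To make the three steps above concrete, here is a minimal Python sketch of the idea. It is not ClickHouse code: the `Part` model, the function names (`build_global_dictionary`, `reencode`, `count_by_position`) and the `max_size` parameter are all hypothetical, and details of the real format (such as reserved default/NULL dictionary entries) are ignored.

```python
from typing import Optional

# Hypothetical model of a LowCardinality column in one data part:
# (per-part dictionary, positions into that dictionary).
Part = tuple[list[str], list[int]]


def build_global_dictionary(parts: list[Part], max_size: int) -> Optional[list[str]]:
    """Merge per-part dictionaries; return None if the result exceeds max_size."""
    index: dict[str, int] = {}
    for dictionary, _ in parts:
        for value in dictionary:
            if value not in index:
                if len(index) >= max_size:
                    return None  # too large: abandon the optimization
                index[value] = len(index)
    return list(index)  # dict keys keep insertion order


def reencode(part: Part, global_dict: list[str]) -> list[int]:
    """Rewrite a part's positions so that they point into the global dictionary."""
    global_index = {value: i for i, value in enumerate(global_dict)}
    dictionary, positions = part
    # Translate each local dictionary entry once, then remap every row.
    local_to_global = [global_index[value] for value in dictionary]
    return [local_to_global[p] for p in positions]


def count_by_position(parts: list[Part], global_dict: list[str]) -> dict[str, int]:
    """GROUP BY key with count(): positions index a flat array, no string hashing."""
    counts = [0] * len(global_dict)
    for part in parts:
        for pos in reencode(part, global_dict):
            counts[pos] += 1
    return {global_dict[i]: c for i, c in enumerate(counts) if c}


parts = [(["a", "b"], [0, 1, 1, 0]), (["b", "c"], [0, 1, 1])]
global_dict = build_global_dictionary(parts, max_size=256)
if global_dict is not None:
    result = count_by_position(parts, global_dict)
```

With a dictionary of at most 256 entries every position fits in a `UInt8`, so the aggregation state can be a flat array indexed by position, which is what the single-numeric-key method relies on. When `build_global_dictionary` returns `None`, the query would fall back to the existing `LowCardinality` method.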


