
Improvements for projections #74234

@alexey-milovidov

Support for projections with WHERE

Suppose you have a projection with a WHERE condition:

CREATE TABLE events
(
    time DateTime,
    event_type String,
    message String,
    ...
    
    PROJECTION proj
    (
        SELECT time, message
        WHERE event_type = 'pageview'
    )
)
ENGINE = MergeTree ORDER BY time

If a query's WHERE condition guarantees that the projection's WHERE condition is satisfied (it is either the same condition or a chain of AND operators that contains it), then the query should use the projection.
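
As an illustration of this rule, assuming the `events` table and the projection `proj` defined above, the queries below sketch which WHERE clauses would and would not qualify. The time bound and the 'click' value are made up for the example, and the comments describe the proposed behavior, not what a current server is guaranteed to do.

```sql
-- Same condition as the projection: expected to be eligible for proj.
SELECT time, message
FROM events
WHERE event_type = 'pageview';

-- AND chain that contains the projection's condition: expected to be eligible.
SELECT time, message
FROM events
WHERE event_type = 'pageview' AND time >= '2025-01-01 00:00:00';

-- OR does not guarantee the projection's condition: not expected to be eligible.
SELECT time, message
FROM events
WHERE event_type = 'pageview' OR event_type = 'click';
```

All three queries select only `time` and `message`, since those are the only columns stored in `proj`.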

If several projections are suitable for a query, it should use the one with the least estimated data to read.
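
A possible scenario for this choice, sketched under assumptions: `proj_checkout` is a hypothetical second projection added next to `proj`, and the LIKE pattern is invented for the example. The WHERE clause inside the projection is the syntax proposed in this issue.

```sql
-- Hypothetical narrower projection; WHERE in a projection is the proposed feature.
ALTER TABLE events ADD PROJECTION proj_checkout
(
    SELECT time, message
    WHERE event_type = 'pageview' AND message LIKE '%checkout%'
);

-- Build the new projection for parts that already exist.
ALTER TABLE events MATERIALIZE PROJECTION proj_checkout;

-- The query condition is an AND chain containing proj's condition,
-- and it is the same condition as proj_checkout's.
-- Both projections are suitable; proj_checkout is expected to hold
-- fewer rows, so it would be the one chosen.
SELECT time, message
FROM events
WHERE event_type = 'pageview' AND message LIKE '%checkout%';
```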

Support for compression and secondary indices

We want to support specifying arbitrary compression codecs and additional indices in projections.
For this purpose, let's introduce a syntax with an explicit structure for the projection's columns:

CREATE TABLE events
(
    time DateTime,
    event_type String,
    message String,
    size UInt64,
    ...
    
    PROJECTION proj
    (
        date Date,
        message String CODEC(ZSTD(3)),
        INDEX ix (size) TYPE minmax,
    )
    AS
    (
        SELECT time::Date AS date, message
        GROUP BY date
        ORDER BY date
    )
)
ENGINE = MergeTree ORDER BY time

Here the part

    (
        date Date,
        message String CODEC(ZSTD(3)),
        INDEX ix (size) TYPE minmax,
    )

is an extension of the existing syntax.

Support for ARRAY JOIN expressions

If a projection contains ARRAY JOIN or arrayJoin, it will be applicable to queries containing exactly the same array join.
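
A minimal sketch of what this could look like, assuming a hypothetical table `events_tagged` with an array column `tags` (neither appears in the proposal above). Using arrayJoin inside a projection is the proposed feature, so the comments describe intended behavior only.

```sql
CREATE TABLE events_tagged
(
    time DateTime,
    tags Array(String),

    -- Proposed: a projection whose SELECT contains arrayJoin.
    PROJECTION by_tag
    (
        SELECT arrayJoin(tags) AS tag, count()
        GROUP BY tag
    )
)
ENGINE = MergeTree ORDER BY time;

-- Exactly the same array join as in the projection: expected to be eligible.
SELECT arrayJoin(tags) AS tag, count()
FROM events_tagged
GROUP BY tag;

-- A different array join expression: not expected to be eligible.
SELECT arrayJoin(arrayDistinct(tags)) AS tag, count()
FROM events_tagged
GROUP BY tag;
```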
