Skip to content

Primary key index --> deterministic function over fields should be used for equality/IN predicates #82161

@nmiculinic

Description

@nmiculinic

Company or project name

Observability --> logging use-case.

Describe the situation

equality match for fields should use primary key index even if primary key index entry has been derived via deterministic function.

That is:

CREATE TABLE test (
    field_1 type_1
    ...
)
    ENGINE = MergeTree()
    ORDER BY ..., f(field_1, ...), ...

For queries like:

WHERE field_1 = <>

it should use primary key index, under assumption f is deterministic function.

Right now this is needed:

WHERE field_1 = <> AND f(field_1, ...) 

to use primary key index

Which ClickHouse versions are affected?

25.4.3.22

How to reproduce

DROP TABLE IF EXISTS test;
CREATE TABLE test (
    pod String
)
    ENGINE = MergeTree()
    ORDER BY left(pod, length(pod) - length(substringIndex(pod, '-', -1)) - 1);


INSERT INTO test
SELECT
    arrayElement(
            ['vector-abc-001', 'vector-abc-002', 'metrics-def-003', 'metrics-def-004', 'logs-ghi-005', 'logs-ghi-006', 'traces-jkl-007', 'traces-jkl-008', 'events-mno-009', 'events-mno-010'],
            (number % 10) + 1
    )
FROM numbers(100000);

SELECT
    pod,
   left(pod, length(pod) - length(substringIndex(pod, '-', -1)) - 1)
FROM test
GROUP BY 1
LIMIT 10;


-- primary key index is NOT used
EXPLAIN indexes=1
SELECT count()
FROM test
WHERE pod = 'vector-abc-001';

-- primary key not used
EXPLAIN indexes=1
SELECT count()
FROM test
WHERE pod BETWEEN 'vector-abc-001' AND 'vector-abd';

-- primary key index finally used

EXPLAIN indexes=1
SELECT count()
FROM test
WHERE pod = 'vector-abc-001'
    AND left(pod, length(pod) - length(substringIndex(pod, '-', -1)) - 1) = 'vector-abc';

Expected performance

Primary key used for equality/range optimisations where primary key has derived field in index

Additional context

No response

Metadata

Metadata

Assignees

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions