Slow queries when reading in order

**Describe the situation**
We have a certain data-set with user logs, trying to get all data by certain tags spanning several hours.
Queries with ordering by part of ordering keys work slower than queries which don't use ordering by order-key

**How to reproduce**
Tried on ClickHouse 22.3 initially, same behavior on `22.7.4.16` too

CREATE statement:
```
CREATE TABLE logs_time_dt
(
    `time` DateTime64(9) Codec(Delta, ZSTD(7)),
    `project` LowCardinality(String) CODEC(ZSTD(7)),
    `service` LowCardinality(String) CODEC(ZSTD(7)),
    `message` String CODEC(ZSTD(7)),
    `tags_hash` Array(UInt64) CODEC(ZSTD(7)),
    INDEX idx_message message TYPE ngrambf_v1(3, 512, 2, 0) GRANULARITY 3,
    INDEX idx_tags_hash tags_hash TYPE bloom_filter(0.01) GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY toStartOfHour(time)
ORDER BY (project, service, time)
SETTINGS index_granularity = 1024
```

example dataset, several hours worth of data, the result set by particular tag is 20k rows:
```
insert into logs_time_dt
(time, project, service, message, tags_hash)
select
fromUnixTimestamp64Nano(toInt64(toUnixTimestamp64Nano(toDateTime64('2022-08-01',9))+number/(2777)*1e9)),
'test' as project,
'test' as service,
'foo',
[ number % 3000 ]
from system.numbers
limit 60*1e6
```

slow query reading by key in reverse order:
```
SELECT *
FROM logs_time_dt
WHERE (project = 'test') AND (service = 'test') AND has(tags_hash, 42)
ORDER BY time DESC
FORMAT `Null`

Query id: f3277eec-3a99-4442-849d-1209ad402548

Ok.

0 rows in set. Elapsed: 8.770 sec. Processed 29.27 million rows, 1.11 GB (3.34 million rows/s., 126.91 MB/s.)
```

reading in order seems to be faster in this particular example, but still slow:
```
SELECT *
FROM logs_time_dt
WHERE (project = 'test') AND (service = 'test') AND has(tags_hash, 42)
ORDER BY time ASC
FORMAT `Null`

Query id: a3ce44be-5ea2-4ffc-9c72-7f4fc656a5e1

Ok.

0 rows in set. Elapsed: 3.788 sec. Processed 29.27 million rows, 1.11 GB (7.73 million rows/s., 293.59 MB/s.)
```

sorting by negative timestamp works a lot faster with this case:
```
SELECT *
FROM logs_time_dt
WHERE (project = 'test') AND (service = 'test') AND has(tags_hash, 42)
ORDER BY -toUnixTimestamp64Nano(time) ASC
FORMAT `Null`

Query id: e5415b8d-556c-4ffa-910a-81902b252f55

Ok.

0 rows in set. Elapsed: 0.442 sec. Processed 29.27 million rows, 1.11 GB (66.28 million rows/s., 2.52 GB/s.)
```



**Expected performance**
I expect the "correct" query with ordering by primary key suffix wouldn't be slower than ordering by a computed expression.




Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slow queries when reading in order #40583

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Slow queries when reading in order #40583

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions