-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Performance becomes worse for queries with LIMIT clause after #64607 #66578
Copy link
Copy link
Open
Labels
Description
For the same example
CREATE TABLE logs_time_dt
(
`time` DateTime64(9) Codec(Delta, ZSTD(7)),
`project` LowCardinality(String) CODEC(ZSTD(7)),
`service` LowCardinality(String) CODEC(ZSTD(7)),
`message` String CODEC(ZSTD(7)),
`tags_hash` Array(UInt64) CODEC(ZSTD(7)),
INDEX idx_message message TYPE ngrambf_v1(3, 512, 2, 0) GRANULARITY 3,
INDEX idx_tags_hash tags_hash TYPE bloom_filter(0.01) GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY toStartOfHour(time)
ORDER BY (project, service, time)
SETTINGS index_granularity = 1024;
insert into logs_time_dt
(time, project, service, message, tags_hash)
select
fromUnixTimestamp64Nano(toInt64(toUnixTimestamp64Nano(toDateTime64('2022-08-01',9))+number/(2777)*1e9)),
'test' as project,
'test' as service,
'foo',
[ number % 3000 ]
from system.numbers
limit 60*1e6;If the LIMIT claus is used in the query, the performance is worse than before.
set read_in_order_use_buffering=0;
SET read_in_order_use_buffering = 0
Query id: 417916b3-ce57-4c75-871a-cc8b0b583812
Ok.
0 rows in set. Elapsed: 0.001 sec.
ip-10-169 :) SELECT *
FROM logs_time_dt
WHERE (project = 'test') AND (service = 'test') AND has(tags_hash, 42)
ORDER BY time desc
limit 100 FORMAT `Null`; //<-----------------LIMIT clause is added
SELECT *
FROM logs_time_dt
WHERE (project = 'test') AND (service = 'test') AND has(tags_hash, 42)
ORDER BY time DESC
LIMIT 100
FORMAT `Null`
Query id: 7e7c48b5-7190-4346-922b-30e5bba179c8
Ok.
0 rows in set. Elapsed: 0.087 sec. Processed 306.23 thousand rows, 7.53 MB (3.51 million rows/s., 86.24 MB/s.)
Peak memory usage: 24.93 MiB.And when the optimization is turned on(which is by default now)
set read_in_order_use_buffering=1;
SET read_in_order_use_buffering = 1
Query id: 7a50f132-6198-4098-a2e1-9afc39e6d3d1
Ok.
0 rows in set. Elapsed: 0.001 sec.
ip-10 :) SELECT *
FROM logs_time_dt
WHERE (project = 'test') AND (service = 'test') AND has(tags_hash, 42)
ORDER BY time desc
limit 100 FORMAT `Null`;
SELECT *
FROM logs_time_dt
WHERE (project = 'test') AND (service = 'test') AND has(tags_hash, 42)
ORDER BY time DESC
LIMIT 100
FORMAT `Null`
Query id: 96ed4b86-2398-4678-b293-8d08910e70b4
Ok.
0 rows in set. Elapsed: 0.154 sec. Processed 5.27 million rows, 135.31 MB (34.24 million rows/s., 878.31 MB/s.)
Peak memory usage: 58.70 MiB.For simplicity:
BEFORE: Elapsed: 0.087 sec. Processed 306.23 thousand rows, 7.53 MB (3.51 million rows/s., 86.24 MB/s.)
Peak memory usage: 24.93 MiB.
AFTER: Elapsed: 0.154 sec. Processed 5.27 million rows, 135.31 MB (34.24 million rows/s., 878.31 MB/s.)
Peak memory usage: 58.70 MiB.
Reactions are currently unavailable