WIP: Optimize PK lookup for queries that match exact PK range #12277
CurtizJ merged 2 commits into ClickHouse:master
Conversation
CurtizJ
left a comment
Looks ok.
BTW, is it possible to support the case when a continuous prefix of the primary key is in the WHERE clause, not only when the whole primary key is there?
Also 2 tests are failing.
My first version was to allow a prefix of a PK to engage this. E.g. if the PK is `(service, timestamp)`, then `service = 'foo'` is optimized.
But the following isn't: `timestamp >= now() - 3600`.
Is that what you mean? If so, I can totally make it happen.
Yes, I meant that.
I've added support for PK prefix matches and reorganized the code a bit to increase reuse. Tests should be happy now; the problem was twofold: …
I've also added some trace logging with the number of matching steps performed, which I intend to add tests for. Perhaps it can be integrated with … Let me know what you think.
Existing code that looks up marks matching the query has a pathological
case when most of the part does in fact match the query.
The code works by recursively splitting a part into ranges and then discarding
the ranges that definitely do not match the query, based on primary key.
The problem is that it requires visiting every mark that matches the query,
making the complexity of this sort of lookup O(n).
For queries that match an exact range on the primary key, we can find
both the left and right boundaries of that range with O(log n) complexity,
using two binary searches over the marks.
This change implements exactly that.
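The boundary search can be sketched as follows. This is a deliberately simplified model, not ClickHouse's actual code: each mark is represented here by the first primary-key value of its granule, the query is assumed to reduce to one contiguous key interval `[lo, hi]`, and `matching_mark_range` is an illustrative name.

```python
import bisect

def matching_mark_range(mark_keys, lo, hi):
    """Return the half-open range [left, right) of marks that may contain
    matching rows, using two binary searches: O(log n) instead of O(n)."""
    left = max(bisect.bisect_right(mark_keys, lo) - 1, 0)  # granule covering lo
    right = bisect.bisect_right(mark_keys, hi)             # first granule past hi
    return left, right

marks = [0, 100, 200, 300, 400]              # start key of each granule
print(matching_mark_range(marks, 150, 250))  # (1, 3): granules [100..200) and [200..300)
```

The generic exclusion search would instead recurse into every candidate range, touching each matching mark once; here the number of key comparisons depends only logarithmically on the number of marks.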
To engage this optimization, the query must:
* Constrain a contiguous prefix of the primary key columns.
* Have only range or single set element constraints for columns.
* Have only AND as a boolean operator.
Consider a table with `(service, timestamp)` as the primary key.
The following conditions will be optimized:
* `service = 'foo'`
* `service = 'foo' and timestamp >= now() - 3600`
* `service in ('foo')`
* `service in ('foo') and timestamp >= now() - 3600 and timestamp <= now()`
The following will fall back to previous lookup algorithm:
* `timestamp >= now() - 3600`
* `service in ('foo', 'bar') and timestamp >= now() - 3600`
* `service = 'foo' or timestamp >= now() - 3600`
Note that the optimization won't engage when the PK has a range expression
followed by a point expression, since in that case the matching range is not continuous.
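To see why, consider rows laid out in `(service, timestamp)` order. A small sketch with invented data:

```python
# Rows in primary-key order (service, timestamp); data is illustrative.
rows = [
    ('a', 1), ('a', 2), ('a', 3),
    ('b', 1), ('b', 2),
    ('c', 1), ('c', 2), ('c', 3),
]

# Point followed by range: service = 'b' AND timestamp >= 2.
point_then_range = [i for i, (s, t) in enumerate(rows) if s == 'b' and t >= 2]
print(point_then_range)   # [4] -- one contiguous slice of the part

# Range followed by point: service >= 'b' AND timestamp = 1.
range_then_point = [i for i, (s, t) in enumerate(rows) if s >= 'b' and t == 1]
print(range_then_point)   # [3, 5] -- scattered, no single [left, right) exists
```

A point before a range keeps all matches in one contiguous slice, so two boundary searches suffice; a range before a point leaves matches scattered across the part, which is exactly the case the generic exclusion search still has to handle.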
Trace query logging provides the following types of messages,
each representing a different kind of PK usage for a part:
```
Used optimized inclusion search over index for part 20200711_5710108_5710108_0 with 9 steps
Used generic exclusion search over index for part 20200711_5710118_5710228_5 with 1495 steps
Not using index on part 20200710_5710473_5710473_0
```
The number of steps translates to computational complexity.
Here's a before/after comparison for a query over 24h of data:
```
Read 4562944 rows, 148.05 MiB in 45.19249672 sec., 100966 rows/sec., 3.28 MiB/sec.
Read 4183040 rows, 135.78 MiB in 0.196279627 sec., 21311636 rows/sec., 691.75 MiB/sec.
```
This is especially useful for queries that read data in order
and terminate early to return "last X things" matching a query.
See ClickHouse#11564 for more thoughts on this.
Conditions that are outside of the PK are marked as `unknown` in `KeyCondition`, so it's safe to allow them as long as they are always combined with `AND`.
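A minimal sketch of why `AND` makes that safe (an illustrative three-valued model, not the actual `KeyCondition` code): under `AND`, an `unknown` atom can simply be relaxed to "may match" and the remaining PK atoms still bound the range; under `OR`, an `unknown` atom could match anywhere, so no range could be derived at all.

```python
# Three-valued logic for per-range evaluation: True / False / None ('unknown').
# An atom on a column outside the PK is 'unknown' for every range.
def and3(a, b):
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True

def range_may_match(atoms):
    """AND all per-range atom verdicts, then relax 'unknown' to 'may match'.
    This over-approximates: extra ranges get read and filtered row-by-row
    later, but a range holding matching rows is never discarded."""
    verdict = True
    for a in atoms:
        verdict = and3(verdict, a)
    return verdict is not False

print(range_may_match([True, None]))   # True  -- kept, filtered later
print(range_may_match([False, None]))  # False -- a PK atom excludes the range
```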
This runs the PK lookup and skipping index stages on parts in parallel, as described in ClickHouse#11564. While ClickHouse#12277 sped up PK lookups, the skipping index stage may still be a bottleneck in a select query. Here we parallelize both stages between parts. On a query that uses a bloom filter skipping index to pick 2,688 rows out of 8,273,114,994 on a two-day time span, this change reduces latency from 10.5s to 1.5s.
Hi @bobrik, sorry for reopening this PR after a long time. This is a nice optimisation; however, I'm curious why it doesn't apply when the key condition has a monotonic function chain?
I don't remember what wasn't working with monotonic function chains. You are welcome to implement it.
Changelog category:
Changelog entry:
Optimize PK lookup for queries that match exact PK range.