
Primary key analysis for _part_offset#58224

Merged
alexey-milovidov merged 3 commits into ClickHouse:master from amosbird:part_offset_pk
Dec 26, 2023

Conversation

@amosbird
Collaborator

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

The primary key analysis in MergeTree tables will now be applied to predicates that include the virtual column _part_offset (optionally with _part). This feature can serve as a poor man's secondary index.
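As a sketch of what this enables (the table and values here are illustrative, not taken from the PR), a predicate on the virtual column can now be pruned by primary key analysis instead of forcing a full read:

```sql
-- Hypothetical table for illustration.
CREATE TABLE events (id UInt64, payload String)
ENGINE = MergeTree ORDER BY id;

-- With this change, primary key analysis can prune granules using the
-- _part_offset predicate (optionally combined with _part) rather than
-- scanning every row of the part:
SELECT count() FROM events WHERE _part_offset BETWEEN 100000 AND 100999;
```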

@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-improvement Pull request with some product improvements label Dec 26, 2023
@robot-ch-test-poll2
Contributor

robot-ch-test-poll2 commented Dec 26, 2023

This is an automated comment for commit 66660ee with a description of the existing statuses. It is updated for the latest CI run.


Successful checks

| Check name | Description | Status |
|---|---|---|
| AST fuzzer | Runs randomly generated queries to catch program errors. The build type is optionally given in parentheses. If it fails, ask a maintainer for help | ✅ success |
| ClickBench | Runs [ClickBench](https://github.com/ClickHouse/ClickBench/) with an instant-attach table | ✅ success |
| ClickHouse build check | Builds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often have enough information to fix the error, but you might have to reproduce the failure locally. The cmake options can be found in the build log by grepping for cmake. Use these options and follow the general build process | ✅ success |
| Compatibility check | Checks that the clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help | ✅ success |
| Docker image for servers | The check to build and optionally push the mentioned image to Docker Hub | ✅ success |
| Docs check | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success |
| Fast test | Normally this is the first check that is run for a PR. It builds ClickHouse and runs most of the stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here | ✅ success |
| Flaky tests | Checks if newly added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer and additional randomization of thread scheduling. Integration tests are run up to 10 times. If a new test failed at least once, or ran too long, this check will be red. We don't allow flaky tests; read the doc | ✅ success |
| Install packages | Checks that the built packages are installable in a clean environment | ✅ success |
| Integration tests | The integration tests report. In parentheses the package type is given, and in square brackets are the optional part/total tests | ✅ success |
| Mergeable Check | Checks if all other necessary checks are successful | ✅ success |
| Performance Comparison | Measures changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests | ✅ success |
| SQLTest | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success |
| SQLancer | Fuzzing tests that detect logical bugs with the SQLancer tool | ✅ success |
| Sqllogic | Runs ClickHouse on the sqllogic test set against sqlite and checks that all statements pass | ✅ success |
| Stateful tests | Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc. | ✅ success |
| Stateless tests | Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc. | ✅ success |
| Stress test | Runs stateless functional tests concurrently from several clients to detect concurrency-related errors | ✅ success |
| Style Check | Runs a set of checks to keep the code style clean. If some tests fail, see the related log from the report | ✅ success |
| Unit tests | Runs the unit tests for different release types | ✅ success |

| Check name | Description | Status |
|---|---|---|
| CI running | A meta-check that indicates the running CI. Normally it's in a success or pending state. A failed status indicates some problem with the PR | ⏳ pending |
| Upgrade check | Runs stress tests on the server version from the last release and then tries to upgrade it to the version from the PR. It checks whether the new server can start up successfully without any errors, crashes, or sanitizer asserts | ❌ failure |

@alexey-milovidov alexey-milovidov self-assigned this Dec 26, 2023
@alexey-milovidov
Member

Nice.

@EmeraldShift
Contributor

Is it possible to use this feature directly as a secondary index, with or without projections? I don't see any documentation for these virtual columns, but it sounds like they might be useful for making some queries faster.

@azat
Member

azat commented Feb 14, 2025

Is it possible to use this feature directly as a secondary index

You don't need any secondary index for this; it works out of the box.

I don't see any documentation for these virtual columns, but it sounds like they might be useful for making some queries faster.

Yes, if you know the exact offset in the part you can optimize the query. But ClickHouse already applies such optimizations internally based on primary and secondary indexes, so you don't need to do this explicitly.
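As a sketch of such an explicit point lookup (the table and part name below are made up for illustration), the changelog's "predicates that include _part_offset, optionally with _part" covers queries like:

```sql
-- Hypothetical table and part name; fetch one row by its position in a part.
SELECT *
FROM events
WHERE _part = 'all_1_1_0' AND _part_offset = 12345;
```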

@EmeraldShift
Contributor

if you know the exact offset in the part you can optimize the query. But ClickHouse already applies such optimizations internally based on primary and secondary indexes, so you don't need to do this explicitly

Can it be made possible to store _part_offset in a projection, too? I'm finding that secondary index analysis is too slow for a really big table (>100B rows) with a bloom filter index on a very high cardinality, very sparse column (trace_id). I imagine it would be asymptotically much faster if there were a projection like

```sql
SELECT trace_id, _part_offset ORDER BY trace_id
```

which could be recalculated on merges and would enable transforming a query like

```sql
SELECT * FROM table WHERE trace_id = 'abc'
```

into

```sql
SELECT * FROM table
WHERE trace_id = 'abc'
  AND indexHint((_part, _part_offset) IN
      (SELECT (_part, _part_offset) FROM projection WHERE trace_id = 'abc'))
```

(using the same table snapshot for the inner select)

Is such a thing doable? I think there are a few things that need to work:

  • Referring to _part_offset in a projection (I currently get a "Missing columns" error)
  • The projection must recompute the offset column on each merge, because the offsets change
  • Either
    • The above query can be written using FROM table everywhere, and it will use the projection for the inner select automatically because the projection reads the fewest rows, or
    • The query analyzer can rewrite the simpler version of the query to the expanded version
  • The same table snapshot is used for the inner select query, so a race with part merging won't make the results inconsistent
    • This might be much easier if the query analyzer does the expanding itself
Am I making any sense? My hope is that, rather than doing an O(n) scan through secondary index marks (even if the constant factor is low due to a high GRANULARITY), we can do an O(log n) binary search in the projection for the k matching part offsets, then an O(k log n) binary search for those offsets in the main table. In my testing with the primary key instead of _part_offset, this is MUCH faster in practice. And the advantage of this proposal is that it doesn't depend on your main table's primary key.
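The proposed projection above might be declared roughly like this. This is only a sketch: the table and projection names are hypothetical, and per the "Missing columns" error mentioned in the thread, referencing _part_offset inside a projection does not actually work yet.

```sql
-- Hypothetical syntax: a projection sorted by trace_id that also stores the
-- row's position, so point lookups can binary-search the projection and then
-- seek directly into the main table by (_part, _part_offset).
ALTER TABLE events ADD PROJECTION trace_lookup
(
    SELECT trace_id, _part_offset
    ORDER BY trace_id
);
```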

@amosbird
Collaborator Author

@EmeraldShift Oh this is a well-timed FR. I'm currently finalizing plans to contribute this feature in the coming days.

@EmeraldShift
Contributor

Wonderful! I look forward to reading your plans, and I'm happy to experiment with it and see how it fits in with my use case!
