
Use buffering while reading in order in queries with WHERE #64607

Merged
CurtizJ merged 17 commits into ClickHouse:master from CurtizJ:buffer-read-in-order
Jul 9, 2024

Conversation

@CurtizJ
Member

@CurtizJ CurtizJ commented May 29, 2024

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Optimized queries with `ORDER BY` primary key and a `WHERE` condition with high selectivity by using buffering. It is controlled by the setting `read_in_order_use_buffering` (enabled by default) and may increase the memory usage of the query.

Resolves #40583
Resolves #40675
Resolves #11482
Resolves #17364

CI Settings (Only check the boxes if you know what you are doing):

  • Allow: All Required Checks
  • Allow: Stateless tests
  • Allow: Stateful tests
  • Allow: Integration Tests
  • Allow: Performance tests
  • Allow: All Builds
  • Allow: batch 1, 2 for multi-batch jobs
  • Allow: batch 3, 4, 5, 6 for multi-batch jobs

  • Exclude: Style check
  • Exclude: Fast test
  • Exclude: All with ASAN
  • Exclude: All with TSAN, MSAN, UBSAN, Coverage
  • Exclude: All with aarch64, release, debug

  • Do not test
  • Woolen Wolfdog
  • Upload binaries for special builds
  • Disable merge-commit
  • Disable CI cache

@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-performance Pull request with some performance improvements label May 29, 2024
@robot-ch-test-poll3
Contributor

robot-ch-test-poll3 commented May 29, 2024

This is an automated comment for commit a072cd2 with a description of existing statuses. It's updated for the latest CI run.

❌ Click here to open a full report in a separate page

| Check name | Description | Status |
|---|---|---|
| Integration tests | The integration tests report. In parentheses the package type is given, and in square brackets are the optional part/total tests | ❌ failure |
| Performance Comparison | Measures changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests | ❌ failure |
| Stateless tests | Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc. | ❌ failure |

Successful checks

| Check name | Description | Status |
|---|---|---|
| AST fuzzer | Runs randomly generated queries to catch program errors. The build type is optionally given in parentheses. If it fails, ask a maintainer for help | ✅ success |
| Builds | There's no description for the check yet; please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success |
| ClickBench | Runs [ClickBench](https://github.com/ClickHouse/ClickBench/) with an instant-attach table | ✅ success |
| Compatibility check | Checks that the clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help | ✅ success |
| Docker keeper image | The check to build and optionally push the mentioned image to Docker Hub | ✅ success |
| Docker server image | The check to build and optionally push the mentioned image to Docker Hub | ✅ success |
| Docs check | Builds and tests the documentation | ✅ success |
| Fast test | Normally this is the first check that is run for a PR. It builds ClickHouse and runs most of the stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests failed, then reproduce the failure locally as described here | ✅ success |
| Flaky tests | Checks whether newly added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer and additional randomization of thread scheduling. Integration tests are run up to 10 times. If a new test fails at least once, or runs too long, this check will be red. We don't allow flaky tests; read the doc | ✅ success |
| Install packages | Checks that the built packages are installable in a clean environment | ✅ success |
| Stateful tests | Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc. | ✅ success |
| Stress test | Runs stateless functional tests concurrently from several clients to detect concurrency-related errors | ✅ success |
| Style check | Runs a set of checks to keep the code style clean. If some of the tests failed, see the related log from the report | ✅ success |
| Unit tests | Runs the unit tests for different release types | ✅ success |
| Upgrade check | Runs stress tests on a server of the last released version and then tries to upgrade it to the version from the PR. It checks whether the new server can start up successfully without errors, crashes, or sanitizer asserts | ✅ success |

@nickitat nickitat self-assigned this May 30, 2024
@CurtizJ
Member Author

CurtizJ commented May 30, 2024

An example from #40583:

CREATE TABLE logs_time_dt
(
    `time` DateTime64(9) CODEC(Delta, ZSTD(7)),
    `project` LowCardinality(String) CODEC(ZSTD(7)),
    `service` LowCardinality(String) CODEC(ZSTD(7)),
    `message` String CODEC(ZSTD(7)),
    `tags_hash` Array(UInt64) CODEC(ZSTD(7)),
    INDEX idx_message message TYPE ngrambf_v1(3, 512, 2, 0) GRANULARITY 3,
    INDEX idx_tags_hash tags_hash TYPE bloom_filter(0.01) GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY toStartOfHour(time)
ORDER BY (project, service, time)
SETTINGS index_granularity = 1024;

insert into logs_time_dt
(time, project, service, message, tags_hash)
select
fromUnixTimestamp64Nano(toInt64(toUnixTimestamp64Nano(toDateTime64('2022-08-01',9))+number/(2777)*1e9)),
'test' as project,
'test' as service,
'foo',
[ number % 3000 ]
from system.numbers
limit 60*1e6;

Before:

SELECT *
FROM logs_time_dt
WHERE (project = 'test') AND (service = 'test') AND has(tags_hash, 42)
ORDER BY time ASC
FORMAT `Null`

0 rows in set. Elapsed: 1.636 sec. Processed 29.26 million rows, 696.98 MB (17.88 million rows/s., 425.92 MB/s.)
Peak memory usage: 32.66 MiB.

After:

SELECT *
FROM logs_time_dt
WHERE (project = 'test') AND (service = 'test') AND has(tags_hash, 42)
ORDER BY time ASC
FORMAT `Null`;

0 rows in set. Elapsed: 0.205 sec. Processed 29.26 million rows, 696.98 MB (142.46 million rows/s., 3.39 GB/s.)
Peak memory usage: 30.39 MiB.
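Roughly, the win comes from decoupling the reading threads from the single-threaded in-order merge: with a highly selective `WHERE`, each reading stream emits many small filtered chunks, and buffering them lets readers run ahead instead of stalling on the merge. A minimal sketch of such a bounded buffer (a hypothetical simplification — names and structure are made up, not the actual `BufferChunksTransform` code):

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical sketch: a bounded chunk buffer that always accepts at
// least one chunk, so bursts of small filtered chunks can be smoothed
// out before the single-threaded in-order merge consumes them.
struct Chunk { size_t rows = 0; size_t bytes = 0; };

class ChunkBuffer
{
public:
    explicit ChunkBuffer(size_t max_bytes_) : max_bytes(max_bytes_) {}

    // Accept a chunk if the buffer is empty or still under the byte limit.
    // As in the PR's transform, the limit may be exceeded by one chunk,
    // which guarantees forward progress even for oversized chunks.
    bool tryPush(Chunk chunk)
    {
        if (!chunks.empty() && current_bytes >= max_bytes)
            return false;
        current_bytes += chunk.bytes;
        chunks.push(chunk);
        return true;
    }

    bool empty() const { return chunks.empty(); }

    Chunk pop()
    {
        Chunk chunk = chunks.front();
        chunks.pop();
        current_bytes -= chunk.bytes;
        return chunk;
    }

private:
    std::queue<Chunk> chunks;
    size_t max_bytes;
    size_t current_bytes = 0;
};
```

The "at least one chunk" rule matters: without it, a single chunk larger than the budget would deadlock the pipeline.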

@CurtizJ CurtizJ force-pushed the buffer-read-in-order branch from 48d0bb3 to 6a8bd46 Compare May 31, 2024 15:54
@CurtizJ CurtizJ marked this pull request as ready for review June 7, 2024 14:42
@CurtizJ CurtizJ requested review from nickitat June 12, 2024 14:42
@CurtizJ
Member Author

CurtizJ commented Jun 12, 2024

Perf tests:

[Screenshots of perf test results: 2024-06-12 at 16:43:19, 16:43:28 and 16:45:19]

class BufferChunksTransform : public IProcessor
{
public:
BufferChunksTransform(const Block & header_, size_t max_bytes_to_buffer_, size_t limit_);
Member


It seems to me that if this optimisation is based only on the number of bytes, we will miss a lot of cases where rows are wider or narrower than average. I think we could always buffer up to a full block, for example, and that would be a good lower limit.
WDYT?

Member Author


Yes, that probably makes sense. However, in the current implementation we always buffer at least one chunk, because the threshold is allowed to be exceeded.

Member


I meant 65K rows regardless of their size in bytes, because AFAIU we assume individual chunks are much smaller in terms of the number of rows.
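The suggested lower bound could be sketched like this (hypothetical code with made-up names, not what the PR actually implements): keep buffering while the buffered row count is below one full block, and only apply the byte budget beyond that.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of the reviewer's suggestion: always allow
// buffering up to one full block of rows (e.g. 65536), even if those
// rows are wide and the byte budget is already exceeded; otherwise
// fall back to the byte budget.
constexpr size_t min_rows_to_buffer = 65536; // one full block, per the review

bool shouldBuffer(size_t buffered_rows, size_t buffered_bytes, size_t max_bytes)
{
    if (buffered_rows < min_rows_to_buffer)
        return true; // lower limit: at least one block of rows
    return buffered_bytes < max_bytes;
}
```

This way narrow rows are buffered by bytes and wide rows still get a useful minimum amount of buffering.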

@CurtizJ CurtizJ force-pushed the buffer-read-in-order branch from 872a3f5 to c8be63a Compare June 17, 2024 19:38
Comment on lines +48 to +52
else if (input.isFinished())
{
output.finish();
return Status::Finished;
}
Member


I'm not sure whether any problem could actually happen with the current implementation, but just for peace of mind I'd move this `if` outside of the `if (output.canPush())` condition, right at l.28.

Member Author


Ok, probably makes sense.
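The suggested restructuring might look roughly like this (a simplified model of a processor's `prepare()`, not the actual ClickHouse `IProcessor` code — ports are modeled as plain structs and chunks as ints): the input-finished check runs before the `output.canPush()` branch, so the transform finishes as soon as upstream is exhausted and the buffer is drained, regardless of whether the output port can accept data right now.

```cpp
#include <cassert>
#include <queue>

// Hypothetical sketch of the discussed prepare() restructuring.
enum class Status { Ready, NeedData, PortFull, Finished };

struct PortState
{
    bool finished = false;
    bool can_push = true;
    bool has_data = false;
};

Status prepare(PortState & input, PortState & output, std::queue<int> & buffer)
{
    if (output.finished)
        return Status::Finished;

    // Hoisted check: finish as soon as the input is exhausted and the
    // buffer is drained, without depending on output.can_push.
    if (input.finished && buffer.empty())
    {
        output.finished = true;
        return Status::Finished;
    }

    if (output.can_push)
    {
        if (!buffer.empty())
        {
            buffer.pop(); // push the buffered chunk downstream (elided)
            return Status::Ready;
        }
        if (input.has_data)
            return Status::Ready;
        return Status::NeedData;
    }

    return Status::PortFull;
}
```

With the check nested under `output.canPush()`, a finished input with an empty buffer and a temporarily full output would not be reported as `Finished` on that call; hoisting it removes that dependency.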

@CurtizJ CurtizJ force-pushed the buffer-read-in-order branch from f308b1b to 9c071fc Compare July 8, 2024 19:28
@CurtizJ
Member Author

CurtizJ commented Jul 8, 2024

00504_mergetree_arrays_rw.sql: #66248

@CurtizJ
Member Author

CurtizJ commented Jul 9, 2024

Perf tests: replaceRegexp_fallback - #66185 (comment)
test_zookeeper_config_load_balancing is unrelated.

@CurtizJ CurtizJ added this pull request to the merge queue Jul 9, 2024
Merged via the queue into ClickHouse:master with commit e29635d Jul 9, 2024
@CurtizJ CurtizJ deleted the buffer-read-in-order branch July 9, 2024 11:01
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jul 9, 2024
{

/// Transform that buffers chunks from the input
/// up to the certain limit and pushes chunks to
Member


Double whitespace.

@FrankChen021
Contributor

Linking this performance regression, #66578, for anyone who cares about this improvement.
