
Additional primary key scan for skip index FINAL queries #70210

Closed
shiyer7474 wants to merge 27 commits into ClickHouse:master from shiyer7474:additional_pk_scan_for_skip_index_final

Conversation

@shiyer7474
Contributor

@shiyer7474 shiyer7474 commented Oct 1, 2024

Background in #34243 and #31411. This patch ensures that correct results are returned by a FINAL query using a skip index, and should significantly improve the performance of FINAL queries. The solution takes the initial set of PK ranges returned by the skip index and then scans those PK ranges across all unmerged parts.

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

A new setting is introduced: use_skip_indexes_in_final_exact_mode. If a query has a FINAL clause, skipping data based on indexes may produce incorrect results. This setting ensures correct results are returned by additionally scanning newer parts that overlap the ranges returned by the skip index. Set to 0 to disable, 1 to enable.
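The failure mode this setting guards against can be sketched with a toy model of ReplacingMergeTree FINAL semantics (illustrative Python, not ClickHouse code; the part contents and values are made up):

```python
# Toy model: FINAL keeps the newest row per primary key across parts.
# Parts are listed oldest-first; each part is a list of (pk, v) rows.

def final_rows(parts):
    latest = {}
    for part in parts:                 # newer parts override older ones
        for pk, v in part:
            latest[pk] = v
    return latest

old_part = [(1, 98889991)]             # original row: v matches the query
new_part = [(1, 5)]                    # same key updated later: v no longer matches

# Correct FINAL answer: key 1 now has v = 5, so "WHERE v = 98889991" matches nothing.
assert [pk for pk, v in final_rows([old_part, new_part]).items() if v == 98889991] == []

# A skip index on v prunes new_part (5 != 98889991) *before* deduplication,
# so only old_part survives and the stale row is wrongly returned.
pruned = [p for p in [old_part, new_part] if any(v == 98889991 for _, v in p)]
assert [pk for pk, v in final_rows(pruned).items() if v == 98889991] == [1]
```

This is exactly the wrong-result pattern shown in the test output further down.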

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/


@shiyer7474
Contributor Author

A couple of points to review in the solution:

  1. Will the solution be generally useful? It depends, but there is considerable headroom, given that the alternative is a full table scan.

  2. What is the additional cost of the 2nd pass? Some CPU is burnt in the primary key range scans, depending on the number of unmerged parts, and additional granules may well be read, depending on the application's update patterns. But the key point is that the solution guarantees correct results for any FINAL query with a skip index.
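The second pass described above can be sketched like this (illustrative Python; `overlaps`, `second_pass`, and the closed PK intervals are assumptions for the sketch, not ClickHouse code):

```python
# Sketch of the "second pass": re-scan newer parts for granules whose
# primary-key ranges overlap any PK range selected by the skip index,
# so rows that supersede matching rows are not skipped.

def overlaps(a, b):
    return a[0] <= b[1] and b[0] <= a[1]   # closed PK intervals [lo, hi]

def second_pass(selected_pk_ranges, newer_part_granules):
    # keep every granule from newer parts that intersects a selected PK range
    return [g for g in newer_part_granules
            if any(overlaps(g, r) for r in selected_pk_ranges)]

selected = [(10, 20), (50, 60)]            # PK ranges from the skip-index pass
newer = [(0, 5), (15, 18), (55, 70), (80, 90)]
assert second_pass(selected, newer) == [(15, 18), (55, 70)]
```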

Test results (wrong) without the patch, using the secondary skip index scan:

SET use_skip_indexes = 1;
SET use_skip_indexes_if_final = 1;

SELECT 'Next 4 queries should return 0 rows and the 5th query should return 1 row'

Query id: b46ddcb1-9ce2-4b7d-9213-185ddf53d3d3

   ┌─'Next 4 queries should return 0 rows and the 5th query should return 1 row'─┐
1. │ Next 4 queries should return 0 rows and the 5th query should return 1 row   │
   └─────────────────────────────────────────────────────────────────────────────┘

1 row in set. Elapsed: 0.003 sec.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 98889991
SETTINGS skip_indexes_in_final_correctness_threshold = 0

Query id: b3fd7adc-1205-4326-a5fc-3c60bfe7396d

   ┌─count(id)─┐
1. │         1 │
   └───────────┘

1 row in set. Elapsed: 0.011 sec. Processed 8.19 thousand rows, 131.07 KB (772.64 thousand rows/s., 12.36 MB/s.)
Peak memory usage: 403.79 KiB.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 88889992
SETTINGS skip_indexes_in_final_correctness_threshold = 0

Query id: b4326272-df2f-4837-b352-3da66950ba40

   ┌─count(id)─┐
1. │         1 │
   └───────────┘

1 row in set. Elapsed: 0.013 sec. Processed 16.38 thousand rows, 262.14 KB (1.25 million rows/s., 20.02 MB/s.)
Peak memory usage: 423.87 KiB.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 78889993
SETTINGS skip_indexes_in_final_correctness_threshold = 0

Query id: fc0257b4-4a41-43ba-a713-7515852e9517

   ┌─count(id)─┐
1. │         1 │
   └───────────┘

1 row in set. Elapsed: 0.018 sec. Processed 24.58 thousand rows, 393.22 KB (1.37 million rows/s., 21.86 MB/s.)
Peak memory usage: 550.71 KiB.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 68889994
SETTINGS skip_indexes_in_final_correctness_threshold = 0

Query id: 8ca18045-7851-4a04-af5b-a2c2a917f6d4

   ┌─count(id)─┐
1. │         1 │
   └───────────┘

1 row in set. Elapsed: 0.019 sec. Processed 32.77 thousand rows, 524.29 KB (1.69 million rows/s., 27.07 MB/s.)
Peak memory usage: 555.95 KiB.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 58889995
SETTINGS skip_indexes_in_final_correctness_threshold = 0

Query id: 3a51d3d2-2344-4014-913c-c32349ae1767

   ┌─count(id)─┐
1. │         1 │
   └───────────┘

1 row in set. Elapsed: 0.017 sec. Processed 40.96 thousand rows, 655.36 KB (2.39 million rows/s., 38.26 MB/s.)
Peak memory usage: 610.93 KiB.

Correct test results with the patch (more rows scanned, but far fewer than a full table scan):

SELECT 'Next 4 queries should return 0 rows and the 5th query should return 1 row'

Query id: 7de4c8a6-055d-4db9-8169-52242351eee2

   ┌─'Next 4 queries should return 0 rows and the 5th query should return 1 row'─┐
1. │ Next 4 queries should return 0 rows and the 5th query should return 1 row   │
   └─────────────────────────────────────────────────────────────────────────────┘

1 row in set. Elapsed: 0.002 sec.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 98889991
SETTINGS skip_indexes_in_final_correctness_threshold = 1

Query id: d7a39d86-697f-48aa-bc40-a55802fea539

   ┌─count(id)─┐
1. │         0 │
   └───────────┘

1 row in set. Elapsed: 0.025 sec. Processed 139.26 thousand rows, 2.23 MB (5.57 million rows/s., 89.12 MB/s.)
Peak memory usage: 1.55 MiB.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 88889992
SETTINGS skip_indexes_in_final_correctness_threshold = 1

Query id: 53a575cc-4640-49c1-8c5e-8ab4f416fd78

   ┌─count(id)─┐
1. │         0 │
   └───────────┘

1 row in set. Elapsed: 0.025 sec. Processed 139.26 thousand rows, 2.23 MB (5.64 million rows/s., 90.16 MB/s.)
Peak memory usage: 1.55 MiB.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 78889993
SETTINGS skip_indexes_in_final_correctness_threshold = 1

Query id: dc6e3a5c-507f-4844-a9df-e3695215e00f

   ┌─count(id)─┐
1. │         0 │
   └───────────┘

1 row in set. Elapsed: 0.027 sec. Processed 139.26 thousand rows, 2.23 MB (5.14 million rows/s., 82.18 MB/s.)
Peak memory usage: 1.71 MiB.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 68889994
SETTINGS skip_indexes_in_final_correctness_threshold = 1

Query id: eada4a69-f385-4031-a460-19f7dea95560

   ┌─count(id)─┐
1. │         0 │
   └───────────┘

1 row in set. Elapsed: 0.029 sec. Processed 139.26 thousand rows, 2.23 MB (4.84 million rows/s., 77.48 MB/s.)
Peak memory usage: 1.70 MiB.


SELECT count(id)
FROM t_final_query_tbl
FINAL
WHERE v = 58889995
SETTINGS skip_indexes_in_final_correctness_threshold = 1

Query id: d11acbc5-acb0-40f0-8079-882dc46946a9

   ┌─count(id)─┐
1. │         1 │
   └───────────┘


- 0 — Disabled.
- 1 — Enabled.
- >= 2 — Limit on the number of granules returned by the skip index, beyond which the feature is disabled.
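A hedged reading of the tri-state semantics above, as a sketch (the function name is made up for illustration; this is not ClickHouse code):

```python
# 0 disables the feature, 1 enables it unconditionally, and any value >= 2
# acts as a cap on skip-index granules beyond which the feature turns off.

def exact_mode_enabled(setting_value, granules_from_skip_index):
    if setting_value == 0:
        return False
    if setting_value == 1:
        return True
    return granules_from_skip_index <= setting_value

assert exact_mode_enabled(0, 10) is False
assert exact_mode_enabled(1, 10_000) is True
assert exact_mode_enabled(1000, 500) is True
assert exact_mode_enabled(1000, 5000) is False
```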
Contributor


Seems a bit unusual - is there any other setting that has multiple special values combined with the plain semantics of a numerical value?

Maybe reconsider this one?

Contributor Author


Sure, good point; I just went for a quick-fix solution.

A knob is needed to turn the fix ON/OFF. It is perfectly safe to set the existing use_skip_indexes_if_final to 1 if it is known that the secondary index column value never mutates after insert (even if other column values are updated). Thus, enabling this new fix by default would cause a performance regression, as more rows could be scanned than at present.

Some additional protection may be required if the first-step secondary index lookup returns hundreds of granules. The KeyCondition for the 2nd step could then contain hundreds of RPN nodes and consume memory/CPU during the repeated searches. Hence, some setting to bail out of the fix is needed if the first step returns too many ranges. An example of the KeyCondition where the primary key has 3 columns and 12 granules are returned in the first pass:

(column 0 in ['303ed4c698', '32b30a250a']), (column 1 in [80, 99]), and, (column 2 in [647080, 809299]), and,
(column 0 in ['5ea1649a31', '5fd0b37cd7']), (column 1 in [64, 83]), and, (column 2 in [537064, 114283]), and, 
or, (column 0 in ['6da37dd313', '6f2268bd1d']), (column 1 in [17, 36]), and, (column 2 in [331617, 753836]), and,
or, (column 0 in ['7143d7fbad', '7380ad8a67']), (column 1 in [56, 75]), and, (column 2 in [781056, 497275]), and,
or, (column 0 in ['8f468c873a', '912d2b1c7b']), (column 1 in [5, 24]), and, (column 2 in [968605, 751824]), and, 
or, (column 0 in ['c15da1f2b5', 'c32d9bf27a']), (column 1 in [47, 66]), and, (column 2 in [788247, 979466]), and, 
or, (column 0 in ['5b8add2a5d', '5ea1649a31']), (column 1 in [44, 64]), and, (column 2 in [317844, 537064]), and, 
or, (column 0 in ['a1d0c6e83f', 'a4a042cf4f']), (column 1 in [78, 97]), and, (column 2 in [42478, 171697]), and, 
or, (column 0 in ['a64c94baaf', 'a87ff679a2']), (column 1 in [16, 36]), and, (column 2 in [898816, 4036]), and, 
or, (column 0 in ['05049e90fa', '0640966322']), (column 1 in [57, 76]), and, (column 2 in [367657, 158876]), and, 
or, (column 0 in ['7b13b22030', '7eabe3a164']), (column 1 in [13, 32]), and, (column 2 in [993613, 206832]), and, 
or, (column 0 in ['f4be00279e', 'f7e9050c92']), (column 1 in [27, 65]), and, (column 2 in [528227, 854665]), and, or

I will rework this setting stuff.

@shiyer7474 shiyer7474 marked this pull request as draft October 14, 2024 06:30
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-performance Pull request with some performance improvements label Oct 14, 2024
@nickitat nickitat added the can be tested Allows running workflows for external contributors label Oct 14, 2024
@robot-clickhouse
Member

robot-clickhouse commented Oct 14, 2024

This is an automated comment for commit 56fc3dc with a description of existing statuses. It is updated for the latest CI run.

❌ Click here to open a full report in a separate page

Failed checks:
- Stateless tests (❌ failure): Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc.
- Stress test (❌ failure): Runs stateless functional tests concurrently from several clients to detect concurrency-related errors.

Successful checks:
- AST fuzzer (✅ success): Runs randomly generated queries to catch program errors. The build type is optionally given in parentheses. If it fails, ask a maintainer for help.
- Builds (✅ success): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
- BuzzHouse (asan) (✅ success): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
- BuzzHouse (debug) (✅ success): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
- BuzzHouse (msan) (✅ success): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
- BuzzHouse (tsan) (✅ success): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
- BuzzHouse (ubsan) (✅ success): There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS.
- ClickBench (✅ success): Runs ClickBench with an instant-attach table.
- Compatibility check (✅ success): Checks that the clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help.
- Docker keeper image (✅ success): The check to build and optionally push the mentioned image to Docker Hub.
- Docker server image (✅ success): The check to build and optionally push the mentioned image to Docker Hub.
- Docs check (✅ success): Builds and tests the documentation.
- Fast test (✅ success): Normally this is the first check that is run for a PR. It builds ClickHouse and runs most of the stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here.
- Flaky tests (✅ success): Checks if newly added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer and additional randomization of thread scheduling. Integration tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests; read the doc.
- Install packages (✅ success): Checks that the built packages are installable in a clean environment.
- Integration tests (✅ success): The integration tests report. In parentheses the package type is given, and in square brackets are the optional part/total tests.
- Performance Comparison (✅ success): Measures changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests.
- Stateful tests (✅ success): Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc.
- Style check (✅ success): Runs a set of checks to keep the code style clean. If some tests fail, see the related log from the report.
- Unit tests (✅ success): Runs the unit tests for different release types.
- Upgrade check (✅ success): Runs stress tests on the server version from the last release and then tries to upgrade it to the version from the PR. It checks if the new server can start up successfully without any errors, crashes, or sanitizer asserts.

@shiyer7474 shiyer7474 marked this pull request as ready for review October 16, 2024 15:30
@nickitat nickitat self-assigned this Oct 16, 2024
first_part_for_next_pass = part_index; /// we only need to check in newer parts
}
static constexpr size_t ranges_threshold = 1000;
if (range_count > ranges_threshold)
Contributor Author


@nickitat Thank you for picking up this PR for review! Let me know your overall feedback and whether you have any queries. Specifically here, do you think this thresholding is required? I added this protection, but now feel that even if there are more ranges, the slight additional cost of the "PK search" 2nd pass should be okay. For tables with billions of rows, this limit of 1000 ranges returned by the secondary index looks really low.

Member


I think the main challenge here is to make it efficient for a wider set of tables and queries. The number of parts could be big (10^3-10^4), and the number of ranges even bigger by a couple of orders of magnitude.
We likely need a threshold to abandon the optimisation at some point. But before that we could:

  1. Try to "compact" PK conditions: e.g. if we have conditions like in [1; 3], [5; 8], we could merge them into a single in [1; 8]. This way we keep the number of conditions limited. It is also possible to decide at some point that we have gone too far, the condition won't filter anything, and give up on the optimisation.
  2. What I was talking about in the issue: we extract non-intersecting parts of the PK and don't apply the merging algorithm to them at runtime.

     SplitPartsWithRangesByPrimaryKeyResult split_ranges_result = splitPartsWithRangesByPrimaryKey(
         storage_snapshot->metadata->getPrimaryKey(),
         sorting_expr,
         std::move(new_parts),
         num_streams,
         context,
         std::move(in_order_reading_step_getter),
         split_parts_ranges_into_intersecting_and_non_intersecting_final,
         settings[Setting::split_intersecting_parts_ranges_into_layers_final]);

     for (auto && non_intersecting_parts_range : split_ranges_result.non_intersecting_parts_ranges)
         non_intersecting_parts_by_primary_key.push_back(std::move(non_intersecting_parts_range));

     for (auto && merging_pipe : split_ranges_result.merging_pipes)
         pipes.push_back(std::move(merging_pipe));

     It seems like we could also apply skipping indices to them as if they were parts from ordinary MergeTree tables.

@shiyer7474
Contributor Author

try to "compact" PK conditions. e.g. if we have conditions like in [1; 3], [5; 8] - we could merge those into a single one - in [1; 8]. this way we could maintain number of conditions limited. also possible to decide at some point that we got too far and it won't filter anything and give up on the optimisation.

My understanding is that compaction/merging of adjacent ranges is already done by the existing code in MergeTreeDataSelectExecutor::filterMarksUsingIndex(). Compaction of "near" ranges is controlled by the merge_tree_min_rows_for_seek engine setting. Examples of the KeyCondition generated by the new code when adjacent ranges are shortlisted by the skip index:

// only 1 range:
select id from x FINAL where v = 1200  settings use_skip_indexes_if_final_exact_mode=1;
2024.10.23 10:23:18.536942 [ 1209207 ] {4584911a-cc9f-4748-bee9-9b98c7a2f883} new KeyCondition (column 0 in [0, 8192])

// 2 ranges:
select id from x FINAL where v = 1200  or v = 9200 settings use_skip_indexes_if_final_exact_mode=1;
2024.10.23 10:23:28.514977 [ 1209207 ] {baf3ac1b-4456-432a-81e8-15b8b7908d7f} new KeyCondition (column 0 in [0, 16384])

// 3 ranges:
select id from x FINAL where v = 1200  or v = 9200 or v = 20100 settings use_skip_indexes_if_final_exact_mode=1;
2024.10.23 10:23:48.000960 [ 1209207 ] {2b4fd48d-d2af-4eb2-9b8c-afeb3c6064d5} new KeyCondition (column 0 in [0, 24576])

what i was talking about in the issue: we extract non-intersecting parts of PK and don't apply merging algorithm to them in runtime...seems like we also could apply skipping indices to them as if they were parts from ordinary MergeTree tables.

we now able to skip final merging for ranges with non-intersecting PK and for them we don't need this special procedure at all.

Hmmm... splitPartsWithRangesByPrimaryKey() is executed after the skip index has been looked up and an incorrect/incomplete set of ranges has been returned by filterMarksUsingIndex(). Thus, at this point, the intersecting and non-intersecting PK ranges are also incorrect/incomplete. Let me try out something here along the lines of splitPartsWithRangesByPrimaryKey(); I will get back in a day. Thanks!

@shiyer7474
Contributor Author

The code in splitPartsWithRangesByPrimaryKey() is pretty complex, but it bears a close resemblance to what this PR is trying to do:

splitPartsWithRangesByPrimaryKey() -> Given a collection of parts + ranges, split it into intersecting-PK and non-intersecting-PK ranges.

This PR -> Given a set of shortlisted parts + ranges (via the skip index), find the intersecting-PK ranges in all rejected ranges/parts.

splitPartsWithRangesByPrimaryKey() has multiple code paths because it has to handle ranges that cover more than one granule. In fact, a range could cover an entire part with hundreds of granules.

This PR's current implementation - a KeyCondition representing the shortlisted ranges (could be hundreds or thousands) with repeated RPN evaluation over the newer parts/ranges - can be replaced with an approach similar to how splitPartsWithRangesByPrimaryKey() is implemented. A quick sketch of what I have implemented:

[skip_index_selected_ranges, skip_index_rejected_ranges] = MergeTreeDataSelectExecutor::filterPartsByPrimaryKeyAndSkipIndexes()

std::sort(skip_index_selected_ranges.begin(), skip_index_selected_ranges.end()); // sort by PK value in range.start
std::sort(skip_index_rejected_ranges.begin(), skip_index_rejected_ranges.end()); // sort by PK value in range.start

run a single pass INTERSECT traversing skip_index_selected_ranges and skip_index_rejected_ranges

@nickitat Let me know your opinion! I am testing the above code implementation.
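The sort + single-pass intersect outlined above can be sketched like this (illustrative Python, not the PR's actual code; it assumes the selected PK ranges are non-overlapping, as they would be after compaction):

```python
# Both range lists are sorted by range start, then the rejected list is
# walked once, advancing a cursor over the selected list, to collect the
# rejected ranges that overlap a selected range and must be re-read.
# Intervals are closed [lo, hi]; selected ranges must not overlap each other.

def intersect_sorted(selected, rejected):
    selected = sorted(selected)
    rejected = sorted(rejected)
    out, i = [], 0
    for lo, hi in rejected:
        # skip selected ranges that end before this rejected range starts
        while i < len(selected) and selected[i][1] < lo:
            i += 1
        if i < len(selected) and selected[i][0] <= hi:
            out.append((lo, hi))            # overlap: must be re-read
    return out

selected = [(50, 60), (10, 20)]
rejected = [(0, 5), (18, 30), (40, 45), (55, 58)]
assert intersect_sorted(selected, rejected) == [(18, 30), (55, 58)]
```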

@shiyer7474
Contributor Author

The failures in the 5 fuzzer suites are the same: the new code checks that the PK columns have a strict weak ordering, but the fuzzer changes the PK column to Nullable. The exception tells the user to rerun the query with use_skip_indexes_if_final_exact_mode=0.

CREATE TABLE t_final_query_tbl__fuzz_24
(
    `id` Nullable(UInt32),
    `v` DateTime,
    INDEX secondaryidx v TYPE minmax GRANULARITY 1
)
ENGINE = ReplacingMergeTree
ORDER BY id

Query id: 79f2bbb1-b381-4923-915c-703a4f970032


Elapsed: 0.050 sec.

Received exception from server (version 24.10.1):
Code: 44. DB::Exception: Received from localhost:9000. DB::Exception: Sorting key contains nullable columns, but merge tree setting `allow_nullable_key` is disabled. (ILLEGAL_COLUMN)

CREATE TABLE t_final_query_tbl__fuzz_24
(
    `id` Nullable(UInt32),
    `v` DateTime,
    INDEX secondaryidx v TYPE minmax GRANULARITY 1
)
ENGINE = ReplacingMergeTree
ORDER BY id
SETTINGS allow_nullable_key = 1

Query id: c679c6a1-5159-4315-b694-936e950f9097

Ok.

0 rows in set. Elapsed: 0.016 sec.

SELECT id
FROM t_final_query_tbl__fuzz_24
FINAL
WHERE v = 5
SETTINGS use_skip_indexes_if_final = 1, use_skip_indexes_if_final_exact_mode = 1

Query id: 5bacf28b-449b-41d0-a3d2-a95eef4e1831


Elapsed: 0.054 sec.

Received exception from server (version 24.10.1):
Code: 49. DB::Exception: Received from localhost:9000. DB::Exception: Primary key cannot be used in this query for expanding skip index results. Please reexecute query with setting use_skip_indexes_if_final = 0 & use_skip_indexes_if_final_exact_mode = 0.. (LOGICAL_ERROR)

@shiyer7474
Contributor Author

@nickitat Please review the latest upload! I have reworked the implementation to use a simple sort + intersect to find the expanded set of PK ranges. As a data point, the earlier KeyCondition-based implementation took around 35 seconds when the KeyCondition had 10000+ PK granules to compare against about 65000+ granules (PK with 3 columns). The same case takes sub-second in the new implementation. I have explained the test failure above and am trying to figure out the best fix for it (we could just take a union of the selected & rejected ranges if the PK does not have a strict weak ordering).

@shiyer7474
Contributor Author

Another alternative that I analyzed:

Step 1: Do not use skip indexes in the first pass.
Step 2: Fall into the existing splitPartsWithRangesByPrimaryKey() to split the entire granule set (all parts/all granules) into intersecting & non-intersecting ranges. (The implementation uses sort-and-traverse logic.)
Step 3: New 2nd pass - use the skip index only on the non-intersecting ranges to shortlist matching ranges and discard the rest. I don't think the skip index can be used to further trim the intersecting ranges (it would amount to finding intersecting ranges among already-intersecting ranges).

The above is good if the proportion of non-intersecting ranges is high, but all the intersecting ranges will have to be read.

Comparing the above with what is implemented:
Step 1: Use the skip index to shortlist ranges from all parts/ranges.
Step 2: Perform an intersection of the shortlisted ranges with ranges from newer parts and collect the final set of ranges to be read. (The implementation uses sort and merge.)
Step 3: Fall into the existing splitPartsWithRangesByPrimaryKey().

Step 2 and Step 3 in the first solution roughly correspond to Step 2 and Step 1 in the second solution, at about the same cost. Step 3 in the 2nd solution looks extra, but should be minimal cost if the skip index was effective. Solution 1 could read a lot of ranges unnecessarily (the original intersecting set). The current solution can be improved by avoiding Step 3 - Step 2 itself can emit the intersecting and non-intersecting ranges while performing the merge!
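The idea of classifying ranges as intersecting or non-intersecting during the merge itself can be sketched as follows (illustrative Python over half-open ranges, not the PR's code):

```python
# A range is "intersecting" if it overlaps any other range in the sorted
# list; everything else can skip the FINAL merging algorithm. Two linear
# passes (forward and backward over the sorted list) find all overlaps.

def merge_and_classify(ranges):
    ranges = sorted(ranges)
    flags = [False] * len(ranges)       # True => intersects some other range
    max_end = float("-inf")
    for i, (lo, hi) in enumerate(ranges):     # overlap with an earlier range
        if lo < max_end:
            flags[i] = True
        max_end = max(max_end, hi)
    min_start = float("inf")
    for i in range(len(ranges) - 1, -1, -1):  # overlap with a later range
        lo, hi = ranges[i]
        if hi > min_start:
            flags[i] = True
        min_start = min(min_start, lo)
    intersecting = [r for r, f in zip(ranges, flags) if f]
    non_intersecting = [r for r, f in zip(ranges, flags) if not f]
    return intersecting, non_intersecting

inter, non = merge_and_classify([(0, 10), (5, 15), (20, 30)])
assert inter == [(0, 10), (5, 15)] and non == [(20, 30)]
```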

One more data point with a large dataset:
EXPLAIN indexes = 1
SELECT *
FROM t_final_query_tbl2
FINAL
WHERE v = 2111111111111111111
SETTINGS use_skip_indexes_if_final = 1, use_skip_indexes_if_final_exact_mode = 1

Query id: e3422b9d-10b5-4ac1-af2b-63e53b9cea63

    ┌─explain────────────────────────────────────────────────────────┐
 1. │ Expression ((Project names + Projection))                      │
 2. │   Filter ((WHERE + Change column names to column identifiers)) │
 3. │     ReadFromMergeTree (default.t_final_query_tbl2)             │
 4. │     Indexes:                                                   │
 5. │       PrimaryKey                                               │
 6. │         Condition: true                                        │
 7. │         Parts: 988/988                                         │
 8. │         Granules: 537163/537163                                │
 9. │       Skip                                                     │
10. │         Name: secondaryidx                                     │
11. │         Description: bloom_filter GRANULARITY 1                │
12. │         Parts: 988/988                                         │
13. │         Granules: 13054/537163                                 │
14. │       PrimaryKeyExpand                                         │
15. │         Description: PrimaryKeyExpandForFinal                  │
16. │         Parts: 988/988                                         │
17. │         Granules: 537131/13054                                 │
    └────────────────────────────────────────────────────────────────┘

17 rows in set. Elapsed: 1.577 sec.

@shiyer7474
Contributor Author

The only test failure, in the TSAN suite, is in 01168_mutations_isolation (#67842).

I am closing this PR, will reopen if any review activity comes up.

@Enmk
Contributor

Enmk commented Dec 9, 2024

@nickitat ping, could you please take a look?

@alexey-milovidov alexey-milovidov mentioned this pull request Dec 31, 2024
76 tasks
@shiyer7474
Contributor Author

Thanks for including this in the 2025 roadmap! Let me complete the other approach and get some numbers. #70210 (comment)

@shiyer7474
Contributor Author

I had to refresh and merge master, and also move my git checkout to a new VM.

Latest CI run shows 2 job failures -
Integration tests (aarch64) [5/6] failure integration_tests__aarch64__[5_6].html
Stateless tests (ubsan) [2/2] failure stateless_tests__ubsan__[2_2].html

and the failing tests are -

test_storage_s3_queue/test.py::test_list_and_delete_race
02998_native_parquet_reader

Both failures are seen in other CI runs in the last week.

@shiyer7474
Contributor Author

I completed the current implementation (the earlier version did not cover all cases of range intersection) and also performed extensive benchmarking of the current implementation against an alternate implementation based on sorting (similar to the algorithm in splitPartsRanges()). The alternate implementation can be seen in an earlier commit:

56fc3dc

The alternate implementation is close to 50% slower than the current one. The additional time goes into sorting the single combined list of range starts & range ends. In the current implementation, sorting the separate "selected" and "rejected" range lists (range starts only) is faster because range starts within a part are already ordered.

The code change to coalesce ranges in RangesInDataPartsBuilder::addRange() in the latest upload helps reduce the time spent later in splitIntersectingPartsRangesIntoLayers(). Some numbers are in the next comment.
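The coalescing idea can be sketched as follows (illustrative Python; the real change is in RangesInDataPartsBuilder::addRange(), and the helper here is hypothetical):

```python
# When the range being appended starts exactly where the part's last range
# ends, extend the last range instead of appending a new one, so downstream
# passes see fewer, larger ranges.

def add_range(ranges, new_range):
    if ranges and ranges[-1][1] == new_range[0]:
        ranges[-1] = (ranges[-1][0], new_range[1])   # coalesce with previous
    else:
        ranges.append(new_range)
    return ranges

ranges = []
for r in [(0, 8), (8, 16), (24, 32), (32, 40)]:
    add_range(ranges, r)
assert ranges == [(0, 16), (24, 40)]
```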

@shiyer7474
Contributor Author

shiyer7474 commented Jan 21, 2025

Some absolute numbers from sub-optimal use cases with a random primary key:

$ explain indexes=1  SELECT count(id1) FROM x FINAL where v < 800000 SETTINGS use_skip_indexes_if_final_exact_mode=1,use_skip_indexes=1,use_skip_indexes_if_final=1;

EXPLAIN indexes = 1
SELECT count(id1)
FROM x
FINAL
WHERE v < 800000
SETTINGS use_skip_indexes_if_final_exact_mode = 1, use_skip_indexes = 1, use_skip_indexes_if_final = 1

Query id: 67716f36-4491-4521-accb-5e5254021cf7

    ┌─explain────────────────────────────────────────────────────────────┐
 1. │ Expression ((Project names + Projection))                          │
 2. │   Aggregating                                                      │
 3. │     Expression (Before GROUP BY)                                   │
 4. │       Filter ((WHERE + Change column names to column identifiers)) │
 5. │         ReadFromMergeTree (default.x)                              │
 6. │         Indexes:                                                   │
 7. │           PrimaryKey                                               │
 8. │             Condition: true                                        │
 9. │             Parts: 1338/1338                                       │
10. │             Granules: 2929978/2929978                              │
11. │           Skip                                                     │
12. │             Name: secondaryidx                                     │
13. │             Description: minmax GRANULARITY 1                      │
14. │             Parts: 15/1338                                         │
15. │             Granules: 26820/2929978                                │
16. │           PrimaryKeyExpand                                         │
17. │             Description: PrimaryKeyExpandForFinal                  │
18. │             Parts: 1338/15                                         │
19. │             Granules: 2747434/26820                                │
    └────────────────────────────────────────────────────────────────────┘

19 rows in set. Elapsed: 5.757 sec.

There has been a regression in the last 2 weeks. The same statement used to finish in 3.4 - 3.8 seconds, so this is a regression of more than 50%. The following commit, which introduced a skipping index cache, appears to have caused it:

#70102

Disabling the skip index cache degrades performance even more:

mysql9 :) SELECT metric, value FROM system.metrics WHERE metric = 'SkippingIndexCacheSize';

SELECT
    metric,
    value
FROM system.metrics
WHERE metric = 'SkippingIndexCacheSize'

Query id: d9e9beb1-feb3-428d-962a-f38eee7e996d

   ┌─metric─────────────────┬─value─┐
1. │ SkippingIndexCacheSize │     0 │
   └────────────────────────┴───────┘

1 row in set. Elapsed: 0.003 sec.

mysql9 :) explain indexes=1  SELECT count(id1) FROM x FINAL where v < 800000 SETTINGS use_skip_indexes_if_final_exact_mode=1,use_skip_indexes=1,use_skip_indexes_if_final=1;

EXPLAIN indexes = 1
SELECT count(id1)
FROM x
FINAL
WHERE v < 800000
SETTINGS use_skip_indexes_if_final_exact_mode = 1, use_skip_indexes = 1, use_skip_indexes_if_final = 1

Query id: 612bdf19-b6ea-4b83-b6b8-420e24eedc6d

    ┌─explain────────────────────────────────────────────────────────────┐
 1. │ Expression ((Project names + Projection))                          │
 2. │   Aggregating                                                      │
 3. │     Expression (Before GROUP BY)                                   │
 4. │       Filter ((WHERE + Change column names to column identifiers)) │
 5. │         ReadFromMergeTree (default.x)                              │
 6. │         Indexes:                                                   │
 7. │           PrimaryKey                                               │
 8. │             Condition: true                                        │
 9. │             Parts: 1338/1338                                       │
10. │             Granules: 2929978/2929978                              │
11. │           Skip                                                     │
12. │             Name: secondaryidx                                     │
13. │             Description: minmax GRANULARITY 1                      │
14. │             Parts: 15/1338                                         │
15. │             Granules: 26820/2929978                                │
16. │           PrimaryKeyExpand                                         │
17. │             Description: PrimaryKeyExpandForFinal                  │
18. │             Parts: 1338/15                                         │
19. │             Granules: 2747434/26820                                │
    └────────────────────────────────────────────────────────────────────┘

19 rows in set. Elapsed: 9.534 sec.

But running against a 2-week-old repo without the skip index cache commit shows a much better elapsed time than with the skip index cache either enabled or disabled:

mysql9 :) SELECT metric, value FROM system.metrics WHERE metric = 'SkippingIndexCacheSize';

SELECT
    metric,
    value
FROM system.metrics
WHERE metric = 'SkippingIndexCacheSize'

Query id: 7b31f3e4-b374-48e3-9343-1c00586df00a

Ok.

0 rows in set. Elapsed: 0.002 sec.

mysql9 :) explain indexes=1  SELECT count(id1) FROM x FINAL where v < 800000 SETTINGS use_skip_indexes_if_final_exact_mode=1,use_skip_indexes=1,use_skip_indexes_if_final=1;

EXPLAIN indexes = 1
SELECT count(id1)
FROM x
FINAL
WHERE v < 800000
SETTINGS use_skip_indexes_if_final_exact_mode = 1, use_skip_indexes = 1, use_skip_indexes_if_final = 1

Query id: 064918e1-54dd-4215-ab74-9f754fe36e32

    ┌─explain────────────────────────────────────────────────────────────┐
 1. │ Expression ((Project names + Projection))                          │
 2. │   Aggregating                                                      │
 3. │     Expression (Before GROUP BY)                                   │
 4. │       Filter ((WHERE + Change column names to column identifiers)) │
 5. │         ReadFromMergeTree (default.x)                              │
 6. │         Indexes:                                                   │
 7. │           PrimaryKey                                               │
 8. │             Condition: true                                        │
 9. │             Parts: 1338/1338                                       │
10. │             Granules: 2929978/2929978                              │
11. │           Skip                                                     │
12. │             Name: secondaryidx                                     │
13. │             Description: minmax GRANULARITY 1                      │
14. │             Parts: 15/1338                                         │
15. │             Granules: 26820/2929978                                │
16. │           PrimaryKeyExpand                                         │
17. │             Description: PrimaryKeyExpandForFinal                  │
18. │             Parts: 1338/15                                         │
19. │             Granules: 2747434/26820                                │
    └────────────────────────────────────────────────────────────────────┘

19 rows in set. Elapsed: 3.463 sec.

@shiyer7474 (Contributor Author):

The numbers above are from an 8-CPU machine (model name: Intel(R) Xeon(R) CPU E5-1630 v3 @ 3.70GHz).

With 2.7 million granules (2747434 exactly), the overhead of the feature is about 3.4 seconds (without the skip index cache feature), and there is a gain of close to a second further down the pipeline when splitting part ranges into layers. That's just about 2.5 seconds in the worst case!

@shiyer7474 (Contributor Author):

These 3 functions are at the top of the profile compared to a report from 2 weeks ago.

     7.11%  MergeTreeIndex  [kernel.kallsyms]  [k] try_to_wake_up
     5.26%  MergeTreeIndex  clickhouse         [.] DB::SLRUCachePolicy<wide::integer<128ul, unsigned int>, DB::SkippingIndexCacheCell,
 UInt128TrivialHash, DB::SkippingIndexCacheWeightFunction>::get
     5.18%  MergeTreeIndex  [kernel.kallsyms]  [k] entry_SYSRETQ_unsafe_stack

@shiyer7474 (Contributor Author):

On a 48-CPU VM with max_threads=48, the skipping index cache causes a significant increase in execution time:

1. On an older git commit:

    commit 6b30301 (HEAD)
    Merge: 3ddd7c9 06eeb53
    Author: Alexey Milovidov <milovidov@clickhouse.com>
    Date: Sat Jan 11 05:30:43 2025 +0100

explain indexes=1 SELECT count(id1) FROM x FINAL where v < 2000000 SETTINGS use_skip_indexes_if_final_exact_mode=1,use_skip_indexes=1,use_skip_indexes_if_final=1;

EXPLAIN indexes = 1
SELECT count(id1)
FROM x
FINAL
WHERE v < 2000000
SETTINGS use_skip_indexes_if_final_exact_mode = 1, use_skip_indexes = 1, use_skip_indexes_if_final = 1

Query id: 3a21c2e1-51c0-4a15-8961-932f19b850f9

    ┌─explain────────────────────────────────────────────────────────────┐
 1. │ Expression ((Project names + Projection))                          │
 2. │   Aggregating                                                      │
 3. │     Expression (Before GROUP BY)                                   │
 4. │       Filter ((WHERE + Change column names to column identifiers)) │
 5. │         ReadFromMergeTree (default.x)                              │
 6. │         Indexes:                                                   │
 7. │           PrimaryKey                                               │
 8. │             Condition: true                                        │
 9. │             Parts: 1350/1350                                       │
10. │             Granules: 2929980/2929980                              │
11. │           Skip                                                     │
12. │             Name: secondaryidx                                     │
13. │             Description: minmax GRANULARITY 1                      │
14. │             Parts: 30/1350                                         │
15. │             Granules: 61095/2929980                                │
16. │           PrimaryKeyExpand                                         │
17. │             Description: PrimaryKeyExpandForFinal                  │
18. │             Parts: 1350/30                                         │
19. │             Granules: 2929230/61095                                │
    └────────────────────────────────────────────────────────────────────┘

19 rows in set. Elapsed: 2.073 sec.

That's just 2 seconds for looking up the minmax index and then performing the "PrimaryKeyExpand" procedure from this PR over close to 3M granules.

2. With the skipping index cache enabled at its default value:

explain indexes=1 SELECT count(id1) FROM x FINAL where v < 2000000 SETTINGS use_skip_indexes_if_final_exact_mode=1,use_skip_indexes=1,use_skip_indexes_if_final=1;

EXPLAIN indexes = 1
SELECT count(id1)
FROM x
FINAL
WHERE v < 2000000
SETTINGS use_skip_indexes_if_final_exact_mode = 1, use_skip_indexes = 1, use_skip_indexes_if_final = 1

Query id: 0ccaebef-3250-466d-ae0b-982adb63a1e8

    ┌─explain────────────────────────────────────────────────────────────┐
 1. │ Expression ((Project names + Projection))                          │
 2. │   Aggregating                                                      │
 3. │     Expression (Before GROUP BY)                                   │
 4. │       Filter ((WHERE + Change column names to column identifiers)) │
 5. │         ReadFromMergeTree (default.x)                              │
 6. │         Indexes:                                                   │
 7. │           PrimaryKey                                               │
 8. │             Condition: true                                        │
 9. │             Parts: 1338/1338                                       │
10. │             Granules: 2929978/2929978                              │
11. │           Skip                                                     │
12. │             Name: secondaryidx                                     │
13. │             Description: minmax GRANULARITY 1                      │
14. │             Parts: 30/1338                                         │
15. │             Granules: 61095/2929978                                │
16. │           PrimaryKeyExpand                                         │
17. │             Description: PrimaryKeyExpandForFinal                  │
18. │             Parts: 1338/30                                         │
19. │             Granules: 2929227/61095                                │
    └────────────────────────────────────────────────────────────────────┘

19 rows in set. Elapsed: 8.596 sec.

3. With the skipping index cache disabled:

explain indexes=1 SELECT count(id1) FROM x FINAL where v < 2000000 SETTINGS use_skip_indexes_if_final_exact_mode=1,use_skip_indexes=1,use_skip_indexes_if_final=1;

EXPLAIN indexes = 1
SELECT count(id1)
FROM x
FINAL
WHERE v < 2000000
SETTINGS use_skip_indexes_if_final_exact_mode = 1, use_skip_indexes = 1, use_skip_indexes_if_final = 1

Query id: e012bab5-d46c-48bb-9906-e81b20511ec4

    ┌─explain────────────────────────────────────────────────────────────┐
 1. │ Expression ((Project names + Projection))                          │
 2. │   Aggregating                                                      │
 3. │     Expression (Before GROUP BY)                                   │
 4. │       Filter ((WHERE + Change column names to column identifiers)) │
 5. │         ReadFromMergeTree (default.x)                              │
 6. │         Indexes:                                                   │
 7. │           PrimaryKey                                               │
 8. │             Condition: true                                        │
 9. │             Parts: 1314/1314                                       │
10. │             Granules: 2929974/2929974                              │
11. │           Skip                                                     │
12. │             Name: secondaryidx                                     │
13. │             Description: minmax GRANULARITY 1                      │
14. │             Parts: 29/1314                                         │
15. │             Granules: 62654/2929974                                │
16. │           PrimaryKeyExpand                                         │
17. │             Description: PrimaryKeyExpandForFinal                  │
18. │             Parts: 1314/29                                         │
19. │             Granules: 2929417/62654                                │
    └────────────────────────────────────────────────────────────────────┘

19 rows in set. Elapsed: 16.828 sec.

Table structure -

SHOW CREATE TABLE x

Query id: 4ed37db4-452a-4713-a5f2-459f49ca2aab

Row 1:
──────
statement: CREATE TABLE default.x
(
    `id1` String,
    `id2` UInt64,
    `id3` DateTime,
    `v` UInt64,
    INDEX secondaryidx v TYPE minmax GRANULARITY 1
)
ENGINE = ReplacingMergeTree
ORDER BY (id1, id2, id3)


clickhouse-gh bot commented Feb 25, 2025

Dear @nickitat, this PR hasn't been updated for a while. You will be unassigned. Will you continue working on it? If so, please feel free to reassign yourself.

@nickitat self-assigned this Feb 25, 2025
@shankar-iyer mentioned this pull request Mar 12, 2025
DECLARE(Bool, use_skip_indexes_if_final_exact_mode, 0, R"(
Controls whether granules returned by a skipping index are expanded in newer parts to return correct results when executing a query with the FINAL modifier.

Using skip indexes may exclude rows (granules) containing the latest data which could lead to incorrect results. This setting can ensure that correct results are returned by scanning newer parts that have overlap with the ranges returned by the skip index. (Experimental)
Member:

Suggested change
Using skip indexes may exclude rows (granules) containing the latest data which could lead to incorrect results. This setting can ensure that correct results are returned by scanning newer parts that have overlap with the ranges returned by the skip index. (Experimental)
Using skip indexes may exclude rows (granules) containing the latest data which could lead to incorrect results. This setting can ensure that correct results are returned by scanning newer parts that have overlap with the ranges returned by the skip index.

Contributor Author:

ack, will correct in the next upload.

if (add_index_stat_row_for_pk_expand)
{
result.index_stats.emplace_back(ReadFromMergeTree::IndexStat{
.type = ReadFromMergeTree::IndexType::PrimaryKeyExpand,
Member:

IMO, it will confuse more than help. Let's use the Skip type and elaborate only in the description.

Contributor Author:

Let me check.

MarkRange range;
size_t part_index;
EventType event;
bool selected;
Member:

It would be nice to have a comment here

Contributor Author:

Good point, will add in the next upload.

for (auto & all_part_range : all_part_ranges)
{
const auto & index_granularity = all_part_range.data_part->index_granularity;
all_part_range.ranges = MarkRanges{{MarkRange{0, index_granularity->getMarksCountWithoutFinal()}}};
Member:

I'm a little confused by this code. If we filtered something by the PK condition, will we discard those filtering results and fall back to reading the whole part?

Contributor Author:

If a part has been completely filtered out / rejected before skip index filtering, then we will not check ranges in that part for intersection. All parts which have 1 or more ranges shortlisted will undergo this new range intersection procedure. I will show an example in the next upload.
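The per-part intersection described here can be sketched as follows (a simplified sketch, not the PR's actual code; `MarkRange` is reduced to a bare struct and both lists are assumed sorted and non-overlapping):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for the real MarkRange: a half-open granule interval.
struct MarkRange
{
    std::size_t begin;
    std::size_t end;
};

// Intersect two sorted, non-overlapping range lists within one part, e.g. the
// ranges kept by the PK condition and the ranges kept by the skip index.
// Only granules present in both lists survive.
std::vector<MarkRange> intersectRanges(const std::vector<MarkRange> & a,
                                       const std::vector<MarkRange> & b)
{
    std::vector<MarkRange> result;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < a.size() && j < b.size())
    {
        std::size_t begin = std::max(a[i].begin, b[j].begin);
        std::size_t end = std::min(a[i].end, b[j].end);
        if (begin < end)
            result.push_back({begin, end});
        // Advance whichever range finishes first.
        if (a[i].end < b[j].end)
            ++i;
        else
            ++j;
    }
    return result;
}
```

For example, intersecting {[0,10), [20,30)} with {[5,25)} yields {[5,10), [20,25)}.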


earliest_part_found = true;
}
if (!earliest_part_found)
Member:

I'm not getting the purpose of earliest_part_found and this condition. Could you elaborate, pls?

Contributor Author:

It was an optimization (incorrect) for which I was looking for feedback. I have removed it.

}
}

LOG_TRACE(logger, "findPKRangesForFinalAfterSkipIndex : processed {} parts, selected ranges {}, rejected ranges {}", ranges_in_data_parts.size(), selected_ranges.size(), rejected_ranges.size());
Member:

for consistency

Suggested change
LOG_TRACE(logger, "findPKRangesForFinalAfterSkipIndex : processed {} parts, selected ranges {}, rejected ranges {}", ranges_in_data_parts.size(), selected_ranges.size(), rejected_ranges.size());
LOG_TRACE(logger, "findPKRangesForFinalAfterSkipIndex : processed {} parts, selected {} ranges, rejected {} ranges", ranges_in_data_parts.size(), selected_ranges.size(), rejected_ranges.size());

and we'd probably better log this at the very end, when we know the actual numbers; otherwise it will be very misleading.

Contributor Author:

ack, will correct in the next upload.

std::vector<PartsRangesIterator>::iterator selected_ranges_iter = selected_ranges.begin();
std::vector<PartsRangesIterator>::iterator rejected_ranges_iter = rejected_ranges.begin();

while (selected_ranges_iter != selected_ranges.end() && rejected_ranges_iter != rejected_ranges.end())
Member:

We need to devise a way to test this logic more thoroughly.

Contributor Author:

Sure, let's discuss.


while (selected_ranges_iter != selected_ranges.end() && rejected_ranges_iter != rejected_ranges.end())
{
auto start_value1 = selected_ranges_iter->value;
Member:

selected and rejected instead of 1 and 2 would really improve readability

Contributor Author:

Yes, perfect, the code looks better with variables named selected_range_start, rejected_range_start etc.
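With those names, the two-iterator sweep over selected and rejected ranges amounts to an interval-overlap join. A minimal sketch under assumptions (hypothetical types; the real PartsRangesIterator carries PK values and begin/end events, collapsed here into closed PK intervals sorted by start):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplification: each range is summarised by the closed PK
// interval [pk_begin, pk_end] it covers; both lists are sorted by pk_begin.
struct PKRange
{
    int pk_begin;
    int pk_end;
    std::size_t part_index;
};

// A range rejected by the skip index may live in a newer part and hold fresher
// versions of rows in a selected range, so under FINAL every rejected range
// that overlaps a selected one must be scanned as well.
std::vector<PKRange> expandForFinal(const std::vector<PKRange> & selected,
                                    const std::vector<PKRange> & rejected)
{
    std::vector<PKRange> extra;
    std::size_t s = 0;
    for (const auto & rejected_range : rejected)
    {
        // Selected ranges ending before this rejected range starts cannot
        // overlap it, nor any later rejected range (both lists are sorted).
        while (s < selected.size() && selected[s].pk_end < rejected_range.pk_begin)
            ++s;
        if (s < selected.size() && selected[s].pk_begin <= rejected_range.pk_end)
            extra.push_back(rejected_range); // overlap: scan this range too
    }
    return extra;
}
```

Each list is traversed once, so the join is linear in the number of ranges after the initial sort.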

@shiyer7474 (Contributor Author):

Thank you @nickitat, taking a look.

@shiyer7474 (Contributor Author):

I am closing this PR as I am unable to push commits into the PR branch from my new git account. I will shortly add the new PR number here, thanks!

@shiyer7474 closed this Mar 26, 2025
shankar-iyer added a commit to shankar-iyer/ClickHouse that referenced this pull request Mar 26, 2025
azat added a commit to azat/ClickHouse that referenced this pull request Apr 20, 2025

Labels

can be tested Allows running workflows for external contributors pr-performance Pull request with some performance improvements
