Skip to content

Allow vertical merges from compact to wide parts#45681

Merged
KochetovNicolai merged 6 commits intoClickHouse:masterfrom
CurtizJ:compact-parts-vertical-merge
Feb 7, 2023
Merged

Allow vertical merges from compact to wide parts#45681
KochetovNicolai merged 6 commits intoClickHouse:masterfrom
CurtizJ:compact-parts-vertical-merge

Conversation

@CurtizJ
Copy link
Copy Markdown
Member

@CurtizJ CurtizJ commented Jan 26, 2023

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Allow using Vertical merge algorithm with parts in Compact format. This will allow ClickHouse server to use much less memory for background operations. This closes #46084

@robot-clickhouse robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label Jan 26, 2023
@novikd novikd self-assigned this Jan 26, 2023
@alexey-milovidov
Copy link
Copy Markdown
Member

Let's mark it as a "Performance improvement", although it is mostly a memory usage improvement.

@robot-ch-test-poll3 robot-ch-test-poll3 added pr-performance Pull request with some performance improvements and removed pr-not-for-changelog This PR should not be mentioned in the changelog labels Feb 6, 2023
@KochetovNicolai
Copy link
Copy Markdown
Member

Tried to test a bit
https://pastila.nl/?00c4e0d0/ab45a5be5e2e7e96e7642936638c66d2

First

Before

SELECT
    merged_from,
    part_name,
    part_type,
    merge_algorithm,
    formatReadableSize(peak_memory_usage)
FROM system.part_log
WHERE (table = 'tvm') AND (event_type = 'MergeParts')

Query id: 8d4989ba-7811-44e7-937b-6bb5c70399ea

(wide + compact -> wide)

┌─merged_from───────────────┬─part_name─┬─part_type─┬─merge_algorithm─┬─formatReadableSize(peak_memory_usage)─┐
│ ['all_1_3_2','all_4_4_0'] │ all_1_4_3 │ Wide      │ Horizontal      │ 1.05 GiB                              │
└───────────────────────────┴───────────┴───────────┴─────────────────┴───────────────────────────────────────┘
(3 compact -> wide)
┌─merged_from───────────────────────────┬─part_name─┬─part_type─┬─merge_algorithm─┬─formatReadableSize(peak_memory_usage)─┐
│ ['all_1_1_1','all_2_2_0','all_3_3_0'] │ all_1_3_2 │ Wide      │ Horizontal      │ 812.98 MiB                            │
└───────────────────────────────────────┴───────────┴───────────┴─────────────────┴───────────────────────────────────────┘

After


SELECT
    merged_from,
    part_name,
    part_type,
    merge_algorithm,
    formatReadableSize(peak_memory_usage)
FROM system.part_log
WHERE (table = 'tvm') AND (event_type = 'MergeParts')

Query id: 0f2561a3-8d5e-4dfb-b56d-93d9ebd05188

┌─merged_from───────────────────────────┬─part_name─┬─part_type─┬─merge_algorithm─┬─formatReadableSize(peak_memory_usage)─┐
│ ['all_1_1_0','all_2_2_0','all_3_3_0'] │ all_1_3_1 │ Wide      │ Vertical        │ 675.00 MiB                            │
└───────────────────────────────────────┴───────────┴───────────┴─────────────────┴───────────────────────────────────────┘
┌─merged_from───────────────┬─part_name─┬─part_type─┬─merge_algorithm─┬─formatReadableSize(peak_memory_usage)─┐
│ ['all_1_3_1','all_4_4_0'] │ all_1_4_2 │ Wide      │ Vertical        │ 667.39 MiB                            │
└───────────────────────────┴───────────┴───────────┴─────────────────┴───────────────────────────────────────┘

Second

Before

SELECT
    merged_from,
    part_name,
    part_type,
    merge_algorithm,
    formatReadableSize(peak_memory_usage)
FROM system.part_log
WHERE (table = 'tvm') AND (event_type = 'MergeParts')

(1 wide + 5 compact -> wide)
┌─merged_from─────────────────────────────────────────────────────────────────┬─part_name──┬─part_type─┬─merge_algorithm─┬─formatReadableSize(peak_memory_usage)─┐
│ ['all_1_1_0','all_2_6_1','all_7_7_0','all_8_8_0','all_9_9_0','all_10_10_0'] │ all_1_10_2 │ Wide      │ Horizontal      │ 1.05 GiB                              │
└─────────────────────────────────────────────────────────────────────────────┴────────────┴───────────┴─────────────────┴───────────────────────────────────────┘

After

SELECT
    merged_from,
    part_name,
    part_type,
    merge_algorithm,
    formatReadableSize(peak_memory_usage)
FROM system.part_log
WHERE (table = 'tvm') AND (event_type = 'MergeParts')

Query id: dece7b8f-15aa-4137-810c-3db5c69b200c

┌─merged_from─────────────────────────────────────────────────────────────────┬─part_name──┬─part_type─┬─merge_algorithm─┬─formatReadableSize(peak_memory_usage)─┐
│ ['all_1_1_0','all_2_6_1','all_7_7_0','all_8_8_0','all_9_9_0','all_10_10_0'] │ all_1_10_2 │ Wide      │ Vertical        │ 656.48 MiB                            │
└─────────────────────────────────────────────────────────────────────────────┴────────────┴───────────┴─────────────────┴───────────────────────────────────────┘

Looks like ~30% less memory usage for (wide + a few compact) parts.

@KochetovNicolai
Copy link
Copy Markdown
Member

Interestingly that for a case of 2 wide parts merge required 674.98 MiB. Looks too much.

@KochetovNicolai KochetovNicolai merged commit 3912f5a into ClickHouse:master Feb 7, 2023
@KochetovNicolai KochetovNicolai added the pr-must-backport Pull request should be backported intentionally. Use this label with great care! label Feb 7, 2023
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Feb 9, 2023
@CurtizJ
Copy link
Copy Markdown
Member Author

CurtizJ commented Feb 9, 2023

Seems that it breaks the rolling upgrade. Will reupload under a settings.

@alexey-milovidov
Copy link
Copy Markdown
Member

Interesting, how exactly?

@CurtizJ
Copy link
Copy Markdown
Member Author

CurtizJ commented Feb 10, 2023

@alexey-milovidov

The way how index_granularity is computed differs in vertical and horizontal merges. That leads to CHECKSUM_DOESNT_MATCH error.

@alexey-milovidov
Copy link
Copy Markdown
Member

CHECKSUM_DOESNT_MATCH is not an incompatibility, it is a normal situation during upgrading.

@CurtizJ
Copy link
Copy Markdown
Member Author

CurtizJ commented Feb 10, 2023

Also I'm afraid that this change will trigger LOGICAL_ERROR on old version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-performance Pull request with some performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Make it possible to use Vertical merge algorithm together with parts of Compact type

7 participants