Skip to content

Fix excessive number of files (leads to excessive memory usage) for external ORDER BY#80777

Merged
azat merged 6 commits intoClickHouse:masterfrom
azat:fix-external-sort
May 28, 2025
Merged

Fix excessive number of files (leads to excessive memory usage) for external ORDER BY#80777
azat merged 6 commits intoClickHouse:masterfrom
azat:fix-external-sort

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented May 24, 2025

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Fix excessive number of files (leads to excessive memory usage) for external ORDER BY

In #72598 the behavior of max_bytes_before_external_sort has been changed, it has been switched to rely on the total memory usage of the query instead of the size of a sorting data.

But, the problem with this approach is that you can have a lot of memory used outside of sorting (FINAL, aggregation), and spilling sorting blocks to disk will not help reduce memory usage obviously, but it will make things even worse, because it will create lots of small files, that will require a lot of memory to merge them.

So this patch reverts the behavior of max_bytes_before_external_sort back, now it will rely on the sorting block size.

Fixes: #72598 (cc @alexey-milovidov )

…xternal ORDER BY

In ClickHouse#72598 the behavior of `max_bytes_before_external_sort` has been
changed, it has been switched to rely on the total memory usage of the
query instead of the size of a sorting data.

But, the problem with this approach is that you can have a lot of memory
used outside of sorting (FINAL, aggregation), and spilling sorting
blocks to disk will not help reduce memory usage obviously, but it will
make things even worse, because it will create lots of small files, that
will require a lot of memory to merge them.

So this patch reverts the behavior of `max_bytes_before_external_sort`
back, now it will rely on the sorting block size.
@azat azat added the pr-must-backport Pull request should be backported intentionally. Use this label with great care! label May 24, 2025
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented May 24, 2025

Workflow [PR], commit [27f1ccb]

@clickhouse-gh clickhouse-gh bot added the pr-bugfix Pull request with bugfix, not backported by default label May 24, 2025
azat added 3 commits May 25, 2025 13:44
Set max_bytes_before_external_sort=min_external_sort_block_bytes=100Mi
The problem is that now both conditions should be met:
- max_bytes_ratio_before_external_sort
- max_bytes_before_external_sort

So it is not possible to use max_bytes_ratio_before_external_sort=0.9,
this will lead to MEMORY_LIMIT_EXCEEDED errors
@Michicosun Michicosun self-assigned this May 26, 2025
@azat azat added this pull request to the merge queue May 28, 2025
Merged via the queue into ClickHouse:master with commit dd15c52 May 28, 2025
120 checks passed
@azat azat deleted the fix-external-sort branch May 28, 2025 11:01
robot-clickhouse-ci-1 added a commit that referenced this pull request May 28, 2025
Cherry pick #80777 to 25.5: Fix excessive number of files (leads to excessive memory usage) for external ORDER BY
robot-clickhouse added a commit that referenced this pull request May 28, 2025
…ssive memory usage) for external ORDER BY
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 28, 2025
azat added a commit that referenced this pull request May 28, 2025
Backport #80777 to 25.5: Fix excessive number of files (leads to excessive memory usage) for external ORDER BY
robot-clickhouse-ci-2 added a commit that referenced this pull request May 28, 2025
Cherry pick #80777 to 25.3: Fix excessive number of files (leads to excessive memory usage) for external ORDER BY
robot-clickhouse added a commit that referenced this pull request May 28, 2025
…ssive memory usage) for external ORDER BY
robot-clickhouse-ci-2 added a commit that referenced this pull request May 28, 2025
Cherry pick #80777 to 25.4: Fix excessive number of files (leads to excessive memory usage) for external ORDER BY
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 9, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 10, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 11, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 12, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jun 13, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 13, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 14, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 15, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 16, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 17, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 18, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 19, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 20, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 21, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 22, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 23, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 24, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jun 26, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jun 27, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jun 28, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jun 29, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jun 29, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
@robot-clickhouse robot-clickhouse added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Jul 2, 2025
baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jul 3, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jul 4, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jul 5, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 6, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
baibaichen added a commit to Kyligence/ClickHouse that referenced this pull request Jul 7, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 7, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
kyligence-git pushed a commit to Kyligence/ClickHouse that referenced this pull request Jul 8, 2025
Fix rebase issue:
- 20250529 ClickHouse#80777
- 20250326 ClickHouse#77729
- 20250308 ClickHouse#77269
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-backports-created-cloud deprecated label, NOOP pr-bugfix Pull request with bugfix, not backported by default pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants