Skip to content

Fix leaking of MergesMutationsMemoryTracking and fix query_views_log for streaming#86422

Merged
azat merged 2 commits intoClickHouse:masterfrom
azat:fix-merges-mutations-memory-tracking
Sep 1, 2025
Merged

Fix leaking of MergesMutationsMemoryTracking and fix query_views_log for streaming#86422
azat merged 2 commits intoClickHouse:masterfrom
azat:fix-merges-mutations-memory-tracking

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Aug 29, 2025

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix leaking of MergesMutationsMemoryTracking due to Buffer tables and fix query_views_log for streaming from Kafka (and others)

For Buffer we started to use background_memory_tracker since #79963, but forgot to call background_memory_tracker.adjustOnBackgroundTaskEnd(), like we do for merges/mutations and this leads to drift in MergesMutationsMemoryTracking (significant one), which later will lead to TOO_MANY_PARTS, because merges will stop.

At first I actually added it (adjustOnBackgroundTaskEnd()) but later decided that it is wrong, merges/mutations should not take care about Buffer flushes. Instead I've just created separate MemoryTracker with total_memory_tracker as parent, that does not requires such hacks. And this also fixes query_views_log for streaming from Kafka (and others), since previously there was no ThreadGroup for them.

Note, maybe it will be a good idea to have a RAII for return value of createForBackgroundProcess() that will take care of adjustOnBackgroundTaskEnd(), but but this is not can be tricky since you need to make sure that there are no users of it (i.e. ThreadGroupSwitcher already finished).

Fixes: #79963 (cc @CheSema )
Fixes: #86398

This place forgot to call
background_memory_tracker.adjustOnBackgroundTaskEnd(), like we do for
merges/mutations and this may lead to significant drift in
MergesMutationsMemoryTracking, which later will lead to TOO_MANY_PARTS,
because merges will stop.

At first I actually added it (adjustOnBackgroundTaskEnd()) but later
decided that it is wrong, merges/mutations should not take care about
Buffer flushes.

Note, maybe it will be a good idea to have a RAII for return value of
createForBackgroundProcess() that will take care of
adjustOnBackgroundTaskEnd(), but but this is not can be tricky since you
need to make sure that there are no users of it (i.e.
ThreadGroupSwitcher already finished).
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Aug 29, 2025

Workflow [PR], commit [f8952a0]

Summary:

job_name test_name status info comment
Integration tests (amd_binary, 3/5) failure
test_storage_s3_queue/test_4.py::test_list_and_delete_race FAIL
Stress test (amd_debug) failure
Server died FAIL
Hung check failed, possible deadlock found (see hung_check.log) FAIL
Killed by signal (in clickhouse-server.log) FAIL
Fatal message in clickhouse-server.log (see fatal_messages.txt) FAIL
Killed by signal (output files) FAIL
Found signal in gdb.log FAIL
Upgrade check (amd_asan) failure
Killed by signal (in clickhouse-server.log) FAIL
Fatal message in clickhouse-server.log (see fatal_messages.txt) FAIL
Killed by signal (output files) FAIL
Found signal in gdb.log FAIL
Upgrade check (amd_tsan) failure
Killed by signal (in clickhouse-server.log) FAIL
Fatal message in clickhouse-server.log (see fatal_messages.txt) FAIL
Killed by signal (output files) FAIL
Found signal in gdb.log FAIL
Upgrade check (amd_msan) failure
Killed by signal (in clickhouse-server.log) FAIL
Fatal message in clickhouse-server.log (see fatal_messages.txt) FAIL
Killed by signal (output files) FAIL
Found signal in gdb.log FAIL
Finish Workflow failure
python3 ./ci/jobs/scripts/workflow_hooks/new_tests_check.py failure

@clickhouse-gh clickhouse-gh bot added pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-cloud labels Aug 29, 2025
@azat
Copy link
Copy Markdown
Member Author

azat commented Aug 29, 2025

This will break system.query_views_log for Buffer flushes

@azat azat marked this pull request as draft August 29, 2025 21:45
@azat azat added pr-bugfix Pull request with bugfix, not backported by default and removed pr-critical-bugfix labels Aug 30, 2025
@azat azat force-pushed the fix-merges-mutations-memory-tracking branch from 6c30827 to af1af48 Compare August 30, 2025 07:00
@azat azat marked this pull request as ready for review August 30, 2025 13:37
It has been removed in the previous patch from
StorageBuffer::backgroundFlush(), but, it is required for
query_views_log.

So let's get it back, but in a more generic fashion.
@azat azat force-pushed the fix-merges-mutations-memory-tracking branch from af1af48 to f8952a0 Compare August 30, 2025 14:13
@azat azat requested a review from CheSema September 1, 2025 08:49
@azat azat changed the title Fix leaking of MergesMutationsMemoryTracking due to Buffer tables Fix leaking of MergesMutationsMemoryTracking and fix query_views_log for streaming Sep 1, 2025
@CheSema CheSema self-assigned this Sep 1, 2025
{
try
{
auto thread_group = ThreadGroup::createForBackgroundProcess(getContext());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You wrote

/// For background tasks (i.e. Buffer flush) there may not be any group

But here thread group has been create for buffer flush. It did not match.

Copy link
Copy Markdown
Member

@CheSema CheSema left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any way, with that PR you create thread group for all the cases when there have not been thread group for current thread created yet.

@azat
Copy link
Copy Markdown
Member Author

azat commented Sep 1, 2025

CI known and unrelated.

@azat azat enabled auto-merge September 1, 2025 10:40
@azat azat added this pull request to the merge queue Sep 1, 2025
Merged via the queue into ClickHouse:master with commit d831414 Sep 1, 2025
115 of 122 checks passed
@azat azat deleted the fix-merges-mutations-memory-tracking branch September 1, 2025 11:00
@robot-ch-test-poll2 robot-ch-test-poll2 added pr-backports-created-cloud deprecated label, NOOP pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Sep 1, 2025
robot-clickhouse added a commit that referenced this pull request Sep 1, 2025
Cherry pick #86422 to 25.6: Fix leaking of MergesMutationsMemoryTracking and fix query_views_log for streaming
robot-clickhouse added a commit that referenced this pull request Sep 1, 2025
robot-clickhouse added a commit that referenced this pull request Sep 1, 2025
Cherry pick #86422 to 25.7: Fix leaking of MergesMutationsMemoryTracking and fix query_views_log for streaming
robot-clickhouse added a commit that referenced this pull request Sep 1, 2025
robot-clickhouse added a commit that referenced this pull request Sep 1, 2025
Cherry pick #86422 to 25.8: Fix leaking of MergesMutationsMemoryTracking and fix query_views_log for streaming
robot-clickhouse added a commit that referenced this pull request Sep 1, 2025
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Sep 1, 2025
@robot-ch-test-poll3 robot-ch-test-poll3 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Sep 1, 2025
clickhouse-gh bot added a commit that referenced this pull request Sep 1, 2025
Backport #86422 to 25.6: Fix leaking of MergesMutationsMemoryTracking and fix query_views_log for streaming
clickhouse-gh bot added a commit that referenced this pull request Sep 1, 2025
Backport #86422 to 25.7: Fix leaking of MergesMutationsMemoryTracking and fix query_views_log for streaming
clickhouse-gh bot added a commit that referenced this pull request Sep 1, 2025
Backport #86422 to 25.8: Fix leaking of MergesMutationsMemoryTracking and fix query_views_log for streaming
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-backports-created-cloud deprecated label, NOOP pr-bugfix Pull request with bugfix, not backported by default pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo

Projects

None yet

Development

Successfully merging this pull request may close these issues.

query_views_log Stops Recording After Upgrade from 25.1 to 25.7

5 participants