Fix TSan "unlock of an unlocked mutex" in CHECK TABLE for ReplicatedMergeTree #94541
Merged
alexey-milovidov merged 1 commit into master on Jan 19, 2026
…ergeTree

The `DataValidationTasks` struct stored a `std::unique_lock<std::mutex>` holding a lock on `BackgroundSchedulePoolTaskInfo::exec_mutex`. This caused a ThreadSanitizer warning because:

1. Thread A acquires the lock via `pausePartsCheck()` -> `getExecLock()`
2. `DataValidationTasks` is stored in a `shared_ptr` and copied to worker threads during parallel CHECK TABLE execution (via `TableCheckTask`)
3. When the last reference is destroyed (potentially on Thread B), the `unique_lock` destructor tries to unlock the mutex
4. pthread mutexes must be unlocked by the same thread that locked them, causing the TSan error "unlock of an unlocked mutex (or by a wrong thread)"

The fix replaces `pausePartsCheck()` (which returned a `unique_lock`) with `temporaryPause()` which returns a new `TemporaryPause` RAII guard. This guard uses `deactivate()` to pause the background task and `activateAndSchedule()` in its destructor to resume it. Both operations are thread-safe and can be called from any thread, eliminating the thread-affinity issue.

Changes:

- Add `getTaskInfoPtr()` to `BackgroundSchedulePoolTaskHolder`
- Add `TemporaryPause` class to `ReplicatedMergeTreePartCheckThread`
- Remove `pausePartsCheck()` and replace all usages with `temporaryPause()`
- Update `DataValidationTasks` to use `TemporaryPause` instead of `unique_lock`

Fixes #87916

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
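The idea behind the guard described above can be sketched roughly as follows. This is an illustration only, not the code from the PR: `MockTask` and `TemporaryPauseSketch` are hypothetical names, and the real `deactivate()` / `activateAndSchedule()` do more than flip a flag. The sketch only shows why a flag-based guard, unlike a `std::unique_lock`, can be destroyed on a different thread from the one that created it.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical stand-in for the background task. Only the two calls named in
// the PR description are modelled; their real behaviour is richer than this.
struct MockTask
{
    std::atomic<bool> deactivated{false};

    void deactivate() { deactivated.store(true); }
    void activateAndSchedule() { deactivated.store(false); }
};

// RAII guard in the spirit of `TemporaryPause`: pause on construction,
// resume on destruction. Neither step depends on the calling thread.
class TemporaryPauseSketch
{
public:
    explicit TemporaryPauseSketch(std::shared_ptr<MockTask> task_) : task(std::move(task_))
    {
        task->deactivate();
    }

    ~TemporaryPauseSketch() { task->activateAndSchedule(); }

    TemporaryPauseSketch(const TemporaryPauseSketch &) = delete;
    TemporaryPauseSketch & operator=(const TemporaryPauseSketch &) = delete;

private:
    std::shared_ptr<MockTask> task;
};

// Creates the guard on the calling thread and drops the last reference on a
// worker thread. Returns true if the task was paused while the guard lived
// and is active again once the guard has been destroyed.
bool pauseSurvivesCrossThreadDestruction()
{
    auto task = std::make_shared<MockTask>();
    auto guard = std::make_shared<TemporaryPauseSketch>(task);
    const bool paused_while_held = task->deactivated.load();

    // The worker owns the only remaining reference, so the destructor runs there.
    std::thread worker([g = std::move(guard)]() mutable { g.reset(); });
    worker.join();

    return paused_while_held && !task->deactivated.load();
}
```

With a `std::unique_lock` in place of the flag, the same ownership pattern would unlock the mutex on the worker thread, which is the situation TSan reported.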
Workflow [PR], commit [b3f0dd0] Summary: ❌
azat added a commit to azat/ClickHouse that referenced this pull request on Jan 19, 2026
azat added a commit to azat/ClickHouse that referenced this pull request on Jan 19, 2026
antonio2368 reviewed on Jan 21, 2026
    /// deactivate() waits for any running task execution to finish
    /// and prevents new executions from starting.
    /// This is safe to call from any thread.
    task->deactivate();
Member
I don't think this is the same, or that it works as intended.
`ReplicatedMergeTreePartCheckThread::run` calls `task->schedule`, which will instantly add the task again before releasing the `exec_mutex`.
https://github.com/ClickHouse/ClickHouse/blob/master/src/Core/BackgroundSchedulePool.cpp#L213
This is causing issues in CI.
Member
It seems it does check `deactivated` before proceeding to execute the function 🤔
Member
Okay, I think I figured it out.
If two threads are executing DROP_RANGE, the first one to finish reactivates the task, even though the second DROP_RANGE has not finished yet.
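The overlap described in this comment can be reproduced with a small model. It is a sketch under assumptions: `FlagTask` and `PauseGuard` are hypothetical stand-ins with a plain boolean flag, and the two guards are created and released sequentially on one thread to keep the outcome deterministic, whereas the real case involves two threads running DROP_RANGE.

```cpp
#include <atomic>
#include <cassert>
#include <optional>

// Hypothetical task with a single on/off flag (no nesting count).
struct FlagTask
{
    std::atomic<bool> deactivated{false};

    void deactivate() { deactivated.store(true); }
    void activateAndSchedule() { deactivated.store(false); }
};

// Guard that pauses in the constructor and unconditionally resumes in the destructor.
class PauseGuard
{
public:
    explicit PauseGuard(FlagTask & task_) : task(task_) { task.deactivate(); }
    ~PauseGuard() { task.activateAndSchedule(); }

    PauseGuard(const PauseGuard &) = delete;
    PauseGuard & operator=(const PauseGuard &) = delete;

private:
    FlagTask & task;
};

// Models two overlapping pauses. Returns true if the task became active
// while the second guard was still alive, i.e. the pause ended too early.
bool firstGuardReactivatesTooEarly()
{
    FlagTask task;

    std::optional<PauseGuard> first;
    std::optional<PauseGuard> second;
    first.emplace(task);   // first DROP_RANGE starts and pauses the task
    second.emplace(task);  // second DROP_RANGE starts; the task is already paused

    first.reset();         // first DROP_RANGE finishes and reactivates the task
    const bool active_while_second_alive = !task.deactivated.load();

    second.reset();        // second DROP_RANGE finishes
    return active_while_second_alive;
}
```

Because the flag carries no count of outstanding pauses, the first destructor cannot tell that another guard still expects the task to stay paused.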
Fixes #87916
Changelog category (leave one):