Fix heap-use-after-free in MergeTreeReadTask::createReaders by alexey-milovidov · Pull Request #99483 · ClickHouse/ClickHouse

alexey-milovidov · 2026-03-14T02:31:38Z

Summary

Fix heap-use-after-free where MergeTreeReadPoolBase accesses storage through data_part->storage (a bare const MergeTreeData & reference) after the storage has been destroyed.

Querying checks.test_context_raw on ClickHouse Playground reveals 19 occurrences of the same root cause in the last 90 days across two distinct crash patterns:

Pattern 1 — LoadedMergeTreeDataPartInfoForReader constructor (3 occurrences): During query execution, createReaders constructs LoadedMergeTreeDataPartInfoForReader which calls data_part->storage.getContext(). If the storage is already freed, this is a use-after-free.

Pattern 2 — IMergeTreeDataPart::clearCaches during part destruction (16 occurrences): When the read pool is destroyed, data parts are released. Part destructors call removeIfNeeded() → clearCaches() → storage.getContext() to clear mark/uncompressed caches. If the storage is freed before parts, this crashes.

Root cause: The query pipeline normally holds a StoragePtr via QueryPlanResourceHolder, but the read pool itself does not hold a storage reference. If any code path fails to set up addStorageHolder at the pipeline level (or if the pipeline resources are released before the pool is destroyed), the storage can be freed while parts still reference it.

Fix: Store a ConstStoragePtr (via shared_from_this) in MergeTreeReadPoolBase. It is declared as the first data member so it is destroyed last, guaranteeing the storage outlives all data parts held by the pool (parts_ranges, per_part_infos). This fixes both crash patterns.
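
The ordering guarantee this fix relies on can be illustrated with a minimal, self-contained sketch. `Storage`, `Part`, `ReadPool`, and `destructionOrder` are hypothetical stand-ins, not the real ClickHouse classes; the sketch only demonstrates the C++ rule that non-static data members are destroyed in reverse order of declaration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the storage; it records its own destruction.
struct Storage
{
    std::vector<std::string> & log;
    explicit Storage(std::vector<std::string> & log_) : log(log_) {}
    ~Storage() { log.push_back("storage destroyed"); }
};

// Hypothetical stand-in for a data part holding a bare reference to the storage.
struct Part
{
    const Storage & storage;
    explicit Part(const Storage & storage_) : storage(storage_) {}
    // The destructor touches the storage, similar in spirit to
    // clearCaches() -> storage.getContext() in the description above.
    ~Part() { storage.log.push_back("part destroyed"); }
};

// Hypothetical stand-in for the read pool.
struct ReadPool
{
    // Declared first, therefore destroyed last.
    std::shared_ptr<const Storage> storage_holder;
    std::vector<std::unique_ptr<Part>> parts;
};

// Builds a pool, lets it go out of scope, and returns the order of destruction.
std::vector<std::string> destructionOrder()
{
    std::vector<std::string> log;
    {
        ReadPool pool;
        pool.storage_holder = std::make_shared<const Storage>(log);
        pool.parts.push_back(std::make_unique<Part>(*pool.storage_holder));
        assert(log.empty()); // nothing has been destroyed yet
    }
    return log;
}
```

Calling `destructionOrder()` yields the part's entry before the storage's entry. Swapping the two members of `ReadPool` would reverse that order and make the part destructor touch a destroyed object, which is why the sketch keeps the holder first.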

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=022e467a014b9b3f1fc993735bc4f6b3974e59a0&name_0=MasterCI&name_1=Stress%20test%20%28arm_asan%2C%20s3%29

Test plan

CI stress tests with ASan/TSan should be clean (covers both patterns)

Changelog category (leave one):

Critical Bug Fix (crash, data loss, RBAC)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix heap-use-after-free when a table is dropped concurrently with a running read query (19 occurrences in CI over the last 90 days).

🤖 Generated with Claude Code

`MergeTreeReadTask::createReaders` accessed `data_part->storage.getContext()` (via `LoadedMergeTreeDataPartInfoForReader` constructor) and `data_part->storage.getSettings()` through a potentially dangling reference. When a table is dropped by `DatabaseCatalog::dropTablesParallel` while a query is still reading from it, the `StorageMergeTree` object is destroyed, but the data parts held by the query still reference it via a bare `const MergeTreeData & storage`. Accessing this dangling reference causes a heap-use-after-free.

The fix captures the `ContextPtr` and `MergeTreeSettingsPtr` during read pool construction (when the storage is guaranteed to be alive) and passes them via `MergeTreeReadTask::Extras` to avoid accessing `data_part->storage` during query execution.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=022e467a014b9b3f1fc993735bc4f6b3974e59a0&name_0=MasterCI&name_1=Stress%20test%20%28arm_asan%2C%20s3%29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-03-14T02:33:06Z

Workflow [PR], commit [6cf20e8]

Summary: ✅

AI Review

Summary

This PR fixes a heap-use-after-free in MergeTree read teardown when enable_shared_storage_snapshot_in_query=0 by preserving storage lifetime in stripped snapshots instead of dropping all snapshot data. It also adds targeted regression coverage, including the projection-index path that dereferences data_part->storage during reader creation. Based on the current diff, I did not find new high-confidence correctness, safety, or performance issues to block merge.

ClickHouse Rules

Item	Status	Notes
Deletion logging	➖
Serialization versioning	➖
Core-area scrutiny	✅
No test removal	✅
Experimental gate	➖
No magic constants	✅
Backward compatibility	✅
`SettingsChangesHistory.cpp`	➖
PR metadata quality	✅
Safe rollout	✅
Compilation time	✅

Final Verdict

Status: ✅ Approve

…uction

The `MergeTreeReadPoolBase` accesses storage through `data_part->storage` (a bare reference) during query execution. If the storage is destroyed concurrently (e.g. by `DatabaseCatalog::dropTablesParallel`), this becomes a dangling reference, causing a heap-use-after-free.

The query pipeline normally holds a `StoragePtr` via `QueryPlanResourceHolder`, but certain code paths (e.g. backup restore) may not set this up correctly, allowing the storage to be freed.

The fix stores a `ConstStoragePtr` (via `shared_from_this`) in `MergeTreeReadPoolBase` during construction, guaranteeing the storage outlives the read pool regardless of the pipeline's resource management.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=022e467a014b9b3f1fc993735bc4f6b3974e59a0&name_0=MasterCI&name_1=Stress%20test%20%28arm_asan%2C%20s3%29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Move owned_data_storage before parts_ranges and per_part_infos to ensure it is destroyed after them. Part destructors call clearCaches() -> storage.getContext(), which requires the storage to still be alive. With the previous ordering, owned_data_storage could be destroyed before parts_ranges, causing use-after-free in part destructors.

This addresses both crash patterns found in CI:
- Pattern 1 (3 occurrences): LoadedMergeTreeDataPartInfoForReader constructor accessing storage.getContext() during query execution
- Pattern 2 (16 occurrences): IMergeTreeDataPart::clearCaches() accessing storage.getContext() during part destruction in the read pool destructor

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

alexey-milovidov · 2026-03-14T04:38:14Z

Root cause analysis

Querying checks.test_context_raw on ClickHouse Playground reveals 19 occurrences of the same root cause in the last 90 days across two distinct crash patterns:

Pattern 1 — LoadedMergeTreeDataPartInfoForReader constructor (3 occurrences): During query execution, createReaders constructs LoadedMergeTreeDataPartInfoForReader which calls data_part->storage.getContext() on a freed storage.

Pattern 2 — IMergeTreeDataPart::clearCaches during part destruction (16 occurrences): When parts are destroyed, their destructors call clearCaches() → storage.getContext() on a freed storage. This is the dominant pattern.

Destruction chain (from CI ASan traces)

Both the "freed by" and the crash happen on the same thread — this is not a concurrency race, it's a sequential destruction order issue.

The ASan traces from stateless tests (distributed plan) show the crash happens inside SnapshotData::~SnapshotData, where SnapshotData::parts (RangesInDataPartsPtr, a shared_ptr) releases the last reference to parts:

StorageSnapshot::~StorageSnapshot
  → SnapshotData::~SnapshotData
    → ~shared_ptr<RangesInDataParts>     ← parts destroyed HERE
      → ~RangesInDataPart → ~MergeTreeDataPartCompact
        → clearCaches() → storage.getContext()   ← CRASH

And the "freed by" trace (same thread, earlier in the chain) shows the storage was freed by a different SnapshotData:

SnapshotData::~SnapshotData
  → ~shared_ptr<IStorage const>          ← storage freed

Root cause

There are two StorageSnapshot objects with different SnapshotData instances. One has SnapshotData::storage set (holds ConstStoragePtr), the other has it null (created via getStorageSnapshotWithoutData at MergeTreeData.cpp:10483-10484, which default-initializes SnapshotData without calling shared_from_this()). But both can share the same RangesInDataPartsPtr parts (since it's a shared_ptr).

During destruction on the same thread:

1. SnapshotData A destroyed: parts released (shared, refcount still > 0), then storage released → storage freed (was the last ConstStoragePtr)
2. SnapshotData B destroyed: parts released (last ref) → parts destroyed → clearCaches() → storage.getContext() → CRASH (storage already freed)
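
The two-step sequence above can be reproduced safely with a small model. All names here (`Storage`, `Part`, `SnapshotData`, `partSawLiveStorage`) are hypothetical stand-ins, and the member order of `SnapshotData` is an assumption chosen to match the release order described above (parts first, then storage). To stay well-defined, the part's destructor does not dereference the storage; it only reads a flag telling whether the storage still exists.

```cpp
#include <cassert>
#include <memory>

// Hypothetical storage: flips an external flag while it is alive.
struct Storage
{
    bool & alive;
    explicit Storage(bool & alive_) : alive(alive_) { alive = true; }
    ~Storage() { alive = false; }
};

// Hypothetical part: on destruction it records whether the storage still existed.
// The real destructor would dereference the storage at this point.
struct Part
{
    const bool & storage_alive;
    bool & saw_live_storage;
    Part(const bool & storage_alive_, bool & saw_live_storage_)
        : storage_alive(storage_alive_), saw_live_storage(saw_live_storage_) {}
    ~Part() { saw_live_storage = storage_alive; }
};

// Assumed member order: storage first, parts second,
// so destruction releases parts first and storage second.
struct SnapshotData
{
    std::shared_ptr<const Storage> storage; // may be null
    std::shared_ptr<Part> parts;
};

// Returns true if the part was destroyed while the storage was still alive.
bool partSawLiveStorage(bool second_snapshot_holds_storage)
{
    bool alive = false;
    bool saw_live_storage = false;

    auto storage = std::make_shared<const Storage>(alive);
    auto parts = std::make_shared<Part>(alive, saw_live_storage);
    assert(alive);

    auto a = std::make_unique<SnapshotData>();
    a->storage = storage;
    a->parts = parts;

    auto b = std::make_unique<SnapshotData>();
    if (second_snapshot_holds_storage)
        b->storage = storage;
    b->parts = parts;

    // Drop the local references so only the snapshots own the objects.
    storage.reset();
    parts.reset();

    a.reset(); // step 1: parts released (B still holds them), then storage released
    b.reset(); // step 2: last reference to parts, so ~Part runs here
    return saw_live_storage;
}
```

With `partSawLiveStorage(false)` the part is destroyed after the storage, which is the situation the ASan traces describe; with `partSawLiveStorage(true)` the second snapshot keeps the storage alive until its own parts are gone.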

Fix in this PR

The fix stores a ConstStoragePtr (via shared_from_this) as the first member of MergeTreeReadPoolBase, ensuring it is destroyed last — after all data parts held by the pool. This guarantees the storage outlives all parts regardless of the snapshot lifecycle or pipeline destruction order.

A deeper fix would be to always set SnapshotData::storage in getStorageSnapshotWithoutData as well, but that has a broader scope.

Preserve `SnapshotData::storage` when `ReadFromMergeTree` removes parts from the snapshot, and make `MergeTreeReadPoolBase` retain the actual data storage instead of blindly retaining `storage_snapshot->storage`. This covers wrapper storages such as `StorageFromMergeTreeProjection` and prevents `IMergeTreeDataPart::clearCaches` from observing dangling `data_part->storage` during read-pool teardown.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=022e467a014b9b3f1fc993735bc4f6b3974e59a0&name_0=MasterCI&name_1=Stress%20test%20%28arm_asan%2C%20s3%29

alexey-milovidov · 2026-03-14T05:25:51Z

Further investigation by Codex GPT 5.4

I found the missing explanation for Mechanism B.

ReadFromMergeTree::initializePipeline can execute merge_tree_enable_remove_parts_from_snapshot_optimization and replace storage_snapshot->data with a new MergeTreeData::SnapshotData. That destroys the original SnapshotData::storage before the read pools are torn down. So the pool can still hold a live StorageSnapshotPtr, but its data payload no longer keeps the underlying MergeTreeData alive, and later part teardown reaches clearCaches -> data_part->storage.getContext() with a dangling storage.

I pushed commit b14fa7f4a10 with two hardenings:

preserve SnapshotData::storage when stripping SnapshotData::parts in ReadFromMergeTree;
make MergeTreeReadPoolBase retain MergeTreeData::SnapshotData::storage or the parts' own storage, not blindly storage_snapshot->storage.

The second point matters for wrapper storages such as StorageFromMergeTreeProjection, where storage_snapshot->storage is not the same object as data_part->storage.
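
The first hardening (keeping the storage pointer when the snapshot payload is replaced) might look roughly like the sketch below. This is not the code from the commit: `Storage`, `Parts`, `SnapshotData`, `StorageSnapshot`, and both `stripParts...` functions are hypothetical, simplified stand-ins used only to show the difference between replacing the payload wholesale and carrying the storage pointer over.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct Storage {};               // hypothetical stand-in for the table storage
using Parts = std::vector<int>;  // hypothetical stand-in for the list of parts

struct SnapshotData
{
    std::shared_ptr<const Storage> storage;
    std::shared_ptr<Parts> parts;
};

struct StorageSnapshot
{
    std::unique_ptr<SnapshotData> data;
};

// Problematic variant: the new payload is default-initialized,
// so the old payload's storage pointer is released.
void stripPartsDroppingStorage(StorageSnapshot & snapshot)
{
    snapshot.data = std::make_unique<SnapshotData>();
}

// Hardened variant: parts are dropped, the storage pointer is carried over.
void stripPartsKeepingStorage(StorageSnapshot & snapshot)
{
    auto stripped = std::make_unique<SnapshotData>();
    stripped->storage = snapshot.data->storage;
    snapshot.data = std::move(stripped);
}

// Returns true if the storage is still alive after the snapshot was stripped.
bool storageSurvivesStrip(bool preserve_storage)
{
    auto storage = std::make_shared<const Storage>();
    std::weak_ptr<const Storage> watcher = storage;

    StorageSnapshot snapshot;
    snapshot.data = std::make_unique<SnapshotData>();
    snapshot.data->storage = std::move(storage); // the snapshot is now the only owner
    snapshot.data->parts = std::make_shared<Parts>();
    assert(!watcher.expired());

    if (preserve_storage)
        stripPartsKeepingStorage(snapshot);
    else
        stripPartsDroppingStorage(snapshot);

    return !watcher.expired();
}
```

In this model `storageSurvivesStrip(false)` reports that the storage is gone as soon as the payload is replaced, while `storageSurvivesStrip(true)` reports that it is still alive for as long as the snapshot exists.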

I also ran direct -fsyntax-only compiles for ReadFromMergeTree.cpp and MergeTreeReadPoolBase.cpp; both passed.

The test was waiting for `MergeTreeReadersChain` to appear in the `--server_logs_file`, but `WriteBufferFromFile` uses a 1 MB buffer (`DBMS_DEFAULT_BUFFER_SIZE`) that is only flushed when the query finishes or the buffer fills up. So `grep` on the file while the query is running never sees the text, and the test times out.

Replace the log-file-based synchronization with a check on `system.processes.read_rows > 0`, which reliably detects that the query has started reading (and therefore has created readers that hold data parts).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-03-16T03:30:12Z

+        --enable_shared_storage_snapshot_in_query 0 \
+        --merge_tree_read_split_ranges_into_intersecting_and_non_intersecting_injection_probability 0 \
+        --query "
+        SELECT sleepEachRow(0.01)


⚠️ This regression test validates the regular read-pool teardown path, but it does not force the projection-index read-pool path where MergeTreeReadTask::createReaders still dereferences data_part->storage.getSettings(). Please add/adjust coverage to exercise projection-index reads under teardown race as well.

Exercise `MergeTreeReadPoolProjectionIndex` during teardown to cover the `data_part->storage.getSettings()` dereference in `createReaders` on projection parts.

#99483 (review)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…teardown_race`

The test checked for unexpected exceptions by grepping for "Exception" in stderr, then excluding lines containing "QUERY_WAS_CANCELLED". However, the `KILL QUERY` stack trace includes lines like `DB::Exception::Exception` that do not contain the string "QUERY_WAS_CANCELLED", causing false positives.

Use `grep -F "Code:"` instead, which matches actual error message lines (e.g. `Code: 394. DB::Exception: ...`) rather than stack trace class names.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-03-16T05:42:59Z

@@ -0,0 +1,2 @@
+- `ReadFromMergeTree::initializePipeline` can replace `storage_snapshot->data` during `merge_tree_enable_remove_parts_from_snapshot_optimization`; when stripping `SnapshotData::parts`, preserve `SnapshotData::storage` or `IMergeTreeDataPart::clearCaches` can observe a dangling `data_part->storage`.


⚠️ This file looks like local reviewer/assistant state rather than product code. Please remove .claude/learnings.md from the PR before merge to avoid committing tool-specific internal notes into the repository history.

This looks related

azat

The part in ReadFromMergeTree looks good, until #95612 is resolved.

azat · 2026-03-16T15:11:51Z

@@ -0,0 +1,2 @@
+- `ReadFromMergeTree::initializePipeline` can replace `storage_snapshot->data` during `merge_tree_enable_remove_parts_from_snapshot_optimization`; when stripping `SnapshotData::parts`, preserve `SnapshotData::storage` or `IMergeTreeDataPart::clearCaches` can observe a dangling `data_part->storage`.
+- `MergeTreeReadPoolBase` must retain the storage referenced by `MergeTreeData::SnapshotData::storage` or the parts themselves, not blindly `storage_snapshot->storage`, because wrappers such as `StorageFromMergeTreeProjection` are not the same object as `data_part->storage`.


StorageFromMergeTreeProjection has parent_storage which points to the MergeTreeData

azat · 2026-03-16T15:14:39Z

+namespace
+{
+
+ConstStoragePtr getOwnedDataStorage(const StorageSnapshotPtr & storage_snapshot, const RangesInDataParts * parts_ranges = nullptr)


This looks hacky, are you sure that we need this, after the fix for enable_shared_storage_snapshot_in_query == false?

azat · 2026-03-16T15:15:05Z

@@ -0,0 +1,2 @@
+- `ReadFromMergeTree::initializePipeline` can replace `storage_snapshot->data` during `merge_tree_enable_remove_parts_from_snapshot_optimization`; when stripping `SnapshotData::parts`, preserve `SnapshotData::storage` or `IMergeTreeDataPart::clearCaches` can observe a dangling `data_part->storage`.


This looks related

The `ReadFromMergeTree.cpp` fix already preserves `SnapshotData::storage` when stripping the snapshot, so the storage is kept alive via `storage_snapshot` which outlives `parts_ranges` and `per_part_infos` due to member declaration order. The extra `owned_data_storage` member is redundant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

alexey-milovidov · 2026-03-16T22:19:37Z

Ok. Now the change is minimal. And it will be amazing if we finally nail it!

…free-merge-tree-read-task

The test used `today()` for the non-expired row, but with randomized `session_timezone` settings (e.g. UTC-7), the Date value stored can be yesterday in UTC terms, making the TTL `age + 1 day` expire when the server evaluates it in UTC. Use a far-future date instead.

https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=99483&sha=bc5fcd5658a549c3eb59b2d9c27fb4b9b2fc966c&name_0=PR

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

clickhouse-gh · 2026-03-24T18:48:40Z

LLVM Coverage Report

Metric	Baseline	Current	Δ
Lines	84.10%	84.10%	+0.00%
Functions	24.50%	24.50%	+0.00%
Branches	76.60%	76.70%	+0.10%

PR changed-lines coverage: 100.00% (8/8, 0 noise lines excluded)
Diff coverage report
Uncovered code

…::createReaders

Backport #99483 to 26.3: Fix heap-use-after-free in MergeTreeReadTask::createReaders

Backport #99483 to 26.1: Fix heap-use-after-free in MergeTreeReadTask::createReaders

Backport #99483 to 26.2: Fix heap-use-after-free in MergeTreeReadTask::createReaders

…r-free-merge-tree-read-task Fix heap-use-after-free in MergeTreeReadTask::createReaders Signed-off-by: Ilya Golshtein <igolshtein@altinity.com>

clickhouse-gh Bot reviewed Mar 14, 2026

Comment thread src/Storages/MergeTree/MergeTreeReadPoolBase.cpp Outdated

alexey-milovidov and others added 2 commits March 13, 2026 20:17

alexey-milovidov requested a review from azat March 14, 2026 04:39

Add ReadFromMergeTree snapshot teardown race test

8374123

Algunenano added the pr-must-backport and pr-critical-bugfix labels Mar 15, 2026

alexey-milovidov and others added 2 commits March 16, 2026 02:43

Merge master into fix-heap-use-after-free-merge-tree-read-task

f322e2b

clickhouse-gh Bot reviewed Mar 16, 2026

alexey-milovidov and others added 2 commits March 16, 2026 03:40

clickhouse-gh Bot reviewed Mar 16, 2026

azat self-assigned this Mar 16, 2026

azat reviewed Mar 16, 2026

alexey-milovidov and others added 2 commits March 17, 2026 00:19

Merge remote-tracking branch 'origin/master' into fix-heap-use-after-…

a30bfc7

…free-merge-tree-read-task

azat approved these changes Mar 17, 2026

alexey-milovidov added 2 commits March 17, 2026 06:31

Merge branch 'master' into fix-heap-use-after-free-merge-tree-read-task

c8aa1a2

Merge branch 'master' into fix-heap-use-after-free-merge-tree-read-task

4092f6f

alexey-milovidov mentioned this pull request Mar 18, 2026

Fix broken docker pull retry in integration tests #99828

Merged

1 task

alexey-milovidov added 2 commits March 18, 2026 11:54

Merge master into fix-heap-use-after-free-merge-tree-read-task

adea655

Merge master into fix-heap-use-after-free-merge-tree-read-task

38b352a

alexey-milovidov merged commit 9f827a6 into master Mar 26, 2026
151 of 152 checks passed

alexey-milovidov deleted the fix-heap-use-after-free-merge-tree-read-task branch March 26, 2026 22:55

robot-clickhouse added the pr-synced-to-cloud and pr-must-backport-synced labels Mar 26, 2026

robot-ch-test-poll1 mentioned this pull request Mar 26, 2026

Cherry pick #99483 to 25.8: Fix heap-use-after-free in MergeTreeReadTask::createReaders #100850

Merged

robot-clickhouse added a commit that referenced this pull request Mar 26, 2026

Backport #99483 to 25.8: Fix heap-use-after-free in MergeTreeReadTask…

80148e4

…::createReaders

This was referenced Mar 26, 2026

Backport #99483 to 25.8: Fix heap-use-after-free in MergeTreeReadTask::createReaders #100851

Closed

Cherry pick #99483 to 26.1: Fix heap-use-after-free in MergeTreeReadTask::createReaders #100852

Merged

robot-clickhouse added a commit that referenced this pull request Mar 26, 2026

Backport #99483 to 26.1: Fix heap-use-after-free in MergeTreeReadTask…

3271db5

…::createReaders

This was referenced Mar 26, 2026

Backport #99483 to 26.1: Fix heap-use-after-free in MergeTreeReadTask::createReaders #100853

Merged

Cherry pick #99483 to 26.2: Fix heap-use-after-free in MergeTreeReadTask::createReaders #100854

Merged

robot-ch-test-poll1 mentioned this pull request Mar 26, 2026

Backport #99483 to 26.2: Fix heap-use-after-free in MergeTreeReadTask::createReaders #100855

Merged

robot-clickhouse added a commit that referenced this pull request Mar 26, 2026

Backport #99483 to 26.2: Fix heap-use-after-free in MergeTreeReadTask…

462f13b

…::createReaders

robot-ch-test-poll1 mentioned this pull request Mar 26, 2026

Cherry pick #99483 to 26.3: Fix heap-use-after-free in MergeTreeReadTask::createReaders #100856

Merged

robot-clickhouse added a commit that referenced this pull request Mar 26, 2026

Backport #99483 to 26.3: Fix heap-use-after-free in MergeTreeReadTask…

469636f

…::createReaders

robot-ch-test-poll1 mentioned this pull request Mar 26, 2026

Backport #99483 to 26.3: Fix heap-use-after-free in MergeTreeReadTask::createReaders #100857

Merged

robot-ch-test-poll1 added the pr-backports-created label Mar 27, 2026

clickhouse-gh Bot added a commit that referenced this pull request Mar 27, 2026

Merge pull request #100857 from ClickHouse/backport/26.3/99483

c152d91

Backport #99483 to 26.3: Fix heap-use-after-free in MergeTreeReadTask::createReaders

azat mentioned this pull request Mar 27, 2026

Remove ConstStoragePtr from the MergeTreeData::SnapshotData #95612

Open

nikitamikhaylov added a commit that referenced this pull request Mar 31, 2026

Merge pull request #100853 from ClickHouse/backport/26.1/99483

4f6e92f

Backport #99483 to 26.1: Fix heap-use-after-free in MergeTreeReadTask::createReaders

nikitamikhaylov added a commit that referenced this pull request Mar 31, 2026

Merge pull request #100855 from ClickHouse/backport/26.2/99483

2c8425a

Backport #99483 to 26.2: Fix heap-use-after-free in MergeTreeReadTask::createReaders

groeneai mentioned this pull request Apr 1, 2026

Fix 04039_merge_tree_snapshot_teardown_race on release branches #101510

Closed
