Iceberg: support external paths in tables#90740
zvonand wants to merge 56 commits into ClickHouse:master
Conversation
Workflow [PR], commit [2863324] Summary: ❌

AI Review Summary: This PR enables Iceberg to read metadata/data/delete files from absolute paths and external storages, including distributed execution support. The main remaining concern is a server-side local file access bypass: Missing context

Findings
- ❌ Blockers
- Tests
- ClickHouse Rules
- Performance & Safety
- Final Verdict
Code works and the test works, but it looks like the test itself is flaky: https://s3.amazonaws.com/clickhouse-test-reports/json.html?REF=master&sha=b3f2b27c84f775077c1249dc87192bec58b4982a&name_0=MasterCI&name_1=Integration%20tests%20%28amd_tsan%2C%203%2F6%29&name_1=Integration%20tests%20%28amd_tsan%2C%203%2F6%29 PR with fix: #91468
```cpp
/// Object metadata: size, modification time, etc.
std::optional<ObjectMetadata> metadata;
std::optional<String> absolute_path;
ObjectStoragePtr object_storage_to_use = nullptr;
```
I do not like this idea.
This feature is related to Iceberg, but it brings complexity to the interface, which makes other implementations overcomplicated.
Let's make the code in Iceberg resolve external links.
I think it is ok in this case.

> brings complexity to the interface

- `PathWithMetadata` is not part of the `IObjectStorage` interface at all -- it's a standalone helper struct. It can even be moved to a standalone header and impl. This structure is perfect for this logic. Adding an optional field doesn't complicate it.
- The alternative would be way messier -- juggling `<path, storage>` tuples everywhere, doing repeated lookups for every file, introducing some cache that resolves a path into an object storage and key, or introducing a struct that is almost identical and serves the same purpose. None of these would make it less complicated. Encapsulating it here is clean and natural.
- The new logic in this struct is completely optional -- in other places (where we don't need it) it can be used the old way; existing code is unaffected.
It looks like you know what to do: p. 2.
My opinion is that 1k rows for a particular feature in a designated place is much better than 1k rows scattered everywhere that you have to ignore in most cases.
Of course, but here we do not have that "ideal" case: the intersection of Iceberg with generic object storage logic and other server logic is large, and it is not always encapsulated. There are already many places in the code where we have a generic workflow but specific branching for Iceberg, e.g. like this simple one. The Iceberg module does not work with object storages and data files at the lower level; that logic remains outside Iceberg.
The current code is actually quite isolated (not 100%, but maybe 80%). Most of the changes outside Iceberg territory are the renaming RelativePathWithMetadata -> PathWithMetadata (and the corresponding variables), which does not in any way worsen the code itself. There is also passing additional variables to methods. For now they are only used in the Iceberg workflow, but if at some point we need similar multi-storage data lake logic somewhere else, they will be useful (I do not know whether that will happen, but it may). Sometimes these arguments are passed and checked in an if, which does not affect other code that does not pass them.
tl;dr: Sure, this code can be isolated better. That would involve a lot of changes to the current structure, but I am not sure it is justified. The Iceberg module heavily delegates actual data reading and interaction with object storages outside (which is IMO totally right), to our usual reading-from-object-storage classes, iterators and methods.
Pull request overview
This PR enables Iceberg tables to support files stored outside the table's primary location, including on different storage types (e.g., metadata on S3, data files on local storage). The implementation abandons the previous constraint that all table files must be within the same storage location.
Key changes:

- Added support for resolving absolute file paths to appropriate storage backends
- Introduced `SecondaryStorages` mechanism to manage multiple storage backends per table
- Extended metadata structures to track absolute paths alongside relative paths
Reviewed changes
Copilot reviewed 69 out of 70 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `src/Storages/ObjectStorage/Utils.h/cpp` | Core logic for resolving paths to storage backends, including `resolveObjectStorageForPath()` and `SchemeAuthorityKey` |
| `src/Storages/ObjectStorage/DataLakes/Iceberg/*.h/cpp` | Updated Iceberg components to use absolute paths and secondary storages |
| `src/Storages/ObjectStorage/StorageObjectStorageSource.cpp` | Modified source to resolve and use secondary storages for reading |
| `src/Disks/DiskObjectStorage/ObjectStorages/IObjectStorage.h` | Renamed `RelativePathWithMetadata` to `PathWithMetadata`, added absolute path support |
| `src/Core/ProtocolDefines.h` | Added protocol version for Iceberg absolute paths in cluster processing |
| `tests/integration/test_storage_iceberg_multistorage/` | New integration tests for multi-storage scenarios |
| `tests/queries/0_stateless/data_minio/*/metadata/v1.metadata.json` | Updated test fixtures to reflect corrected table locations |
```cpp
/// Object metadata: size, modification time, etc.
std::optional<ObjectMetadata> metadata;
std::optional<String> absolute_path;
ObjectStoragePtr object_storage_to_use = nullptr;
```

```cpp
class IObjectStorage;

/// Thread-safe wrapper for secondary object storages map
struct SecondaryStorages
```
Secondary sounds strange in this case. How do you determine whether an object storage is primary or secondary?
And why don't we need to put the "primary" storage into this map with the secondary storages?
Because of the logic itself:
We have our "main" object storage that we read from by default. But it may happen that we stumble upon a file that belongs to another storage -- that storage will then be secondary.
Please fix the conflict and merge master to make the stress test happy (it was fixed)
```cpp
/// Without this check, metadata could drive reads from arbitrary local paths.
if (target_scheme_normalized == "file")
{
    ensure_local_path_inside_user_files(target_decomposed.key);
```
`file://` paths are currently constrained to `user_files`, but this check can be bypassed through symlinks: `ensure_local_path_inside_user_files` validates the target with `lexically_normal`, while the created `LocalObjectStorage` later reads `remote_path` directly (`createReadBufferFromFileBase`) without resolving symlinks.

A concrete trace:

- `user_files = /var/lib/clickhouse/user_files`.
- Metadata path = `file:///var/lib/clickhouse/user_files/link/secret.parquet`, where `link -> /etc`.
- `ensure_local_path_inside_user_files` sees the lexical path `/var/lib/clickhouse/user_files/link/secret.parquet` and allows it.
- `LocalObjectStorage::readObject` opens `/var/lib/clickhouse/user_files/link/secret.parquet`, which dereferences to `/etc/secret.parquet`.

So untrusted metadata can still escape `user_files` via symlink traversal.

Can we harden `resolveObjectStorageForPath` by validating with canonicalized paths (resolving symlinks), or by rejecting any path component that is a symlink before opening?
This will be secured in #99405; no need to reinvent the wheel here.
LLVM Coverage Report
Changed lines: 52.53% (644/1226) · Uncovered code

Closes #84609
The current logic is that all table files (data files, manifests, manifest lists) must be "inside" the table location, in the same storage. This PR abandons that logic: now files can be located anywhere, even on a different storage type (e.g. all metadata on S3, data files on local storage).

In some cases the old logic backfires: even incorrect (non-existing) paths, i.e. ones with redundant items in a prefix, were parsed in such a way that files from other locations were read (`/path/to/item` was read when the specified location was `/path/to/to/item`, or vice versa).

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Support Iceberg tables that have files outside table location or on different storage.
Documentation entry for user-facing changes
Note
High Risk
Touches Iceberg metadata iteration/compaction/expiry and distributed cluster-function serialization, plus adds dynamic secondary storage creation and optional credential propagation; regressions could lead to wrong files being read or deleted across storages/endpoints.
Overview

Enables Iceberg tables to reference absolute paths and files outside the table location, including files stored on different object storage backends (S3/Azure/HDFS/local). Iceberg manifest/manifest-list handling is updated to treat metadata paths as absolute URIs, resolve them at read/delete time via new object-storage utilities, and carry resolved storage info through iterators/transforms/compaction/expire-snapshots.

Bumps the cluster-function processing protocol to add Iceberg absolute-path support and extends `IcebergObjectSerializableInfo` to transmit both the metadata path and the absolute path (with a coordinator-side guard for old workers when external storage is required). Also adds `_path`/task distribution handling to prefer absolute paths when available.

Introduces new object-storage helpers (`SchemeAuthorityKey`, `makeAbsolutePath`, `resolveObjectStorageForPath`, `SecondaryStorages`) and a setting `s3_propagate_credentials_to_other_storages` to optionally copy base S3 credentials when creating secondary storages. Adds an integration test covering multi-storage Iceberg layouts and updates existing test metadata paths accordingly.

Written by Cursor Bugbot for commit 66d1acc. This will update automatically on new commits.