Skip to content

Share S3 client cache per bucket#96802

Open
zvonand wants to merge 17 commits intoClickHouse:masterfrom
zvonand:fix/92482
Open

Share S3 client cache per bucket#96802
zvonand wants to merge 17 commits intoClickHouse:masterfrom
zvonand:fix/92482

Conversation

@zvonand
Copy link
Copy Markdown
Contributor

@zvonand zvonand commented Feb 13, 2026

Closes #92482

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Share S3 ClientCache per bucket, reduce repeated region discovery.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@scanhex12 scanhex12 added the can be tested Allows running workflows for external contributors label Feb 13, 2026
@scanhex12 scanhex12 self-assigned this Feb 13, 2026
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Feb 13, 2026

Workflow [PR], commit [1f1d146]

Summary:

job_name test_name status info comment
Stateless tests (amd_asan_ubsan, flaky check) failure
03668_shard_join_in_reverse_order FAIL cidb, issue
03668_shard_join_in_reverse_order FAIL cidb, issue
Stateless tests (amd_tsan, flaky check) failure
03668_shard_join_in_reverse_order FAIL cidb, issue
Stateless tests (amd_msan, flaky check) failure
03668_shard_join_in_reverse_order FAIL cidb, issue
Stateless tests (amd_debug, flaky check) failure
03668_shard_join_in_reverse_order FAIL cidb, issue
03668_shard_join_in_reverse_order FAIL cidb, issue
03668_shard_join_in_reverse_order FAIL cidb, issue
03668_shard_join_in_reverse_order FAIL cidb, issue
03668_shard_join_in_reverse_order FAIL cidb, issue
Stress test (arm_msan) failure
Server died FAIL cidb
MemorySanitizer: use-of-uninitialized-value (STID: 1003-358c) FAIL cidb, issue

AI Review

Summary

This PR introduces shared S3::ClientCache reuse per (endpoint, bucket) for the disk S3 path to avoid repeated region discovery and redundant cache warmup across clients. I reviewed the changed C++ logic (ClientCacheRegistry, Client cloning/cache ownership, S3 disk client wiring, and new gtests) and did not find new correctness, concurrency, safety, or compatibility blockers in the current diff.

Missing context
  • ⚠️ Full CI has not finished yet (Code Review, Fast test, and Build (arm_tidy) were still in progress during review), so no final signal from those jobs was available.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality
Safe rollout
Compilation time
Final Verdict
  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh bot added the pr-performance Pull request with some performance improvements label Feb 13, 2026
@zvonand
Copy link
Copy Markdown
Contributor Author

zvonand commented Feb 16, 2026

The only fail is CH Inc sync (nothing I can do about it)

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good (except the string concatenation hackery, which must be fixed). But is it true that the same client can be used across the bucket? I think there could be corner cases when different IAM roles are authorised to different parts of the bucket.

@alexey-milovidov alexey-milovidov self-assigned this Mar 22, 2026
@zvonand

This comment was marked as outdated.

@zvonand
Copy link
Copy Markdown
Contributor Author

zvonand commented Mar 22, 2026

After refreshing this in mind: the clients themselves are not shared. The only thing that is shared is ClientCache; only region and URI. Credentials are not involved here.

@alexey-milovidov
Copy link
Copy Markdown
Member

@zvonand, sounds good! Please address review, fix failures, and let's merge.

@zvonand

This comment was marked as outdated.

@zvonand
Copy link
Copy Markdown
Contributor Author

zvonand commented Mar 26, 2026

Reviews are addressed.
Stateless fails are not related and have issues opened. Unit tests are failing in latest commits in master in a similar way.

@alexey-milovidov

@alexey-milovidov
Copy link
Copy Markdown
Member

Please fix failed tests.

@zvonand

This comment was marked as outdated.

@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh bot commented Apr 1, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.00% -0.10%
Functions 90.90% 90.90% +0.00%
Branches 76.60% 76.50% -0.10%

Changed lines: 88.32% (189/214) | lost baseline coverage: 12 line(s) · Uncovered code

Full report · Diff report

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

can be tested Allows running workflows for external contributors pr-performance Pull request with some performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

massive S3 request can run suboptimal

4 participants