S3Queue ordered mode support more generic partitioning#94321
Merged
Workflow [PR], commit [87a3902] Summary: ❌
kssenii
reviewed
Jan 15, 2026
src/Storages/ObjectStorageQueue/ObjectStorageQueueOrderedFileMetadata.cpp
src/Storages/ObjectStorageQueue/ObjectStorageQueueOrderedFileMetadata.cpp
src/Storages/ObjectStorageQueue/ObjectStorageQueueFilenameParser.cpp
src/Storages/ObjectStorageQueue/ObjectStorageQueueFilenameParser.cpp
f9a4a74 to ef6c838
93ca7ae to 0cb3a4c
0cb3a4c to 252d8ac
kssenii
reviewed
Jan 16, 2026
kssenii
reviewed
Jan 16, 2026
src/Storages/ObjectStorageQueue/ObjectStorageQueueOrderedFileMetadata.cpp
src/Storages/ObjectStorageQueue/ObjectStorageQueueFilenameParser.cpp
src/Storages/ObjectStorageQueue/ObjectStorageQueueFilenameParser.cpp
src/Storages/ObjectStorageQueue/ObjectStorageQueueFilenameParser.cpp
11596ea to ffefba4
ffefba4 to 2514221
kssenii
approved these changes
Jan 19, 2026
src/Storages/ObjectStorageQueue/ObjectStorageQueueOrderedFileMetadata.cpp
…etadata.cpp Co-authored-by: Kseniia Sumarokova <kssenii@clickhouse.com>
Test failures:
Follows #81040. This PR implements a more generic way to support partition tracking in S3Queue ordered mode. It introduces a new enum with the following values:
- `none`: no dedicated partition tracking (the max seen file is tracked per bucket only).
- `hive`: for the new Hive-style `key=value` pairs; maintains the max seen filename per bucket and per `key=value` pair.
- `regex`: added in this PR so that the user can specify a flexible partition key derived from the filename.

For `regex`, specify `partition_regex`, a `re2` regular expression with named capture groups, and `partition_component`, which names the capture group to use for partitioning. For example:

- Filename: `server-1_20251217T100000.000000Z_0001.csv`
- `partition_regex`: `r'(?P<hostname>[^_]+)_(?P<timestamp>\d{8}T\d{6}\.\d{6}Z)_(?P<sequence>\d+)'`
- `partition_component`: `hostname`

In the above example, partitioning per bucket is done on `hostname`.

Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):
Support more generic partitioning for S3Queue ordered mode.
Documentation entry for user-facing changes
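
To illustrate the `regex` partitioning mode described above, here is a minimal Python sketch, not the ClickHouse implementation. It uses Python's `re` module, which accepts the same `(?P<name>...)` named-group syntax as the example pattern, to extract the `hostname` component and track the max seen filename per partition. The names `extract_partition_key` and `MaxSeenTracker` are hypothetical, bucketing is left out, and both the handling of non-matching filenames (grouped under `None`) and the plain string comparison of filenames are assumptions made for the illustration.

```python
import re

# Pattern and component name are taken from the PR description example.
PARTITION_REGEX = re.compile(
    r"(?P<hostname>[^_]+)_(?P<timestamp>\d{8}T\d{6}\.\d{6}Z)_(?P<sequence>\d+)"
)
PARTITION_COMPONENT = "hostname"


def extract_partition_key(filename, pattern=PARTITION_REGEX,
                          component=PARTITION_COMPONENT):
    """Return the value of the named capture group, or None if no match.

    Returning None for non-matching names is an assumption of this sketch.
    """
    match = pattern.search(filename)
    if match is None:
        return None
    return match.group(component)


class MaxSeenTracker:
    """Hypothetical stand-in for per-partition max-seen-file tracking."""

    def __init__(self):
        # partition key -> lexicographically largest filename seen so far
        self._max_seen = {}

    def should_process(self, filename):
        """Accept a file only if it sorts after the max seen in its partition."""
        key = extract_partition_key(filename)
        last = self._max_seen.get(key)
        if last is not None and filename <= last:
            return False
        self._max_seen[key] = filename
        return True
```

With this sketch, a file from `server-2` is still accepted after a later-timestamped file from `server-1` has been seen, because each hostname keeps its own max seen filename.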