
[GLUTEN-7028][CH][Part-10] Collecting Delta stats for MergeTree#8029

Merged
baibaichen merged 11 commits into apache:main from baibaichen:feature/delta-stats
Dec 20, 2024

Contributor

@baibaichen baibaichen commented Nov 24, 2024

What changes were proposed in this pull request?

(Fixes: #7028)

Similar to #7993, this PR supports collecting stats for MergeTree. I removed the old compaction algorithm and replaced it with Delta's implementation for two reasons:

  1. Since https://delta.io/blog/delta-lake-3-2/, Delta's optimize command supports Liquid clustering.
  2. It's impossible to collect stats with the old compaction.
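
For context, here is a minimal sketch of what per-file stats look like in the Delta log, assuming the standard fields of the `stats` JSON on an `add` action (`numRecords`, `minValues`, `maxValues`, `nullCount`). It is illustrative Python and not the code in this PR; `collect_file_stats` and the sample rows are hypothetical names made up for the example.

```python
import json
from typing import Any, Dict, List, Optional


def collect_file_stats(rows: List[Dict[str, Optional[Any]]]) -> str:
    """Build a Delta-style per-file stats JSON string from in-memory rows.

    Hypothetical helper: a real writer would compute these values while
    writing the data file instead of scanning rows afterwards.
    """
    columns = sorted({name for row in rows for name in row})
    min_values: Dict[str, Any] = {}
    max_values: Dict[str, Any] = {}
    null_count: Dict[str, int] = {}

    for col in columns:
        values = [row.get(col) for row in rows]  # a missing key counts as null
        non_null = [v for v in values if v is not None]
        null_count[col] = len(values) - len(non_null)
        if non_null:  # min/max are only defined when a non-null value exists
            min_values[col] = min(non_null)
            max_values[col] = max(non_null)

    return json.dumps(
        {
            "numRecords": len(rows),
            "minValues": min_values,
            "maxValues": max_values,
            "nullCount": null_count,
        }
    )


# Hypothetical rows standing in for the contents of one data file.
rows = [
    {"id": 1, "price": 9.5},
    {"id": 2, "price": None},
    {"id": 3, "price": 4.0},
]
stats = json.loads(collect_file_stats(rows))
```

Stats of this shape let a reader skip files whose min/max range cannot match a filter, which is why compaction output needs them as well.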

Notes
Since we are considering dropping support for bucket tables, we can use Delta's OptimizeTableCommand instead of Gluten's version.

Some tests fail with one pipeline write; they are recorded in #7028, and we can fix them in follow-up PRs.

How was this patch tested?

Using existing UTs

@github-actions

#7028

@github-actions

Run Gluten Clickhouse CI on x86

@baibaichen baibaichen marked this pull request as ready for review December 20, 2024 02:09
@loneylee
Member

LGTM

@baibaichen baibaichen merged commit 0530292 into apache:main Dec 20, 2024

Development

Successfully merging this pull request may close these issues.

[CH] Fully Support writing parquet and mergetree in spark 3.5.x with delta protocol

2 participants