
feat: Implement PARTITION BY support on ClickHouse destination.#19596

Merged
kodiakhq[bot] merged 2 commits into main from mariano/support-partition-by-clickhouse
Nov 13, 2024

@marianogappa
Contributor

Enables optional support for partitioning tables in the ClickHouse destination.

Because this came up in discussions as a potential eyebrow-raiser: I decided to expose the partitioning as a pass-through string rather than as a list of columns, as had been suggested. My reasons:

  • According to the docs and our own research into using it, I've never seen a case of partitioning by more than one column, so a list of columns seems like a feature nobody needs.
  • According to the same sources, it's much more common to partition by an expression than by a bare column, e.g. toYYYYMM(_cq_sync_time). Partitioning by column names would make this feature unusable for many use cases, and I wouldn't want us to have to implement semantic validation of every expression that can be used here.

Docs:

Partitioning

This option specifies the partitioning strategy to use for tables. It is an array of objects.

Each object has the following fields:

  • tables (array of strings) (optional) (default: ["*"])

    List of glob patterns to match table names against. Follows the same rules as the top-level spec tables option.

    If a table matches both a pattern in tables and skip_tables, the table will be skipped.

    Partition strategy table patterns should be disjoint sets: if a table matches two partition strategies, an error will be raised at runtime.

  • skip_tables (array of strings) (optional) (default: empty)

    List of glob patterns; tables whose names match any of them are skipped. Follows the same rules as the top-level spec skip_tables option.

    If a table matches both a pattern in tables and skip_tables, the table will be skipped.

    Partition strategy table patterns should be disjoint sets: if a table matches two partition strategies, an error will be raised at runtime.

  • partition_by (string) (required)

    Partitioning strategy to use, e.g. toYYYYMM(_cq_sync_time). The string is passed as-is after the "PARTITION BY" keyword, with no validation or quoting.

    An unset partition_by is not valid.
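
To illustrate what "passed as-is" means in practice, here is a minimal Python sketch of how a pass-through partition_by string could end up in a ClickHouse CREATE TABLE statement. It is not the plugin's actual code: the function build_create_table, the table name aws_ec2_instances, the column types, the MergeTree engine and the ORDER BY key are all assumptions made for illustration.

```python
def build_create_table(table, columns, order_by, partition_by=None):
    """Compose a CREATE TABLE statement (hypothetical helper).

    columns      -- list of (name, type) pairs
    order_by     -- list of column names for the sorting key
    partition_by -- raw expression, appended verbatim after PARTITION BY
    """
    cols = ", ".join(f"`{name}` {typ}" for name, typ in columns)
    parts = [
        f"CREATE TABLE IF NOT EXISTS `{table}` ({cols})",
        "ENGINE = MergeTree",  # assumed engine, for illustration only
    ]
    if partition_by:
        # No validation and no quoting: the string is used exactly as given.
        parts.append(f"PARTITION BY {partition_by}")
    parts.append("ORDER BY (" + ", ".join(f"`{c}`" for c in order_by) + ")")
    return " ".join(parts)


ddl = build_create_table(
    "aws_ec2_instances",  # hypothetical table name
    [("_cq_sync_time", "DateTime64(6)"), ("_cq_id", "UUID")],
    ["_cq_id"],
    partition_by="toYYYYMM(_cq_sync_time)",
)
print(ddl)
```

Because the expression is not validated, a typo in partition_by would only surface when ClickHouse rejects the statement.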

Example:

partition:
- tables: ["*"]
  skip_tables: ["special_partition_table", "non_partitioned_table"]
  partition_by: "toYYYYMM(_cq_sync_time)"
- tables: ["special_partition_table"]
  partition_by: "toYYYYMMDD(_cq_sync_time)"
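
The matching rules described above (tables defaults to ["*"], skip_tables wins over tables, and a table matching two strategies is an error) can be sketched as follows. This is a Python illustration under stated assumptions, not the plugin's implementation: the helper names are hypothetical, aws_ec2_instances is a made-up table name, and fnmatchcase stands in for whatever glob matcher the plugin really uses, whose rules may differ.

```python
from fnmatch import fnmatchcase


def strategy_matches(strategy, table):
    """True if `table` is selected by this partition strategy."""
    tables = strategy.get("tables") or ["*"]       # default: match everything
    skip_tables = strategy.get("skip_tables") or []
    # skip_tables takes precedence over tables.
    if any(fnmatchcase(table, p) for p in skip_tables):
        return False
    return any(fnmatchcase(table, p) for p in tables)


def resolve_partition_by(strategies, table):
    """Return the partition_by expression for `table`, or None if no strategy applies."""
    for s in strategies:
        if not s.get("partition_by"):
            raise ValueError("partition_by is required in every strategy")
    hits = [s["partition_by"] for s in strategies if strategy_matches(s, table)]
    if len(hits) > 1:
        # Strategies must be disjoint: two matches is a runtime error.
        raise ValueError(f"table {table!r} matches {len(hits)} partition strategies")
    return hits[0] if hits else None


# The example configuration above, expressed as Python data.
partition = [
    {
        "tables": ["*"],
        "skip_tables": ["special_partition_table", "non_partitioned_table"],
        "partition_by": "toYYYYMM(_cq_sync_time)",
    },
    {
        "tables": ["special_partition_table"],
        "partition_by": "toYYYYMMDD(_cq_sync_time)",
    },
]

for name in ["aws_ec2_instances", "special_partition_table", "non_partitioned_table"]:
    print(name, "->", resolve_partition_by(partition, name))
```

Under these assumptions, an ordinary table resolves to the monthly expression, special_partition_table to the daily one, and non_partitioned_table to no partitioning at all.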

@cq-bot
Contributor

cq-bot commented Nov 13, 2024

Member

@erezrokah erezrokah left a comment


Looks good 👍 Makes sense to have it as a pass-through string and not a list of columns.

@marianogappa marianogappa marked this pull request as ready for review November 13, 2024 12:05
@marianogappa marianogappa requested review from a team and jon-s58 and removed request for a team November 13, 2024 12:05
@marianogappa
Contributor Author

I've tested it successfully.
[Screenshot 2024-11-13 at 16 04 48]

@marianogappa marianogappa added the automerge Automatically merge once required checks pass label Nov 13, 2024
@kodiakhq kodiakhq bot merged commit 503f42a into main Nov 13, 2024
@kodiakhq kodiakhq bot deleted the mariano/support-partition-by-clickhouse branch November 13, 2024 12:09
kodiakhq bot pushed a commit that referenced this pull request Nov 13, 2024
🤖 I have created a release *beep* *boop*
---


## [5.1.0](plugins-destination-clickhouse-v5.0.9...plugins-destination-clickhouse-v5.1.0) (2024-11-13)


### Features

* Implement PARTITION BY support on ClickHouse destination. ([#19596](#19596)) ([503f42a](503f42a))


### Bug Fixes

* **deps:** Update dependency @types/jest to v29.5.14 ([#19544](#19544)) ([f0340e5](f0340e5))
* **deps:** Update dependency @types/node to v16.18.119 ([#19545](#19545)) ([299926d](299926d))
* **deps:** Update material-ui monorepo ([#19548](#19548)) ([c3f765e](c3f765e))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Labels

area/plugin/destination/clickhouse automerge Automatically merge once required checks pass
