Support writing hive style partitioned files in COPY command #8493

@alamb

Is your feature request related to a problem or challenge?

A user asked on ASF Slack: https://the-asf.slack.com/archives/C04RJ0C85UZ/p1702248979379239

Does the COPY command support creating parquet files that are partitioned using hive style partitioning?

The use case is creating Hive-style partitioned datasets (e.g., as described here).
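For readers unfamiliar with the layout, here is a minimal sketch of what a Hive-style partitioned path looks like: each partition column becomes a `column=value` directory level under the dataset root. The helper name `hive_partition_path` and the file name `part-0.parquet` are made up for illustration and are not part of DataFusion.

```python
from pathlib import PurePosixPath


def hive_partition_path(root, partitions, filename):
    """Build a Hive-style path: root/col1=val1/col2=val2/filename.

    `partitions` is an ordered list of (column, value) pairs.
    This is a hypothetical helper, not a DataFusion API.
    """
    path = PurePosixPath(root)
    for column, value in partitions:
        # Each partition column contributes one "column=value" directory.
        path = path / f"{column}={value}"
    return str(path / filename)


print(hive_partition_path("folder/location", [("year", 2023)], "part-0.parquet"))
# prints: folder/location/year=2023/part-0.parquet
```

With more than one partition column, the directories simply nest in the order the columns are listed.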

DataFusion does not support this in COPY today, but as a workaround you can write the data through an external table, as in this example: https://github.com/apache/arrow-datafusion/blob/93b21bdcd3d465ed78b610b54edf1418a47fc497/datafusion/sqllogictest/test_files/insert.slt#L45-L57

Describe the solution you'd like

@devinjdangelo notes that

The COPY statement does not have a built in PARTITION BY clause in its syntax currently, but we could support syntax like:

COPY table to 'folder/location' (format parquet, partition_by year)

which is the same syntax that DuckDB supports for this.
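As a rough illustration of what a `partition_by year` option would be expected to do conceptually, the sketch below groups rows by the partition column and assigns each group to a `year=<value>` directory. It assumes the common Hive convention that the partition column is dropped from the data written to each directory, since its value is already encoded in the path; whether DataFusion would follow that convention is not stated in this issue. The function name `partition_rows` is hypothetical.

```python
from collections import defaultdict


def partition_rows(rows, partition_col):
    """Group rows into Hive-style directories keyed by "col=value".

    Hypothetical sketch only: the partition column is removed from each
    row because its value is carried by the directory name (assumption).
    """
    groups = defaultdict(list)
    for row in rows:
        key = row[partition_col]
        remaining = {k: v for k, v in row.items() if k != partition_col}
        groups[f"{partition_col}={key}"].append(remaining)
    return dict(groups)


rows = [
    {"year": 2022, "amount": 10},
    {"year": 2023, "amount": 20},
    {"year": 2022, "amount": 30},
]

for directory, group in partition_rows(rows, "year").items():
    print(directory, group)
```

A real implementation would stream record batches to one writer per partition value rather than collecting rows in memory; the sketch only shows the grouping.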

Describe alternatives you've considered

No response

Additional context

No response
