[Feature Request]: IcebergIO should support table create-on-write #32677

@kellen

Description

What would you like to happen?

BigQueryIO supports a CreateDisposition (CREATE_NEVER, CREATE_IF_NEEDED), and file-based IOs require no separate create step. Our (Spotify's) internal users are comfortable with this behavior and indeed expect it.

We have added Iceberg support to scio in spotify/scio#5494, but in the integration test, for example, we must issue createTable requests before the test runs:
https://github.com/spotify/scio/pull/5494/files#diff-5116ce37fed90178a3919b87a160d5795f51beb644117d63340ba55ffbf45b46R66-R71

I think it is reasonable to expect the catalog/namespace to exist, but table creation could be supported automatically in the IO.
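
To make the request concrete, here is a rough sketch of the create-if-needed check I have in mind. It is not IcebergIO code: `InMemoryCatalog`, `ensureTable`, and the string-based table identifier and column list are hypothetical stand-ins so the example runs without any Iceberg or Beam dependency. The `CreateDisposition` enum only mirrors the two BigQueryIO values mentioned above. A real implementation would presumably call the Iceberg catalog instead and would also have to handle several workers racing to create the same table.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CreateOnWriteSketch {

    /** Mirrors the two BigQueryIO create dispositions; not an existing IcebergIO type. */
    enum CreateDisposition { CREATE_NEVER, CREATE_IF_NEEDED }

    /** Hypothetical in-memory stand-in for a catalog, keyed by "namespace.table". */
    static class InMemoryCatalog {
        private final Map<String, List<String>> tables = new HashMap<>();

        boolean tableExists(String id) {
            return tables.containsKey(id);
        }

        void createTable(String id, List<String> columns) {
            // putIfAbsent returns the previous value if the table was already registered.
            if (tables.putIfAbsent(id, columns) != null) {
                throw new IllegalStateException("Table already exists: " + id);
            }
        }
    }

    /**
     * Ensures the target table exists before a write.
     * Returns true if the table was created by this call, false if it already existed.
     * Throws if the table is missing and the disposition is CREATE_NEVER.
     */
    static boolean ensureTable(
            InMemoryCatalog catalog,
            String id,
            List<String> columns,
            CreateDisposition disposition) {
        if (catalog.tableExists(id)) {
            return false;
        }
        if (disposition == CreateDisposition.CREATE_NEVER) {
            throw new IllegalStateException("Table does not exist: " + id);
        }
        catalog.createTable(id, columns);
        return true;
    }

    public static void main(String[] args) {
        InMemoryCatalog catalog = new InMemoryCatalog();
        List<String> columns = List.of("id", "name");

        // First write creates the table, second write finds it already there.
        System.out.println(ensureTable(catalog, "db.users", columns, CreateDisposition.CREATE_IF_NEEDED));
        System.out.println(ensureTable(catalog, "db.users", columns, CreateDisposition.CREATE_IF_NEEDED));
    }
}
```

With something like this in the IO, the explicit createTable calls in the scio integration test would no longer be needed, and CREATE_NEVER would keep today's behavior for users who want to manage tables themselves.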

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
