
[Format] Add an Arrow Canonical Extension Type for Parquet Variant #46908

@alamb

Describe the enhancement requested

Parquet has added a new type for semi-structured data called Variant, which is defined here:

Because engines commonly read data from Parquet into Arrow for in-memory processing, it is useful to have support for Variant in Arrow. @CurtHagenlocher proposes adding native Variant support in the Arrow format itself here:

An alternate approach is to add a Canonical Extension Type

@zeroshade wrote up a proposal

And wrote an implementation in Go
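
For readers unfamiliar with the mechanism: an Arrow extension type is an ordinary storage type whose field carries the reserved metadata keys `ARROW:extension:name` and `ARROW:extension:metadata`. The sketch below models that idea in plain Python, without using any Arrow library. The extension name `example.variant`, the `Field` class, and the helper functions are hypothetical, and the two-binary-field storage layout (`metadata`, `value`) is an assumption based on the unshredded Parquet Variant layout, not something settled in this issue.

```python
# Sketch only: a plain-Python model of how an extension type annotates a field.
# The Field class, helper names, and the extension name are hypothetical.
from dataclasses import dataclass, field

# Reserved Arrow field-metadata keys used to mark extension types.
EXT_NAME_KEY = "ARROW:extension:name"
EXT_META_KEY = "ARROW:extension:metadata"


@dataclass
class Field:
    name: str
    type: str
    children: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


def variant_field(name, ext_name="example.variant"):
    """Build a struct storage field tagged with extension metadata.

    Assumed storage layout: struct<metadata: binary, value: binary>.
    """
    storage_children = [Field("metadata", "binary"), Field("value", "binary")]
    tags = {EXT_NAME_KEY: ext_name, EXT_META_KEY: ""}
    return Field(name, "struct", storage_children, tags)


def is_extension(f, ext_name):
    """Return True if the field is tagged with the given extension name."""
    return f.metadata.get(EXT_NAME_KEY) == ext_name
```

A consumer that does not recognize the extension name can still read the field as its plain storage type, which is the main appeal of this approach over a new native type.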

This ticket tracks the idea of adding Variant as an official extension type

See also @neilechao's PR to add Variant read support to Parquet

Component(s)

Format
