clee704 commented May 29, 2024

This file has a single row group with 0 rows and 2 columns.

The first column chunk has the following key-value metadata:

  • The first entry has a key "foo" mapped to a value "bar".
  • The second entry has a key "thisiskeywithoutvalue" and does not have a value.
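
To make the difference between the two entries concrete, here is a minimal, self-contained sketch of the data model. It is not Arrow's API: `KeyValueEntry`, `FirstColumnChunkMetadata`, `Describe` and `CheckExample` are hypothetical names used only for illustration. It assumes a key that is always present and a value that may be absent, which is distinct from an empty string.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Illustrative stand-in for one key-value metadata entry (not an Arrow type):
// the key is always present, the value may be absent.
struct KeyValueEntry {
  std::string key;
  std::optional<std::string> value;
};

// Builds the two entries described above for the first column chunk.
inline std::vector<KeyValueEntry> FirstColumnChunkMetadata() {
  return {
      {"foo", "bar"},                          // key with a value
      {"thisiskeywithoutvalue", std::nullopt}  // key with no value at all
  };
}

// Renders an entry for display, keeping "no value" distinct from "".
inline std::string Describe(const KeyValueEntry& entry) {
  if (entry.value.has_value()) {
    return entry.key + " = \"" + *entry.value + "\"";
  }
  return entry.key + " (no value)";
}

// Sanity checks on the example entries; call from main() or a test.
inline void CheckExample() {
  const auto entries = FirstColumnChunkMetadata();
  assert(entries.size() == 2);
  assert(entries[0].value == "bar");
  assert(!entries[1].value.has_value());
}
```

A reader that maps a missing value to an empty string would lose this distinction, which is the case the second entry is meant to exercise.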

Created with the following code, using an Arrow library modified to write "thisiskeywithoutvalue" without a value for the first column:

PARQUET_ASSIGN_OR_THROW(auto sink,
                        arrow::io::FileOutputStream::Open(
                            "column_chunk_key_value_metadata.parquet"));
auto writer = parquet::ParquetFileWriter::Open(
    sink, std::static_pointer_cast<parquet::schema::GroupNode>(
              parquet::schema::GroupNode::Make(
                  "schema", parquet::Repetition::REQUIRED,
                  {parquet::schema::PrimitiveNode::Make(
                       "column1", parquet::Repetition::OPTIONAL,
                       parquet::Type::INT32),
                   parquet::schema::PrimitiveNode::Make(
                       "column2", parquet::Repetition::OPTIONAL,
                       parquet::Type::INT32)})));
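// Append the single row group; no rows are written to it.
auto rg_writer = writer->AppendRowGroup();
// First column chunk: attach the "foo" -> "bar" entry. The value-less
// "thisiskeywithoutvalue" entry comes from the modified Arrow library.
rg_writer->NextColumn()->key_value_metadata().Append("foo", "bar");
// Second column chunk.
rg_writer->NextColumn();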

This is for apache/arrow#41580

This file has a single row group with 0 rows and 1 column. The column
chunk has key-value metadata, with a key "foo" mapped to a value "bar".

Created with this code:

```c++
PARQUET_ASSIGN_OR_THROW(
    auto sink, arrow::io::FileOutputStream::Open(
                   "column-chunk-key-value-metadata.parquet"));
parquet::ParquetFileWriter::Open(
    sink, std::static_pointer_cast<parquet::schema::GroupNode>(
              parquet::schema::GroupNode::Make(
                  "schema", parquet::Repetition::REQUIRED,
                  {parquet::schema::PrimitiveNode::Make(
                      "column1", parquet::Repetition::OPTIONAL,
                      parquet::Type::INT32)})))
    ->AppendRowGroup()
    ->NextColumn()
    ->key_value_metadata()
    .Append("foo", "bar");
```

wgtmac commented May 29, 2024

Would you mind adding some description about this new file in the README?


pitrou commented Jun 3, 2024

Suggestion: add two metadata entries: one with a value, the other without (the metadata key is mandatory while the value is optional).

clee704 and others added 3 commits July 20, 2024 20:34
Co-authored-by: mwish <maplewish117@gmail.com>
mapleFU merged commit 9b48ff4 into apache:master Jul 21, 2024

mapleFU commented Jul 21, 2024

Merged, thanks!
