Skip to content

Compress: add MarshalText and UnmarshalText#357

Merged
zeroshade merged 2 commits intoapache:mainfrom
jdemeyer:compression-unmarshaltext
Apr 18, 2025
Merged

Compress: add MarshalText and UnmarshalText#357
zeroshade merged 2 commits intoapache:mainfrom
jdemeyer:compression-unmarshaltext

Conversation

@jdemeyer
Copy link
Copy Markdown
Contributor

Rationale for this change

Right now, there is no public API to convert a string like "ZSTD" to a compression codec. There is only https://pkg.go.dev/github.com/apache/arrow-go/v18@v18.2.0/parquet/internal/gen-go/parquet#CompressionCodecFromString but that can't be used since it's in an internal package.

Adding UnmarshalText can be used for applications taking such a string as configuration option. Adding MarshalText isn't strictly needed since it overlaps with String() but makes sense for symmetry.

What changes are included in this PR?

Add UnmarshalText and MarshalText as thin wrappers around the same methods from the internal https://pkg.go.dev/github.com/apache/arrow-go/v18@v18.2.0/parquet/internal/gen-go/parquet#CompressionCodec type.

Are these changes tested?

Yes

Are there any user-facing changes?

Minor new API for Compression type

@jdemeyer jdemeyer requested a review from zeroshade as a code owner April 17, 2025 12:27
Comment on lines +142 to +154
func TestMarshalText(t *testing.T) {
compression := compress.Codecs.Zstd
data, err := compression.MarshalText()
assert.NoError(t, err)
assert.Equal(t, "ZSTD", string(data))
}

func TestUnmarshalText(t *testing.T) {
var compression compress.Compression
err := compression.UnmarshalText([]byte("ZSTD"))
assert.NoError(t, err)
assert.Equal(t, compress.Codecs.Zstd, compression)
}
Copy link
Copy Markdown
Member

@zeroshade zeroshade Apr 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured we should probably use a table test and test all the codecs. i.e.

func TestMarshalText(t *testing.T) {
    tests := []struct{
            codec ....
            text string
    }{
         {compress.Codecs.Zstd, "ZSTD"},
         {compress.Codecs.Snappy, "SNAPPY"},
          ...
      }

      for _, tt := range tests {
          t.Run(tt.text, func(t *testing.T) {
               data, err := tt.codec.MarshalText()
               assert.NoError(t, err)
               assert.Equal(t, tt.text, string(data))
               ...
          })
      }
}

and so on

Copy link
Copy Markdown
Member

@zeroshade zeroshade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this! This is helpful and I'll probably end up updating iceberg-go to utilize it

@zeroshade zeroshade merged commit 4e6db6f into apache:main Apr 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants