util/parquet: add compression options #102978

Merged
craig[bot] merged 1 commit into cockroachdb:master from jayshrivastava:parquet-compress
May 10, 2023
Conversation

@jayshrivastava
Contributor

@jayshrivastava jayshrivastava commented May 9, 2023

This change updates the parquet writer to be able to use
GZIP, ZSTD, SNAPPY, and BROTLI compression codecs. By
default, no compression is used. LZO and LZ4 are unsupported
by the library.

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-15071
Informs: #99028
Release note: None
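
To make the "no compression by default" behavior concrete, here is a minimal, self-contained sketch of how an opt-in codec setting could be modeled with functional options. It is not the code from this PR: `config`, `Option`, `WithCompressionCodec`, `newConfig`, and `codecNames` are hypothetical names chosen for illustration, and the string values stand in for the library's codec constants.

```go
package main

import (
	"errors"
	"fmt"
)

// CompressionCodec identifies a codec the writer may use (illustrative type).
type CompressionCodec int

const (
	// CompressionNone is the zero value, so an unconfigured writer
	// uses no compression.
	CompressionNone CompressionCodec = iota
	CompressionGZIP
	CompressionZSTD
	CompressionSnappy
	CompressionBrotli
)

// codecNames stands in for a mapping to the library's codec values.
// LZO and LZ4 are deliberately absent.
var codecNames = map[CompressionCodec]string{
	CompressionNone:   "UNCOMPRESSED",
	CompressionGZIP:   "GZIP",
	CompressionZSTD:   "ZSTD",
	CompressionSnappy: "SNAPPY",
	CompressionBrotli: "BROTLI",
}

// config holds writer settings (hypothetical).
type config struct {
	codec CompressionCodec
}

// Option mutates a config and may reject invalid input.
type Option func(*config) error

// WithCompressionCodec selects a codec, rejecting unmapped values.
func WithCompressionCodec(c CompressionCodec) Option {
	return func(cfg *config) error {
		if _, ok := codecNames[c]; !ok {
			return errors.New("unsupported compression codec")
		}
		cfg.codec = c
		return nil
	}
}

// newConfig applies options on top of the defaults.
func newConfig(opts ...Option) (config, error) {
	cfg := config{} // zero value: CompressionNone
	for _, opt := range opts {
		if err := opt(&cfg); err != nil {
			return config{}, err
		}
	}
	return cfg, nil
}

func main() {
	def, _ := newConfig()
	zstd, _ := newConfig(WithCompressionCodec(CompressionZSTD))
	fmt.Println(codecNames[def.codec], codecNames[zstd.codec])
}
```

Making the uncompressed codec the zero value means callers who pass no option get the default described above without any extra initialization.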

@cockroach-teamcity
Member

This change is Reviewable

@jayshrivastava jayshrivastava marked this pull request as ready for review May 9, 2023 20:49
@jayshrivastava jayshrivastava requested a review from miretskiy May 9, 2023 20:49
var compressionCodecToParquet = map[CompressionCodec]compress.Compression{
	CompressionNone: compress.Codecs.Uncompressed,
	CompressionGZIP: compress.Codecs.Gzip,
	CompressionZSTD: compress.Codecs.Zstd,

Are these the only two codecs supported by parquet? No Snappy? No LZ4?


	Uncompressed: Compression(parquet.CompressionCodec_UNCOMPRESSED),
	Snappy:       Compression(parquet.CompressionCodec_SNAPPY),
	Gzip:         Compression(parquet.CompressionCodec_GZIP),
	Lzo:          Compression(parquet.CompressionCodec_LZO),
	Brotli:       Compression(parquet.CompressionCodec_BROTLI),
	Lz4:          Compression(parquet.CompressionCodec_LZ4),
	Zstd:         Compression(parquet.CompressionCodec_ZSTD),

@jayshrivastava
Contributor Author

bors r=miretskiy

@jayshrivastava jayshrivastava mentioned this pull request May 10, 2023
13 tasks
@craig
Contributor

craig bot commented May 10, 2023

Build succeeded:

@craig craig bot merged commit a833450 into cockroachdb:master May 10, 2023
3 participants