[Feature Request]: Support Zstd codec in SerializableAvroCodecFactory #32349
Closed
Description
What would you like to happen?
Avro 1.9+ supports the Zstandard (zstd) compression codec. I tried to use org.apache.avro.file.CodecFactory.zstandardCodec(3) in my Beam GenericRecord write, but ran into the following exception from SerializableAvroCodecFactory:
Exception in thread "main" java.lang.IllegalStateException: zstandard[3] is not supported
at org.apache.beam.vendor.guava.v32_1_2_jre.com.google.common.base.Preconditions.checkState(Preconditions.java:601)
at org.apache.beam.sdk.extensions.avro.io.SerializableAvroCodecFactory.<init>(SerializableAvroCodecFactory.java:60)
at org.apache.beam.sdk.extensions.avro.io.AvroIO$TypedWrite.withCodec(AvroIO.java:1695)
at org.apache.beam.sdk.extensions.avro.io.AvroIO$Write.withCodec(AvroIO.java:1923)
Is there a reason that zstd isn't supported? If not, I can add it to the list of allowed codecs in SerializableAvroCodecFactory.
My guess is that it's for cross-Avro-version compatibility, since DataFileConstants.ZSTANDARD_CODEC doesn't exist in Avro 1.8. We could hardcode the codec name as a String instead of referencing the Avro constant, which would preserve compatibility with Avro 1.8.
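To illustrate the hardcoded-string idea, here is a minimal standalone sketch. It is not the actual SerializableAvroCodecFactory code: the class name ZstdCodecNameSketch, its methods, and the regex are all hypothetical, and the list of other codec names is only an illustrative subset. The one thing taken from the report above is the codec's string form, zstandard[3], as it appears in the exception message.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of Beam or Avro.
final class ZstdCodecNameSketch {
    // Hardcoded instead of referencing DataFileConstants.ZSTANDARD_CODEC,
    // which does not exist in Avro 1.8.
    static final String ZSTANDARD_CODEC = "zstandard";

    // Illustrative subset of codec names that carry no level option.
    static final List<String> NO_OPT_CODECS = Arrays.asList("null", "snappy", "bzip2");

    // Assumes the codec's string form looks like "zstandard[3]",
    // as seen in the exception message in this issue.
    static final Pattern ZSTD_PATTERN =
        Pattern.compile(Pattern.quote(ZSTANDARD_CODEC) + "\\[(?<level>-?\\d+)\\]");

    // Returns true if the codec string is in the allowed set.
    static boolean isSupported(String codecString) {
        return NO_OPT_CODECS.contains(codecString)
            || ZSTD_PATTERN.matcher(codecString).matches();
    }

    // Extracts the compression level from a zstandard codec string.
    static int zstdLevel(String codecString) {
        Matcher m = ZSTD_PATTERN.matcher(codecString);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                codecString + " is not a zstandard codec string");
        }
        return Integer.parseInt(m.group("level"));
    }
}
```

With this sketch, ZstdCodecNameSketch.isSupported("zstandard[3]") accepts the codec string, and zstdLevel recovers the level so the codec could be rebuilt with CodecFactory.zstandardCodec(level) when running on Avro 1.9+. Since only a string literal is involved, nothing in the check itself requires a newer Avro on the classpath.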
Issue Priority
Priority: 2 (default / most feature requests should be filed as P2)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam YAML
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Infrastructure
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner