Support default value semantics in Iceberg #2039
Closed as not planned
Description
Hive tables written in the Avro file format are hard to migrate to Iceberg because Iceberg does not support default value semantics. When a field has a non-null default value in the Avro schema and the data being read has no value for it, Avro readers return that default. Iceberg instead returns null if the field is optional and throws an exception (shown below) if the field is required. This issue tracks the work required to support default values in Iceberg. Ideally, the semantics and expectations would remain the same regardless of the underlying file format (e.g., support should extend to ORC as well).
java.lang.IllegalArgumentException: Missing required field: xyz
at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:217)
at org.apache.iceberg.avro.BuildAvroProjection.record(BuildAvroProjection.java:98)
at org.apache.iceberg.avro.BuildAvroProjection.record(BuildAvroProjection.java:42)
at org.apache.iceberg.avro.AvroCustomOrderSchemaVisitor.visit(AvroCustomOrderSchemaVisitor.java:51)
at org.apache.iceberg.avro.AvroSchemaUtil.buildAvroProjection(AvroSchemaUtil.java:104)
at org.apache.iceberg.avro.ProjectionDatumReader.setSchema(ProjectionDatumReader.java:68)
at org.apache.iceberg.shaded.org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:132)
at org.apache.iceberg.shaded.org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:106)
at org.apache.iceberg.shaded.org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:98)
at org.apache.iceberg.shaded.org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:66)
at org.apache.iceberg.avro.AvroIterable.newFileReader(AvroIterable.java:100)
at org.apache.iceberg.avro.AvroIterable.iterator(AvroIterable.java:77)
at org.apache.iceberg.spark.source.RowDataReader.open(RowDataReader.java:103)
at org.apache.iceberg.spark.source.BaseDataReader.next(BaseDataReader.java:81)
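
To make the gap concrete, here is a minimal, self-contained sketch of the two behaviors described above. It is not Iceberg or Avro code: the `Field` class and the `readWithoutDefaults` / `readWithDefaults` methods are hypothetical names invented for illustration, and a row is modeled as a plain `Map`. It only assumes that "missing" means the row has no entry for the field.

```java
import java.util.HashMap;
import java.util.Map;

public class DefaultValueSketch {
    /** Hypothetical field descriptor; not an Iceberg or Avro class. */
    static final class Field {
        final String name;
        final boolean required;
        final Object defaultValue; // null means "no default declared"

        Field(String name, boolean required, Object defaultValue) {
            this.name = name;
            this.required = required;
            this.defaultValue = defaultValue;
        }
    }

    /** Behavior reported in this issue: the declared default is ignored. */
    static Object readWithoutDefaults(Map<String, Object> row, Field field) {
        if (row.containsKey(field.name)) {
            return row.get(field.name);
        }
        if (field.required) {
            throw new IllegalArgumentException("Missing required field: " + field.name);
        }
        return null; // optional field: the reader yields null
    }

    /** Requested behavior: fall back to the declared default when the value is absent. */
    static Object readWithDefaults(Map<String, Object> row, Field field) {
        if (row.containsKey(field.name)) {
            return row.get(field.name);
        }
        if (field.defaultValue != null) {
            return field.defaultValue;
        }
        if (field.required) {
            throw new IllegalArgumentException("Missing required field: " + field.name);
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>(); // row written before "xyz" existed
        Field optional = new Field("xyz", false, "fallback");
        Field required = new Field("xyz", true, "fallback");

        System.out.println(readWithoutDefaults(row, optional));
        System.out.println(readWithDefaults(row, optional));
        System.out.println(readWithDefaults(row, required));
        try {
            readWithoutDefaults(row, required);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is only the order of the checks: a reader with default value semantics consults the declared default before deciding whether a missing required field is an error. Because this fallback does not depend on the file format, the same rule could in principle apply to ORC as well.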