Unable to use GlueCatalog in flink environments without hadoop #3044

When using the GlueCatalog implementation (or any other catalog implementation) in Flink, Hadoop is expected to be on the classpath.

The [FlinkCatalogFactory](https://github.com/apache/iceberg/blob/4eb0853cd787bf2f5778195558d22a45ecf6c601/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java) always [attempts to load](https://github.com/apache/iceberg/blob/4eb0853cd787bf2f5778195558d22a45ecf6c601/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalogFactory.java#L118) the Hadoop config from Flink, but Flink does not guarantee that a valid Hadoop environment is present. In environments where Hadoop is not available (e.g. AWS Kinesis Data Analytics), this throws `java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration`.
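To illustrate how the failure could be avoided, here is a minimal sketch that checks for the Hadoop `Configuration` class before touching it. The `HadoopProbe` class and its methods are hypothetical names for illustration only and are not part of Iceberg or Flink. The only name taken from the report is the class in the `NoClassDefFoundError`.

```java
import java.util.Optional;

// Hypothetical helper, not part of Iceberg or Flink.
public final class HadoopProbe {
    // Class name taken from the NoClassDefFoundError in the report.
    public static final String HADOOP_CONF_CLASS = "org.apache.hadoop.conf.Configuration";

    private HadoopProbe() {}

    /** Returns true if the named class can be located, without initializing it. */
    public static boolean isClassAvailable(String className) {
        try {
            Class.forName(className, false, HadoopProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /**
     * Creates a Hadoop Configuration reflectively when Hadoop is present,
     * or returns Optional.empty() instead of failing when it is not.
     */
    public static Optional<Object> newHadoopConfIfPresent() {
        if (!isClassAvailable(HADOOP_CONF_CLASS)) {
            return Optional.empty();
        }
        try {
            Object conf = Class.forName(HADOOP_CONF_CLASS)
                    .getDeclaredConstructor()
                    .newInstance();
            return Optional.of(conf);
        } catch (ReflectiveOperationException | LinkageError e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("Hadoop on classpath: " + isClassAvailable(HADOOP_CONF_CLASS));
    }
}
```

Because the class is only referenced by name, this code itself loads in a Hadoop-free environment. Whether a guard like this belongs in the factory or elsewhere is a separate design question.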

Presently, most of the catalog implementations implement `Configurable`, and thus utility functions like [loadCatalog](https://github.com/apache/iceberg/blob/4eb0853cd787bf2f5778195558d22a45ecf6c601/core/src/main/java/org/apache/iceberg/CatalogUtil.java#L170) expect to be passed an instance of hadoopConf. In catalogs like GlueCatalog and DynamoCatalog, the only reason for the `Configurable` interface is to enable [dynamic FileIO loading](https://github.com/apache/iceberg/blob/4eb0853cd787bf2f5778195558d22a45ecf6c601/aws/src/main/java/org/apache/iceberg/aws/glue/GlueCatalog.java#L110).
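The coupling described above can be sketched with stand-in types. Everything in this block is hypothetical: `Configurable`, `Catalog`, `GlueLikeCatalog` and `load` are simplified placeholders rather than the Iceberg or Hadoop classes, and the conf is typed as `Object` so the sketch compiles without Hadoop. It only shows the shape of a loader that tolerates a missing conf when the catalog needs it solely for FileIO loading.

```java
import java.util.Map;

// Toy model with stand-in types; not the Iceberg or Hadoop API.
public class ConfCouplingSketch {

    /** Stand-in for Hadoop's Configurable (the real one takes a Hadoop Configuration). */
    public interface Configurable {
        void setConf(Object conf);
    }

    /** Stand-in for a catalog that is initialized from a name and properties. */
    public interface Catalog {
        void initialize(String name, Map<String, String> properties);
    }

    /** A catalog that keeps the conf only to hand it to FileIO loading. */
    public static class GlueLikeCatalog implements Catalog, Configurable {
        private Object conf;
        private String name;

        @Override
        public void setConf(Object conf) {
            this.conf = conf;
        }

        @Override
        public void initialize(String name, Map<String, String> properties) {
            this.name = name;
        }

        public String name() {
            return name;
        }

        /** The conf is consulted here and nowhere else in this sketch. */
        public String describeFileIO() {
            return conf == null ? "FileIO without Hadoop conf" : "FileIO with Hadoop conf";
        }
    }

    /** Passes the conf on only when one exists and the catalog accepts it. */
    public static <T extends Catalog> T load(
            T catalog, String name, Map<String, String> properties, Object conf) {
        if (conf != null && catalog instanceof Configurable) {
            ((Configurable) catalog).setConf(conf);
        }
        catalog.initialize(name, properties);
        return catalog;
    }
}
```

In this model, calling `load` with a `null` conf still yields an initialized catalog, which is the behavior a Hadoop-free Flink environment would need.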
