Unable to use GlueCatalog in flink environments without hadoop #3044

@kainoa21

When attempting to use the GlueCatalog implementation (or really any catalog implementation) in Flink, Hadoop is expected to be on the classpath.

The FlinkCatalogFactory always attempts to load the Hadoop configuration from Flink, but Flink does not guarantee that a valid Hadoop environment is present. In environments where Hadoop is not available (e.g. AWS Kinesis Data Analytics), this throws java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration.
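
To illustrate the failure mode, here is a minimal sketch of how code could probe for Hadoop before touching org.apache.hadoop.conf.Configuration. It is not what FlinkCatalogFactory does today; the class HadoopClasspathCheck and its methods are hypothetical names made up for this example, and only standard Java reflection calls are used.

```java
// Sketch only: HadoopClasspathCheck and its methods are hypothetical,
// not part of Iceberg or Flink.
final class HadoopClasspathCheck {
    static final String HADOOP_CONF_CLASS = "org.apache.hadoop.conf.Configuration";

    private HadoopClasspathCheck() {}

    /** Returns true if the named class can be located (without initializing it). */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, HadoopClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Class (or one of its dependencies) is not on the classpath.
            return false;
        }
    }

    /** Creates a Hadoop Configuration reflectively, or returns null when Hadoop is absent. */
    static Object newHadoopConfOrNull() {
        if (!isClassPresent(HADOOP_CONF_CLASS)) {
            return null;
        }
        try {
            return Class.forName(HADOOP_CONF_CLASS).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | LinkageError e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("Hadoop available: " + isClassPresent(HADOOP_CONF_CLASS));
    }
}
```

Because the Hadoop class is referenced only by name, this code itself loads fine in an environment without Hadoop; the caller then has to tolerate a null configuration.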

Presently, most of the catalog implementations implement Configurable, and thus utility functions like loadCatalog expect to be passed an instance of hadoopConf. In catalogs like GlueCatalog and DynamoCatalog, the only reason for the Configurable interface is to enable dynamic FileIO loading.
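
One possible direction, sketched under assumptions: a loader that treats the Hadoop configuration as optional and only hands it to catalogs that ask for it. Everything below (SimpleCatalog, ConfAware, OptionalConfLoader, DemoCatalog) is a hypothetical stand-in written for this example; these are not Iceberg's Catalog, Hadoop's Configurable, or the real loadCatalog signature.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a catalog interface.
interface SimpleCatalog {
    void initialize(String name, Map<String, String> properties);
}

// Hypothetical stand-in for a Configurable-style interface; conf is typed
// as Object so this code does not require Hadoop classes to load.
interface ConfAware {
    void setConf(Object conf);
}

final class OptionalConfLoader {
    private OptionalConfLoader() {}

    /** Builds a catalog, passing conf along only if it is present and wanted. */
    static <T extends SimpleCatalog> T load(
            Supplier<T> constructor, String name, Map<String, String> properties, Object conf) {
        T catalog = constructor.get();
        if (conf != null && catalog instanceof ConfAware) {
            ((ConfAware) catalog).setConf(conf);
        }
        catalog.initialize(name, properties);
        return catalog;
    }
}

// Hypothetical catalog that can work with or without a configuration.
final class DemoCatalog implements SimpleCatalog, ConfAware {
    Object conf;          // stays null when no Hadoop configuration is supplied
    String name;

    @Override
    public void setConf(Object conf) {
        this.conf = conf;
    }

    @Override
    public void initialize(String name, Map<String, String> properties) {
        this.name = name;
    }
}
```

With this shape, a call such as OptionalConfLoader.load(DemoCatalog::new, "glue", Map.of(), null) succeeds without any Hadoop class being loaded; whether the real catalogs can then load their FileIO without a configuration is a separate question this sketch does not answer.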
