[CORE] Separate gluten-core module into gluten-core and gluten-substrait #7031
Description
This is part of #6920, which describes the final goal.
gluten-substrait will include Gluten's Substrait definitions and the abstract APIs of Substrait-based query plans. This likely means that most of the current plan nodes' base traits/classes that are coupled with Substrait will be moved into gluten-substrait. Also, some backend APIs that are related to Substrait, such as SparkPlanExecApi, may be factored out so that the majority of their code is placed in this new module.
gluten-core will become a comparatively thinner layer than before. The module will provide Spark listeners, task life cycle managers, query planner facilities, and utilities for memory consumers, metrics, and configuration, among others.
After the refactor, a new library/backend being integrated into Gluten will no longer be strictly required to implement the complete Gluten Substrait protocol. Instead, it will have the flexibility to inject its computation acceleration capability into Gluten in whatever way suits it best.
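To illustrate the intended layering, here is a minimal, self-contained Java sketch. It is not Gluten's actual API: every name in it (`Accelerator`, `SubstraitAccelerator`, `SubstraitBackend`, `DirectBackend`, `plan`) is hypothetical, and operators are modeled as plain strings. It only shows the idea that a core layer could define a Substrait-agnostic contract, a Substrait layer could build on it, and a backend could implement either one.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class ModuleSplitSketch {

    // Hypothetical stand-in for the "gluten-core" side: no Substrait knowledge at all.
    public interface Accelerator {
        String name();

        // Returns a description of the offloaded operator, or empty if unsupported.
        Optional<String> tryOffload(String operator);
    }

    // Hypothetical stand-in for the "gluten-substrait" side: a Substrait-specific
    // contract layered on top of the core contract.
    public interface SubstraitAccelerator extends Accelerator {
        boolean supports(String operator);

        String toSubstrait(String operator);

        @Override
        default Optional<String> tryOffload(String operator) {
            return supports(operator)
                ? Optional.of(name() + ":" + toSubstrait(operator))
                : Optional.empty();
        }
    }

    // A backend that opts into the Substrait-based path.
    public static final class SubstraitBackend implements SubstraitAccelerator {
        private static final Set<String> SUPPORTED = Set.of("Filter", "Project");

        @Override public String name() { return "substrait-backend"; }
        @Override public boolean supports(String operator) { return SUPPORTED.contains(operator); }
        @Override public String toSubstrait(String operator) { return "substrait(" + operator + ")"; }
    }

    // A backend that skips Substrait and plugs into the core contract directly.
    public static final class DirectBackend implements Accelerator {
        @Override public String name() { return "direct-backend"; }

        @Override public Optional<String> tryOffload(String operator) {
            return operator.equals("Sort")
                ? Optional.of(name() + ":native(" + operator + ")")
                : Optional.empty();
        }
    }

    // Core-side planning: ask each accelerator in order, else fall back to vanilla Spark.
    public static String plan(List<Accelerator> accelerators, String operator) {
        for (Accelerator accelerator : accelerators) {
            Optional<String> offloaded = accelerator.tryOffload(operator);
            if (offloaded.isPresent()) {
                return offloaded.get();
            }
        }
        return "vanilla-spark:" + operator;
    }

    public static void main(String[] args) {
        List<Accelerator> accelerators = List.of(new SubstraitBackend(), new DirectBackend());
        for (String operator : List.of("Filter", "Sort", "Window")) {
            System.out.println(operator + " -> " + plan(accelerators, operator));
        }
    }
}
```

In this sketch the planning code depends only on the core-side interface, so `DirectBackend` never has to produce a Substrait plan, while `SubstraitBackend` reuses the shared Substrait-based default.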
Meanwhile, gluten-data may be renamed to gluten-arrow to make its purpose clearer: it aims to provide predefined data sharing/transition utilities for Arrow-compatible libraries. That could be handled as a separate topic.