Open
Labels
enhancement, substrait
Description
Is your feature request related to a problem or challenge?
A report from Twitter (https://twitter.com/mim_djo/status/1740542585410814393) says:
a new release of #datafusion 34, still reading #Deltatable via arrow is suboptimal compared to reading Parquet Directly :( something to do with passing stats to get correct join orders.
I think the issue is that #7949 and #7950 rely on statistics to pick reasonable join orders for TPC-H queries.
It seems these statistics are not available from the Delta table provider.
@andygrove says:
RelCommon (common to all operators in Substrait) can contain a hint that has stats:
message Stats {
  double row_count = 1;
  double record_size = 2;
  substrait.extensions.AdvancedExtension advanced_extension = 10;
}
Describe the solution you'd like
I would like the DataFusion Substrait consumer/producer to handle translating statistics to and from these Substrait stats hints.
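To make the request concrete, here is a minimal sketch of what a round trip between table statistics and the Substrait `Stats` hint could look like. Everything in it is an assumption for illustration: `SubstraitStats`, `Precision`, `TableStats`, `to_substrait`, and `from_substrait` are hypothetical, simplified stand-ins, not the actual types or functions of the `substrait` or `datafusion` crates. It also assumes `record_size` means average bytes per row, and it ignores `advanced_extension` and per-column statistics.

```rust
// Hypothetical, simplified stand-ins. These are NOT the real `substrait` or
// `datafusion` crate types; they only mirror their rough shape.

/// Mirrors the Substrait `Stats` hint message (without `advanced_extension`).
#[derive(Debug, Clone, PartialEq)]
pub struct SubstraitStats {
    pub row_count: f64,
    /// Assumed here to be the average size of one record, in bytes.
    pub record_size: f64,
}

/// Simplified precision wrapper, loosely modeled on DataFusion's `Precision`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

/// Simplified table-level statistics (no per-column statistics).
#[derive(Debug, Clone, PartialEq)]
pub struct TableStats {
    pub num_rows: Precision,
    pub total_byte_size: Precision,
}

/// Producer direction: emit a hint only when a row count is known.
pub fn to_substrait(stats: &TableStats) -> Option<SubstraitStats> {
    let rows = match stats.num_rows {
        Precision::Exact(n) | Precision::Inexact(n) => n,
        Precision::Absent => return None,
    };
    // Derive the average record size from the total byte size when possible;
    // fall back to 0.0 (the protobuf default) when it is unknown.
    let record_size = match stats.total_byte_size {
        Precision::Exact(b) | Precision::Inexact(b) if rows > 0 => b as f64 / rows as f64,
        _ => 0.0,
    };
    Some(SubstraitStats { row_count: rows as f64, record_size })
}

/// Consumer direction: a hint is only an estimate, so everything read back
/// is marked `Inexact`, and a missing or invalid hint becomes `Absent`.
pub fn from_substrait(hint: Option<&SubstraitStats>) -> TableStats {
    match hint {
        Some(s) if s.row_count.is_finite() && s.row_count >= 0.0 => {
            let total_byte_size = if s.record_size.is_finite() && s.record_size > 0.0 {
                Precision::Inexact((s.row_count * s.record_size).round() as usize)
            } else {
                Precision::Absent
            };
            TableStats {
                num_rows: Precision::Inexact(s.row_count.round() as usize),
                total_byte_size,
            }
        }
        _ => TableStats {
            num_rows: Precision::Absent,
            total_byte_size: Precision::Absent,
        },
    }
}
```

Under these assumptions, an exact row count does not survive the round trip as exact. That seems like the safer choice for a hint, since the join-ordering code only needs a usable estimate, but it is a design decision rather than something the issue specifies.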
Describe alternatives you've considered
No response
Additional context
This was brought up by @Dandandan on the ASF Slack: https://the-asf.slack.com/archives/C04RJ0C85UZ/p1703885214702039