Closed
Labels
type/enhancement: The issue or PR belongs to an enhancement.
Description
Enhancement
This issue tracks optimizing the Join and Aggregation operators by making use of fine grained partition information.
For Join operator:
On the probe side, the time saved by using fine grained partitioning is much smaller than the cost of the partitioning itself, so only the build side is optimized. In addition, the optimization requires that:
- The probe side uses the same (or a compatible) hash function to choose the build side stream, and uses the original hash function to do the matching.
- The prehash key is the same as the Join hash table's hash key.
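The two requirements above can be illustrated with a minimal sketch. This is not TiFlash code: `stream_of`, `build_partitioned`, `probe`, and the stream count of 4 are hypothetical names and values chosen for illustration. A CRC32 stands in for the partition hash, and Python's built-in `dict` hashing stands in for the hash table's original hash function.

```python
import zlib
from collections import defaultdict

STREAM_COUNT = 4  # hypothetical number of fine grained streams


def stream_of(key, stream_count=STREAM_COUNT):
    """Partition hash (assumption: CRC32). Both sides must agree on it."""
    return zlib.crc32(repr(key).encode()) % stream_count


def build_partitioned(build_rows, key_index, stream_count=STREAM_COUNT):
    """Build one small hash table per stream instead of one shared table."""
    tables = [defaultdict(list) for _ in range(stream_count)]
    for row in build_rows:
        key = row[key_index]  # prehash key == hash table key
        tables[stream_of(key, stream_count)][key].append(row)
    return tables


def probe(tables, probe_rows, key_index):
    """Pick the build stream with the partition hash, then match in its table."""
    stream_count = len(tables)
    joined = []
    for row in probe_rows:
        key = row[key_index]
        table = tables[stream_of(key, stream_count)]
        # The dict lookup uses its own (original) hash function for matching.
        for match in table.get(key, ()):
            joined.append(match + row)
    return joined


build_rows = [(1, "a"), (2, "b"), (1, "c")]
probe_rows = [(1, "x"), (3, "y"), (2, "z")]
tables = build_partitioned(build_rows, key_index=0)
result = probe(tables, probe_rows, key_index=0)
```

Because each build stream only ever sees its own keys, the per-stream tables could be built without coordinating with each other; the probe side pays only one extra partition-hash computation per row to find the right table.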
For Agg operator:
Nothing special.
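For aggregation, a rough sketch of why nothing special is needed, assuming the shuffle key is the group-by key: each stream then receives a disjoint set of groups, so per-stream results can simply be combined. The names `stream_of`, `shuffle`, `aggregate_stream`, and `aggregate` are hypothetical, and a sum is used as the example aggregate function.

```python
import zlib
from collections import defaultdict


def stream_of(key, stream_count):
    """Partition hash (assumption: CRC32 of the group-by key)."""
    return zlib.crc32(repr(key).encode()) % stream_count


def shuffle(rows, key_index, stream_count):
    """Split rows into streams so that each group key lands in one stream."""
    streams = [[] for _ in range(stream_count)]
    for row in rows:
        streams[stream_of(row[key_index], stream_count)].append(row)
    return streams


def aggregate_stream(rows, key_index, value_index):
    """Aggregate a single stream independently (here: SUM per group)."""
    sums = defaultdict(int)
    for row in rows:
        sums[row[key_index]] += row[value_index]
    return dict(sums)


def aggregate(rows, key_index, value_index, stream_count=4):
    result = {}
    for stream in shuffle(rows, key_index, stream_count):
        partial = aggregate_stream(stream, key_index, value_index)
        # Group keys are disjoint across streams, so no merge step is needed.
        assert result.keys().isdisjoint(partial)
        result.update(partial)
    return result


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
totals = aggregate(rows, key_index=0, value_index=1)
```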
Related issue: #4631
TiDB
- Support fine grained partition Join & Aggregation in the physical plan: "planner,infoschema,executor: Add tiflash fine grained shuffle support for hash join and aggregation" (tidb#40121)
- Add a column prune rule for exchange + join in the physical plan phase: "planner: Add HashJoin<-Receiver specific physicalPlan column pruner" (tidb#38536)
TiFlash
- Optimize FineGrainedShuffleWriter to reuse block memory: "Fine grained partition writer optimization" (#6173)
- Optimize the Receiver side decode implementation to include squashing functionality: "Exchange receiver decode optimization to do squashing work at the same time" (#6202)
- Utilize Fine Grained Partition information to optimize Join & Aggregation: "Support using FineGrainedShuffle Info for Join&Agg" (#6279)
- Gtests for utilizing Fine Grained Partition information to optimize Join & Aggregation: "Add unit tests for fine-grained join & agg" (#6445)
After applying all these optimizations, on TPCH_100 with 3 nodes and 2 TiFlash replicas, the total execution time is reduced by about 10%, from 80.6 to 72.3.
