-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Description
Describe the bug
Bucket shuffle join is a algorithm of joining two tables. Left table is distrubuted by a column. Right table sends the data to the left table for joining operation. It reduces the network cost. But when two table is withou any data. Bucket shuffle join will fail.
To Reproduce
Steps to reproduce the behavior:
- create two table:
CREATE TABLE dynamic_partition2 (
k1 bigint(20) NULL COMMENT "",
k2 int(11) NULL COMMENT "",
k3 smallint(6) NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(k1, k2, k3)
COMMENT "OLAP"
PARTITION BY RANGE(k1)
(PARTITION p20201219 VALUES [("20201219"), ("20201220")),
PARTITION p20201220 VALUES [("20201220"), ("20201221")),
PARTITION p20201221 VALUES [("20201221"), ("20201222")),
PARTITION p20201222 VALUES [("20201222"), ("20201223")),
PARTITION p20201223 VALUES [("20201223"), ("20201224")),
PARTITION p20201224 VALUES [("20201224"), ("20201225")),
PARTITION p20201225 VALUES [("20201225"), ("20201226")),
PARTITION p20201226 VALUES [("20201226"), ("20201227")),
PARTITION p20201227 VALUES [("20201227"), ("20201228")),
PARTITION p20201228 VALUES [("20201228"), ("20201229")),
PARTITION p20201229 VALUES [("20201229"), ("20201230")),
PARTITION p20201230 VALUES [("20201230"), ("20201231")),
PARTITION p20201231 VALUES [("20201231"), ("20210101")))
DISTRIBUTED BY HASH(k2) BUCKETS 32
PROPERTIES (
"replication_num" = "1",
"colocate_with" = "group1",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.time_zone" = "Asia/Shanghai",
"dynamic_partition.start" = "-6",
"dynamic_partition.end" = "6",
"dynamic_partition.prefix" = "p",
"dynamic_partition.replication_num" = "1",
"dynamic_partition.buckets" = "32",
"in_memory" = "false",
"storage_format" = "V2"
);
CREATE TABLE dynamic_partition_bucket2 (
k1 bigint(20) NULL COMMENT "",
k2 int(11) NULL COMMENT "",
k3 smallint(6) NULL COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(k1, k2, k3)
COMMENT "OLAP"
PARTITION BY RANGE(k1)
(PARTITION p20201219 VALUES [("20201219"), ("20201220")),
PARTITION p20201220 VALUES [("20201220"), ("20201221")),
PARTITION p20201221 VALUES [("20201221"), ("20201222")),
PARTITION p20201222 VALUES [("20201222"), ("20201223")),
PARTITION p20201223 VALUES [("20201223"), ("20201224")),
PARTITION p20201224 VALUES [("20201224"), ("20201225")),
PARTITION p20201225 VALUES [("20201225"), ("20201226")),
PARTITION p20201226 VALUES [("20201226"), ("20201227")),
PARTITION p20201227 VALUES [("20201227"), ("20201228")),
PARTITION p20201228 VALUES [("20201228"), ("20201229")),
PARTITION p20201229 VALUES [("20201229"), ("20201230")),
PARTITION p20201230 VALUES [("20201230"), ("20201231")),
PARTITION p20201231 VALUES [("20201231"), ("20210101")))
DISTRIBUTED BY HASH(k2, k3) BUCKETS 32
PROPERTIES (
"replication_num" = "1",
"colocate_with" = "group3",
"dynamic_partition.enable" = "true",
"dynamic_partition.time_unit" = "DAY",
"dynamic_partition.time_zone" = "Asia/Shanghai",
"dynamic_partition.start" = "-6",
"dynamic_partition.end" = "6",
"dynamic_partition.prefix" = "p",
"dynamic_partition.replication_num" = "1",
"dynamic_partition.buckets" = "32",
"in_memory" = "false",
"storage_format" = "V2"
);
-
do not insert any data into two tables
-
enable bucket shuffle join
set enable_bucket_shuffle_join=true -
execute a sql
select * from dynamic_partition2 a join dynamic_partition_bucket2 b on a.k2 = b.k2 where a.k1=20201218;
Expected behavior
A clear and concise description of what you expected to happen.
Screenshots
If applicable, add screenshots to help explain your problem.
Desktop (please complete the following information):
- OS: [e.g. iOS]
- Browser [e.g. chrome, safari]
- Version [e.g. 22]
Smartphone (please complete the following information):
- Device: [e.g. iPhone6]
- OS: [e.g. iOS8.1]
- Browser [e.g. stock browser, safari]
- Version [e.g. 22]
Additional context
Add any other context about the problem here.
