[VL] Result mismatch found in FlushableAgg #6630
Description
Backend
VL (Velox)
Bug description
I have a SQL query that I ran on both Gluten and vanilla Spark. Its structure is as follows:
select count(*) from ((
    select *
    from test1
    where xxx
) a
left join
(
    select col_a, col_b, col_c, col_d, col_e
    from test2
    where xxx
    group by col_a
            ,col_b
            ,col_c
            ,col_d
            ,col_e
) b
ON a.col1 = b.col1);
The two engines return a different number of rows. Looking at the Spark UI, I found that the cause is that the row counts of the second subquery don't match.
vanilla spark:

Looking closer, I found that some rows are duplicated.
But when I run just the second subquery on its own, I get the right result.


We can see that the plan is different when the subquery runs on its own: the second hash aggregation is a regular one.
Besides, when I set spark.gluten.sql.columnar.backend.velox.flushablePartialAggregation to false, I get the right result.


So I think there might be a bug in the flushable hash aggregation or in the plan conversion, but I haven't been able to find a small SQL query that reproduces it.
I'm sorry for not having a minimal example.
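
To illustrate the kind of failure suspected above, here is a minimal Python sketch of how a flushable partial aggregation could produce duplicate group keys if its output is consumed without a later merge. It is a toy model, not Velox or Gluten code: the function names flushable_partial_agg and final_agg, the max_groups threshold, and the sample rows are all hypothetical, and the actual cause of this issue has not been confirmed.

```python
def flushable_partial_agg(rows, max_groups):
    """Toy partial GROUP BY: collect distinct keys, but flush the hash
    table to the output whenever it holds max_groups keys (hypothetical
    threshold standing in for a memory limit)."""
    table = {}          # dict keeps insertion order in Python 3.7+
    out = []
    for key in rows:
        table.setdefault(key, None)
        if len(table) >= max_groups:
            out.extend(table)   # flush the current groups downstream
            table.clear()       # start over; earlier keys are forgotten
    out.extend(table)           # emit whatever is left at end of input
    return out


def final_agg(partial_rows):
    """Toy final GROUP BY: merge partial output so each key appears once."""
    return list(dict.fromkeys(partial_rows))


# Hypothetical input: the group key ("a", 1) occurs three times.
rows = [("a", 1), ("b", 2), ("a", 1), ("c", 3), ("a", 1)]

partial = flushable_partial_agg(rows, max_groups=2)
final = final_agg(partial)

print(len(partial), partial.count(("a", 1)))  # partial output repeats keys
print(final)                                  # merged output has distinct keys
```

In this model the partial output is only correct once a final aggregation has merged it. If a plan fed the flushed partial output straight into the join, the duplicates would survive, which would be consistent with the extra rows seen here and with the result becoming correct once flushing is disabled.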
Spark version
3.0
Spark configurations
No response
System information
Velox System Info v0.0.2
Commit: 96712646c63bf4305cca4eaa7dfd26c2179547b1
CMake Version: 3.17.5
System: Linux-3.10.0-862.mt20190308.130.el7.x86_64
Arch: x86_64
CPU Name: Model name: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
C++ Compiler: /opt/rh/devtoolset-10/root/usr/bin/c++
C++ Compiler Version: 10.2.1
C Compiler: /opt/rh/devtoolset-10/root/usr/bin/cc
C Compiler Version: 10.2.1
CMake Prefix Path: /usr/local;/usr;/;/usr;/usr/local;/usr/X11R6;/usr/pkg;/opt
Relevant logs
No response

