[VL] Result mismatch found in FlushableAgg #6630

@jiangjiangtian

Backend

VL (Velox)

Bug description

I have a SQL query that I ran in both Gluten and vanilla Spark. Its shape is as follows:

select count(*) from ((
    select *
    from test1
    where xxx
  )a
  left join
  (
    select col_a, col_b, col_c, col_d, col_e
    from test2
    where xxx
    group by col_a
            ,col_b
            ,col_c
            ,col_d
            ,col_e
  )b
ON a.col1 = b.col1);

The two runs return different row counts. Looking at the Spark UI, I found that the cause is the second subquery: its number of output rows doesn't match between the two engines.
Vanilla Spark:
image

Gluten:
image
image

In fact, I found that some rows in the Gluten output are duplicated.
But when I run the second subquery on its own, I get the correct result.
image
image
We can see that the plan is different: when the subquery runs on its own, the second hash aggregation is a regular one.
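
For anyone who wants to compare the two plans without the Spark UI, here is a rough sketch using `EXPLAIN`, assuming the queries are submitted through Spark SQL. The table and column names are the anonymized ones from the query above, and `xxx` still stands for the filters that were elided there, so this needs the real predicates filled in before it runs.

```sql
-- Plan for the full query, where the aggregated subquery feeds the left join.
-- `xxx` is the filter elided in the original report.
explain
select count(*) from ((
    select *
    from test1
    where xxx
  )a
  left join
  (
    select col_a, col_b, col_c, col_d, col_e
    from test2
    where xxx
    group by col_a, col_b, col_c, col_d, col_e
  )b
ON a.col1 = b.col1);

-- Plan for the second subquery on its own, for comparison.
explain
select col_a, col_b, col_c, col_d, col_e
from test2
where xxx
group by col_a, col_b, col_c, col_d, col_e;
```

Comparing the aggregation operators in the two outputs should show whether the subquery is planned differently inside the join than when it runs standalone.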

In addition, when I set spark.gluten.sql.columnar.backend.velox.flushablePartialAggregation to false, I get the correct result.
image
image
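
As a sketch of the workaround, the setting can probably be changed per session with a Spark SQL `SET` statement. The config key is the one named above; applying it at session level through `SET` (rather than passing it with `--conf` at submit time) is an assumption about how the job is launched.

```sql
-- Disable flushable partial aggregation in the Velox backend for this session.
-- Assumption: the setting is picked up at session level; otherwise pass it via --conf.
SET spark.gluten.sql.columnar.backend.velox.flushablePartialAggregation=false;

-- Then re-run the original query and compare the count with vanilla Spark.
```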

So I think there might be a bug in the flushable hash aggregation or in the plan conversion, but I haven't been able to find a small SQL query that reproduces it.
I'm sorry for not having a minimal example.

Spark version

3.0

Spark configurations

No response

System information

Velox System Info v0.0.2
Commit: 96712646c63bf4305cca4eaa7dfd26c2179547b1
CMake Version: 3.17.5
System: Linux-3.10.0-862.mt20190308.130.el7.x86_64
Arch: x86_64
CPU Name: Model name: Intel(R) Xeon(R) Platinum 8255C CPU @ 2.50GHz
C++ Compiler: /opt/rh/devtoolset-10/root/usr/bin/c++
C++ Compiler Version: 10.2.1
C Compiler: /opt/rh/devtoolset-10/root/usr/bin/cc
C Compiler Version: 10.2.1
CMake Prefix Path: /usr/local;/usr;/;/usr;/usr/local;/usr/X11R6;/usr/pkg;/opt

Relevant logs

No response

Labels

bug, triage
