colexec: hash aggregator doesn't maintain the partial ordering when spilling to disk #63159

@yuzefovich

Currently, the vectorized hash aggregator doesn't maintain the partial ordering if it has to spill to disk. Consider the following logic test, which fails on the fakedist-disk config:

statement ok
create table ab (a int, b int, index(a) storing (b));
insert into ab values (1,1),(3,3),(2,2),(5,5),(0,0),(1,1);

query III
select a, b, count(*) from ab group by a,b order by a
----
0  0  1
1  1  2
2  2  1
3  3  1
5  5  1

The issue is present only in 21.1 because disk spilling was not supported before this release. There are several possible ways to mitigate this problem; as a first step, I will look into supporting partial ordering in the external hash aggregator.
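
To illustrate the general failure mode, here is a toy Python sketch. It is not CockroachDB code. The function names, the two-partition setup, and the `a % num_partitions` stand-in for a hash function are all hypothetical, and it assumes that the in-memory aggregator emits groups in first-seen order while a spilling aggregator processes one partition at a time.

```python
def in_memory_hash_agg(rows):
    """Count rows per (a, b) group, emitting groups in first-seen order."""
    counts = {}  # dicts preserve insertion order in Python 3.7+
    for a, b in rows:
        counts[(a, b)] = counts.get((a, b), 0) + 1
    return [(a, b, n) for (a, b), n in counts.items()]


def toy_partition(a, b, num_partitions):
    """Hypothetical stand-in for hashing the grouping columns."""
    return a % num_partitions


def spilling_hash_agg(rows, num_partitions=2):
    """Toy model of spilling: partition the input by group key,
    then aggregate each partition independently, one after another."""
    partitions = [[] for _ in range(num_partitions)]
    for a, b in rows:
        partitions[toy_partition(a, b, num_partitions)].append((a, b))
    out = []
    for part in partitions:
        out.extend(in_memory_hash_agg(part))
    return out


# Rows as an index on `a` would deliver them: already ordered by a.
rows = [(0, 0), (1, 1), (1, 1), (2, 2), (3, 3), (5, 5)]

in_mem = in_memory_hash_agg(rows)
spilled = spilling_hash_agg(rows)

print([r[0] for r in in_mem])   # ordering on a survives
print([r[0] for r in spilled])  # ordering on a is lost across partitions
```

Both variants produce the same set of groups and counts; only the output order differs. In this toy model the spilled output would need either an ordered merge across partitions or an explicit sort on top to restore the ordering on `a`.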

Metadata

Labels

C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
GA-blocker

Projects

Status

Done
