[C++] Implement Hash Aggregation query execution node #21501

@asfimport

Dear all,

I wonder what the best way forward is for implementing GroupBy kernels. Initially this was part of https://issues.apache.org/jira/browse/ARROW-4124 but, as far as I can tell, it is not contained in the current implementation.

It seems that the part of group by that just returns indices could be conveniently implemented with the HashKernel. That seems useful in any case. Is that indeed the best way forward/should this be done?
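
To make the "group by that just returns indices" idea concrete, here is a minimal sketch in plain standard C++. It does not use the Arrow API: `Grouping` and `GroupIds` are hypothetical names chosen for illustration, and `std::unordered_map` stands in for whatever hash table a real kernel would use. Each distinct key gets a dense group id in first-seen order, much like a dictionary encoding of the key column.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result type for illustration; not an Arrow type.
struct Grouping {
  std::vector<std::string> unique_keys;  // one entry per group, first-seen order
  std::vector<int32_t> group_ids;        // group_ids[i] indexes unique_keys for row i
};

// Assign every row a dense group id by hashing its key.
Grouping GroupIds(const std::vector<std::string>& keys) {
  Grouping out;
  out.group_ids.reserve(keys.size());
  std::unordered_map<std::string, int32_t> seen;
  for (const std::string& key : keys) {
    // emplace only inserts when the key is new; otherwise it returns
    // an iterator to the existing entry.
    const int32_t next_id = static_cast<int32_t>(out.unique_keys.size());
    auto res = seen.emplace(key, next_id);
    if (res.second) out.unique_keys.push_back(key);
    out.group_ids.push_back(res.first->second);
  }
  // Postcondition: exactly one group id per input row.
  assert(out.group_ids.size() == keys.size());
  return out;
}
```

The group ids can then be handed to any later step (for example a gather or an aggregation) without looking at the keys again.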

GroupBy + Aggregate could then be implemented in one of two ways: either by combining that with the Take kernel and a separate aggregation, which involves more memory copies than necessary, or as part of the aggregate kernel itself. Probably the latter is preferred; any thoughts on that?
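
The two options can be contrasted with a small sketch in plain standard C++. Again, this is not the Arrow API: `GroupedSums`, `SumViaTake` and `SumFused` are hypothetical names, the aggregate is fixed to a sum, and the inner copy loop only imitates what a Take kernel would do. The first function materializes each group's values before summing them; the second updates a per-group accumulator in a single pass and never copies the values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result type for illustration; not an Arrow type.
struct GroupedSums {
  std::vector<std::string> keys;  // one entry per group, first-seen order
  std::vector<int64_t> sums;      // sums[g] is the total for keys[g]
};

// Option 1: group indices, then "take" each group's rows, then aggregate.
GroupedSums SumViaTake(const std::vector<std::string>& keys,
                       const std::vector<int64_t>& values) {
  assert(keys.size() == values.size());
  GroupedSums out;
  std::unordered_map<std::string, std::size_t> ids;
  std::vector<std::vector<std::size_t>> rows_per_group;
  for (std::size_t i = 0; i < keys.size(); ++i) {
    auto res = ids.emplace(keys[i], out.keys.size());
    if (res.second) {
      out.keys.push_back(keys[i]);
      rows_per_group.emplace_back();
    }
    rows_per_group[res.first->second].push_back(i);
  }
  for (const std::vector<std::size_t>& rows : rows_per_group) {
    // This copy is the extra memory traffic mentioned above.
    std::vector<int64_t> taken;
    taken.reserve(rows.size());
    for (std::size_t r : rows) taken.push_back(values[r]);
    int64_t sum = 0;
    for (int64_t v : taken) sum += v;
    out.sums.push_back(sum);
  }
  return out;
}

// Option 2: aggregate while hashing, with one accumulator per group.
GroupedSums SumFused(const std::vector<std::string>& keys,
                     const std::vector<int64_t>& values) {
  assert(keys.size() == values.size());
  GroupedSums out;
  std::unordered_map<std::string, std::size_t> ids;
  for (std::size_t i = 0; i < keys.size(); ++i) {
    auto res = ids.emplace(keys[i], out.keys.size());
    if (res.second) {
      out.keys.push_back(keys[i]);
      out.sums.push_back(0);
    }
    out.sums[res.first->second] += values[i];
  }
  return out;
}
```

Both functions return the same result; the fused version simply avoids the per-group index lists and the copied value buffers, which is the argument for doing the work inside the aggregate kernel.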

Am I missing any other JIRAs related to this?

Best, Philipp.

Reporter: Philipp Moritz / @pcmoritz

Note: This issue was originally created as ARROW-5002. Please see the migration documentation for further details.
