[C++] Abstract aggregation kernel API

Before getting into the details of implementing particular aggregation types, we should first put a bit of energy into the abstract API for aggregating data in a multi-threaded setting.
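To make the discussion concrete, here is a minimal sketch of what such an abstract API could look like. It is not an existing Arrow interface: `Aggregator`, `NewState`, `Consume`, `Merge`, `Finalize`, and `ChunkedAggregate` are all hypothetical names, and values are assumed to be plain `double`s without nulls. The idea is that each chunk of input is consumed into its own partial state, and the partial states are merged at the end. The sketch runs the chunks sequentially; because each `Consume` call touches only its own state, those calls are the part that could be handed to separate threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical interface for illustration only.
class Aggregator {
 public:
  virtual ~Aggregator() = default;
  // Create a fresh, empty partial state of the same aggregation type.
  virtual std::unique_ptr<Aggregator> NewState() const = 0;
  // Fold a chunk of input into this partial state.
  virtual void Consume(const double* values, std::size_t length) = 0;
  // Combine another partial state of the same type into this one.
  virtual void Merge(const Aggregator& other) = 0;
  // Produce the final result from the accumulated state.
  virtual double Finalize() const = 0;
};

class SumAggregator final : public Aggregator {
 public:
  std::unique_ptr<Aggregator> NewState() const override {
    return std::make_unique<SumAggregator>();
  }
  void Consume(const double* values, std::size_t length) override {
    for (std::size_t i = 0; i < length; ++i) sum_ += values[i];
  }
  void Merge(const Aggregator& other) override {
    // Assumption of this sketch: both sides are SumAggregator.
    sum_ += static_cast<const SumAggregator&>(other).sum_;
  }
  double Finalize() const override { return sum_; }

 private:
  double sum_ = 0.0;
};

// Split the input into num_chunks pieces, consume each piece into its own
// state, then merge. The consume loop has no shared mutable data, so each
// iteration could be scheduled on a different thread.
double ChunkedAggregate(const Aggregator& prototype,
                        const std::vector<double>& values,
                        std::size_t num_chunks) {
  assert(num_chunks > 0);
  const std::size_t chunk = (values.size() + num_chunks - 1) / num_chunks;

  std::vector<std::unique_ptr<Aggregator>> states;
  for (std::size_t t = 0; t < num_chunks; ++t) {
    const std::size_t begin = std::min(t * chunk, values.size());
    const std::size_t end = std::min(begin + chunk, values.size());
    states.push_back(prototype.NewState());
    states.back()->Consume(values.data() + begin, end - begin);
  }

  // Merge phase: single-threaded, proportional to the number of chunks.
  for (std::size_t t = 1; t < states.size(); ++t) states[0]->Merge(*states[t]);
  return states[0]->Finalize();
}
```

Keeping `Merge` separate from `Consume` means the hot loop needs no locking; only the small merge step is serialized.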

Aggregators must support both hash/group modes (e.g. "group by" in SQL or data frame libraries) and non-group modes.
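As a rough illustration of the two modes, the sketch below reuses one per-state update rule for both. `SumState`, `AggregateAll`, and `AggregateByKey` are hypothetical names, and the string keys and `std::unordered_map` are assumptions made for brevity, not a proposal for the actual hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical aggregation state for SUM; other aggregations would supply
// their own state type with the same Update() shape.
struct SumState {
  double sum = 0.0;
  void Update(double value) { sum += value; }
};

// Non-group mode: a single state for the whole input.
template <typename State>
State AggregateAll(const std::vector<double>& values) {
  State state;
  for (double v : values) state.Update(v);
  return state;
}

// Group mode: one state per distinct key, looked up through a hash table.
template <typename State>
std::unordered_map<std::string, State> AggregateByKey(
    const std::vector<std::string>& keys, const std::vector<double>& values) {
  assert(keys.size() == values.size());
  std::unordered_map<std::string, State> states;
  for (std::size_t i = 0; i < values.size(); ++i) {
    // operator[] value-initializes the state the first time a key is seen.
    states[keys[i]].Update(values[i]);
  }
  return states;
}
```

For example, with keys `{"a", "b", "a"}` and values `{1, 2, 3}`, `AggregateByKey<SumState>` yields a sum of 4 for `"a"` and 2 for `"b"`, while `AggregateAll<SumState>` on the same values yields 6.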

Aggregations ideally should also support filter pushdown. For example:

```sql
select $AGG($EXPR)
from $TABLE
where $PREDICATE
```

Some systems materialize the filtered (post-predicate) version of `$EXPR` and then aggregate that; pandas does this, for example. Vectorized performance can be much improved by applying the filter inside the aggregation kernel instead, which avoids the intermediate copy. How the predicate's true/false values are handled may depend on the implementation details of the kernel (e.g. SUM or MEAN will be a bit different from PRODUCT).
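A small sketch of the difference, under simplifying assumptions: values are `double`s with no nulls, and the predicate has already been evaluated into a byte mask (`1` means the row passes). The function names are hypothetical. Filtered-out rows are replaced by the identity element of the aggregation, which is why SUM (identity 0) and PRODUCT (identity 1) differ; a MEAN would additionally need to count the rows that passed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Materialize-then-aggregate: copies the surviving values into a temporary
// buffer before summing them.
double MaterializeThenSum(const std::vector<double>& values,
                          const std::vector<std::uint8_t>& mask) {
  assert(values.size() == mask.size());
  std::vector<double> kept;
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (mask[i]) kept.push_back(values[i]);
  }
  double sum = 0.0;
  for (double v : kept) sum += v;
  return sum;
}

// Filter inside the kernel: one pass, no temporary. Rows that fail the
// predicate contribute the additive identity 0.
double FilteredSum(const std::vector<double>& values,
                   const std::vector<std::uint8_t>& mask) {
  assert(values.size() == mask.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    sum += mask[i] ? values[i] : 0.0;
  }
  return sum;
}

// Same idea for PRODUCT, where failing rows contribute the multiplicative
// identity 1 instead.
double FilteredProduct(const std::vector<double>& values,
                       const std::vector<std::uint8_t>& mask) {
  assert(values.size() == mask.size());
  double product = 1.0;
  for (std::size_t i = 0; i < values.size(); ++i) {
    product *= mask[i] ? values[i] : 1.0;
  }
  return product;
}
```

The select form (`mask[i] ? value : identity`) is used rather than multiplying by the mask, so that a filtered-out NaN or infinity cannot leak into the result.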

**Reporter**: [Wes McKinney](https://issues.apache.org/jira/browse/ARROW-4124) / @wesm
**Assignee**: [Francois Saint-Jacques](https://issues.apache.org/jira/browse/ARROW-4124) / @fsaintjacques
#### Related issues:
- [[C++] Parallelize execution of ScalarAggregateFunction](https://github.com/apache/arrow/issues/19473) (relates to)
- [[C++]  Mean kernel aggregate](https://github.com/apache/arrow/issues/19474) (relates to)
- [[C++] Incremental Count, Count Not Null aggregator](https://github.com/apache/arrow/issues/15782) (relates to)
- [[C++] Incremental Variance, Standard Deviation aggregators](https://github.com/apache/arrow/issues/19475) (relates to)
#### PRs and other links:
- [GitHub Pull Request #3407](https://github.com/apache/arrow/pull/3407)

<sub>**Note**: *This issue was originally created as [ARROW-4124](https://issues.apache.org/jira/browse/ARROW-4124). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>

[C++] Abstract aggregation kernel API #20713
