The previous implementation of hash group by converts input ExecBatches to row-oriented format,
then hashes and compares the encoded rows as if they were values of a single column.
It is more efficient (especially for a small number of key columns) to avoid the relatively costly
encoding. Instead, hashes can be computed for individual columns in column-oriented format and mixed together, and column-oriented input can be compared directly against the row-oriented data in the hash table without converting it.
Encoding then happens only for the subset of input rows that are inserted into the hash table, that is, the rows that introduce new groups.
Keys in the hash table remain stored in row-oriented format.
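The approach described above can be illustrated with a minimal Python sketch. It is not the Arrow C++ implementation: the function names (`hash_column`, `combine_hashes`, `row_equals`, `encode_row`, `group_ids`), the mixing formula (a boost-style hash combine), and the use of a Python dict as the hash table are all assumptions made for illustration. Hashes are computed one column at a time and mixed, input rows are compared against stored row-oriented keys without being encoded first, and a row is encoded only when it introduces a new group.

```python
MASK32 = 0xFFFFFFFF


def hash_column(values):
    """Hash each value of a single column (column-oriented pass)."""
    return [hash(v) & MASK32 for v in values]


def combine_hashes(acc, col_hashes):
    """Mix per-column hashes into running per-row hashes.

    Uses a boost-style hash_combine formula; this choice is an assumption.
    """
    return [
        (a ^ (h + 0x9E3779B9 + (a << 6) + (a >> 2))) & MASK32
        for a, h in zip(acc, col_hashes)
    ]


def row_equals(columns, i, stored_row):
    """Compare column-oriented input row i with a stored row-oriented key.

    The input row is never converted to row-oriented format here.
    """
    return all(col[i] == stored_row[k] for k, col in enumerate(columns))


def encode_row(columns, i):
    """Row-oriented encoding of input row i (a plain tuple in this sketch)."""
    return tuple(col[i] for col in columns)


def group_ids(columns):
    """Assign a group id to every input row of a batch of key columns."""
    num_rows = len(columns[0])

    # Hash column by column, mixing into one hash per row.
    hashes = [0] * num_rows
    for col in columns:
        hashes = combine_hashes(hashes, hash_column(col))

    buckets = {}      # hash -> list of group ids sharing that hash
    stored_keys = []  # row-oriented keys, one per group
    ids = []
    for i in range(num_rows):
        candidates = buckets.setdefault(hashes[i], [])
        for gid in candidates:
            if row_equals(columns, i, stored_keys[gid]):
                ids.append(gid)
                break
        else:
            # New group: only now is the row encoded and stored.
            gid = len(stored_keys)
            stored_keys.append(encode_row(columns, i))
            candidates.append(gid)
            ids.append(gid)
    return ids, stored_keys


key_columns = [[1, 2, 1, 3, 2], ["a", "b", "a", "a", "b"]]
ids, keys = group_ids(key_columns)
print(ids)   # [0, 1, 0, 2, 1]
print(keys)  # [(1, 'a'), (2, 'b'), (3, 'a')]
```

In this example five input rows produce three groups, so `encode_row` runs three times rather than five; the remaining rows are resolved by hash lookup and column-wise comparison alone.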
Reporter: Michal Nowakiewicz / @michalursa
Assignee: Michal Nowakiewicz / @michalursa
Related issues:
PRs and other links:
Note: This issue was originally created as ARROW-12725. Please see the migration documentation for further details.