Skip to content

Conversation

@lidavidm
Copy link
Member

@lidavidm lidavidm commented Aug 6, 2021

Pandas calls this 'size'. I opted not to extend ScalarAggregateOptions as the option wouldn't apply to any other kernel.

@github-actions
Copy link

github-actions bot commented Aug 6, 2021

@pitrou
Copy link
Member

pitrou commented Aug 9, 2021

Hmm... It's weird that Count takes ScalarAggregateOptions, yet 1) completely ignores min_count 2) has an unexpected interpretation of skip_nulls.

Can we instead make Count take a CountOptions that would have a enum describing whether one wants to 1) count non-nulls 2) count nulls 3) count everything?

@lidavidm
Copy link
Member Author

Rebased, fixed conflicts, added CountOptions, added GLib/Python/R bindings for CountOptions.

@lidavidm lidavidm changed the title ARROW-13574: [C++] Add count_all and hash_count_all kernels ARROW-13574: [C++] Add 'count all' option to count kernels Aug 10, 2021
Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments, looks good in general.

/// Count only non-null values.
NON_NULL = 0,
/// Count only null values.
NULLS,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but the plural / singular is a bit inconsistent. Perhaps ONLY_NULL vs ONLY_VALID (or NULLS vs NON_NULLS, or NULL_VALUES vs VALID_VALUES...)?

*out = Datum(state.nulls);
break;
case CountOptions::ALL:
*out = Datum(state.non_nulls + state.nulls);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, it's probably not necessary to popcount the null bitmap?


const FunctionDoc count_doc{"Count the number of null / non-null values",
("By default, only non-null values are counted.\n"
"This can be changed through ScalarAggregateOptions."),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"CountOptions"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants