Skip to content

Conversation

@lidavidm
Copy link
Member

@lidavidm lidavidm commented Aug 4, 2021

This reuses the Grouper to add kernels for count_distinct and distinct. The implementation for distinct is very simple and not particularly efficient, leveraging Concatenate to build the output list. Neither kernel handles larger-than-memory processing.

@github-actions
Copy link

github-actions bot commented Aug 4, 2021

@lidavidm
Copy link
Member Author

lidavidm commented Aug 4, 2021

Actually, if we want to keep the distinct kernel, either the grouper could be adjusted to build the output list directly, or we should use Take instead of Concatenate.

@lidavidm
Copy link
Member Author

lidavidm commented Aug 4, 2021

Changed to use Take() to assemble the final results instead of Concatenate().

Copy link
Member

@bkietz bkietz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good, just a few high level comments and nits

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's annoying.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michalursa what do you think of adding Grouper::Reserve(additional_capacity_hint)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The keys here are pairs of [value, group_id] so just having the number of groups doesn't give much of a capacity hint.

@bkietz
Copy link
Member

bkietz commented Aug 23, 2021

@lidavidm needs rebase

@lidavidm
Copy link
Member Author

Rebased, thanks for the heads up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: (value, group_id) pairs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Thanks, fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants