Skip to content

colexecagg: fix memory accounting for decimals with avg and sum#57318

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:fix-agg-acc
Dec 2, 2020
Merged

colexecagg: fix memory accounting for decimals with avg and sum#57318
craig[bot] merged 1 commit intocockroachdb:masterfrom
yuzefovich:fix-agg-acc

Conversation

@yuzefovich
Copy link
Copy Markdown
Member

@yuzefovich yuzefovich commented Dec 1, 2020

Previously, we were not calling PerformOperation to account for the
memory used by the decimals (which are of variable size), and this is
now fixed. Additionally, this commit refactors several other templates to
be more similar to each other.

Note that we decided to call PerformOperation on all of the aggregate
functions even if we know that all types that the functions operate on
are of fixed size and that call is not strictly necessary. However, this
has negligible impact on performance in most cases and makes the layout
of Compute methods of all functions the same ensuring that we don't
forget these calls in the future.

Additionally, this commit adds capturing of the column to force bounds
check to work for min/max and sum aggregates.

Release note: None

@yuzefovich yuzefovich requested review from a team and asubiotto December 1, 2020 19:17
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@yuzefovich
Copy link
Copy Markdown
Member Author

Thoughts on calling PerformOperation in all cases? It'll somewhat unnecessarily increase the size of some aggregates (which matters in case of hash aggregation with large number of groups), but we'll get uniformity so that we don't need to think whether we need to call the method or not.

@yuzefovich
Copy link
Copy Markdown
Member Author

Benchmarks are here.

@yuzefovich
Copy link
Copy Markdown
Member Author

Ran the benchmarks if we call PerformOperation for all aggregate functions, and they show up to 10% drop on small datasets (mostly in case of hash aggregation). I probably lean towards making memory accounting more robust and accepting that perf hit. Thoughts?

Copy link
Copy Markdown
Contributor

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouldn't compromise on stability for performance.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)

Previously, we were not calling `PerformOperation` to account for the
memory used by the decimals (which are of variable size), and this is
now fixed. Additionally, this commit refactors several other templates to
be more similar to each other.

Note that we decided to call `PerformOperation` on all of the aggregate
functions even if we know that all types that the functions operate on
are of fixed size and that call is not strictly necessary. However, this
has negligible impact on performance in most cases and makes the layout
of `Compute` methods of all functions the same ensuring that we don't
forget these calls in the future.

Additionally, this commit adds capturing of the column to force bounds
check to work for min/max and sum aggregates.

Release note: None
@yuzefovich
Copy link
Copy Markdown
Member Author

I agree, updated the commit. The newest benchmarks are here.

TFTR!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 2, 2020

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 2, 2020

Build succeeded:

@craig craig bot merged commit d3f73b9 into cockroachdb:master Dec 2, 2020
@yuzefovich yuzefovich deleted the fix-agg-acc branch December 2, 2020 19:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants