Skip to content

opt: reduce allocations in props.histogramIter#93743

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
mgartner:histogram-reduce-allocation-try-3
Dec 16, 2022
Merged

opt: reduce allocations in props.histogramIter#93743
craig[bot] merged 1 commit intocockroachdb:masterfrom
mgartner:histogram-reduce-allocation-try-3

Conversation

@mgartner
Copy link
Copy Markdown
Contributor

This commit reduces allocations in props.histogramIter by delaying datum allocations for bucket bounds until absolutely necessary. Instead, we prefer using the upper bound of the previous bucket as an exclusive lower bound of the current bucket because no datum allocation is necessary.

This significantly reduces the number allocations, especially when filtering a histogram by a span that includes many buckets. It reduces the number of bytes allocated, especially when the histogram bounds are larger values, such as strings larger than several bytes. The benefits can be observed in one of the optimizer micro-benchmarks:

name                                                                    old time/op    new time/op    delta
Phases/single-col-histogram-range/Simple/Parse-16                         13.1µs ±19%    10.5µs ± 2%  -19.66%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/OptBuildNoNorm-16                95.5µs ± 3%    77.4µs ±36%     ~     (p=0.190 n=4+5)
Phases/single-col-histogram-range/Simple/OptBuildNorm-16                   144µs ± 5%     108µs ± 3%  -24.76%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/Explore-16                        176µs ±17%     114µs ± 2%  -34.93%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/ExecBuild-16                      161µs ± 2%     121µs ± 1%  -24.61%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNoNorm-16    56.2µs ± 2%    41.8µs ± 5%  -25.69%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNorm-16      55.8µs ± 3%    42.1µs ± 3%  -24.53%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/Explore-16                     65.0µs ± 8%    50.5µs ± 5%  -22.24%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/ExecBuild-16                   73.0µs ± 3%    57.4µs ± 7%  -21.48%  (p=0.008 n=5+5)

name                                                                    old alloc/op   new alloc/op   delta
Phases/single-col-histogram-range/Simple/Parse-16                         3.25kB ± 0%    3.25kB ± 0%     ~     (all equal)
Phases/single-col-histogram-range/Simple/OptBuildNoNorm-16                33.5kB ± 0%    17.9kB ± 0%  -46.52%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/OptBuildNorm-16                  57.4kB ± 0%    26.2kB ± 0%  -54.37%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/Explore-16                       60.2kB ± 0%    29.0kB ± 0%  -51.81%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/ExecBuild-16                     61.7kB ± 0%    30.5kB ± 0%  -50.56%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNoNorm-16    26.0kB ± 0%    10.5kB ± 0%  -59.83%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNorm-16      26.0kB ± 0%    10.5kB ± 0%  -59.83%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/Explore-16                     28.3kB ± 0%    12.7kB ± 0%  -55.07%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/ExecBuild-16                   29.8kB ± 0%    14.2kB ± 0%  -52.32%  (p=0.008 n=5+5)

name                                                                    old allocs/op  new allocs/op  delta
Phases/single-col-histogram-range/Simple/Parse-16                           20.0 ± 0%      20.0 ± 0%     ~     (all equal)
Phases/single-col-histogram-range/Simple/OptBuildNoNorm-16                   608 ± 0%       282 ± 0%  -53.62%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/OptBuildNorm-16                   1.15k ± 0%     0.50k ± 0%  -56.55%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/Explore-16                        1.16k ± 0%     0.51k ± 0%  -56.06%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/ExecBuild-16                      1.17k ± 0%     0.52k ± 0%  -55.63%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNoNorm-16       564 ± 0%       238 ± 0%  -57.80%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNorm-16         564 ± 0%       238 ± 0%  -57.80%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/Explore-16                        573 ± 0%       247 ± 0%  -56.89%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/ExecBuild-16                      582 ± 0%       256 ± 0%  -56.01%  (p=0.008 n=5+5)

Fixes #89982

Epic: None

Release note: None

@mgartner mgartner requested review from michae2 and rytaft December 15, 2022 20:43
@mgartner mgartner requested a review from a team as a code owner December 15, 2022 20:43
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

│ ├── stats: [rows=3063.5, distinct(2)=11, null(2)=0]
│ │ histogram(2)= 0 0 0.0024693 3063.5
│ │ <--- NULL ----------- true
│ ├── stats: [rows=3063.5, distinct(2)=1, null(2)=0]
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure exactly why this changed, but it looks like the distinct count is correctly 1 now instead of 11.

Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement!

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @michae2)


pkg/sql/opt/props/histogram.go line 603 at r1 (raw file):

func (hi *histogramIter) inclusiveUpperBound() tree.Datum {
	if hi.ub == nil {
		hi.ub = hi.h.getNextLowerBound(hi.eub)

this seems a bit odd -- the exclusive upper bound is actually less than the inclusive upper bound?

makes me slightly nervous about the call to PreferInclusive above

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always great to see those allocations decrease!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @rytaft)


pkg/sql/opt/props/histogram.go line 338 at r1 (raw file):

			// to make it easier to work with.
			cmpStarts := filteredSpan.CompareStarts(&keyCtx, &left)
			filteredSpan.PreferInclusive(&keyCtx)

This doesn't work on all types, for example, decimals can't be handled at all and strings can only make the start bound inclusive. Is that handled by the following logic? If so, a comment would be helpful.


pkg/sql/opt/props/histogram.go line 603 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

this seems a bit odd -- the exclusive upper bound is actually less than the inclusive upper bound?

makes me slightly nervous about the call to PreferInclusive above

Right, do we want to be calling Prev here instead of Next to make it inclusive?

This commit reduces allocations in props.histogramIter by delaying datum
allocations for bucket bounds until absolutely necessary. Instead, we
prefer using the upper bound of the previous bucket as an exclusive
lower bound of the current bucket because no datum allocation is
necessary.

This significantly reduces the number allocations, especially when
filtering a histogram by a span that includes many buckets. It reduces
the number of bytes allocated, especially when the histogram bounds are
larger values, such as strings larger than several bytes. The benefits
can be observed in one of the optimizer micro-benchmarks:

```
name                                                                    old time/op    new time/op    delta
Phases/single-col-histogram-range/Simple/Parse-16                         13.1µs ±19%    10.5µs ± 2%  -19.66%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/OptBuildNoNorm-16                95.5µs ± 3%    77.4µs ±36%     ~     (p=0.190 n=4+5)
Phases/single-col-histogram-range/Simple/OptBuildNorm-16                   144µs ± 5%     108µs ± 3%  -24.76%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/Explore-16                        176µs ±17%     114µs ± 2%  -34.93%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/ExecBuild-16                      161µs ± 2%     121µs ± 1%  -24.61%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNoNorm-16    56.2µs ± 2%    41.8µs ± 5%  -25.69%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNorm-16      55.8µs ± 3%    42.1µs ± 3%  -24.53%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/Explore-16                     65.0µs ± 8%    50.5µs ± 5%  -22.24%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/ExecBuild-16                   73.0µs ± 3%    57.4µs ± 7%  -21.48%  (p=0.008 n=5+5)

name                                                                    old alloc/op   new alloc/op   delta
Phases/single-col-histogram-range/Simple/Parse-16                         3.25kB ± 0%    3.25kB ± 0%     ~     (all equal)
Phases/single-col-histogram-range/Simple/OptBuildNoNorm-16                33.5kB ± 0%    17.9kB ± 0%  -46.52%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/OptBuildNorm-16                  57.4kB ± 0%    26.2kB ± 0%  -54.37%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/Explore-16                       60.2kB ± 0%    29.0kB ± 0%  -51.81%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/ExecBuild-16                     61.7kB ± 0%    30.5kB ± 0%  -50.56%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNoNorm-16    26.0kB ± 0%    10.5kB ± 0%  -59.83%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNorm-16      26.0kB ± 0%    10.5kB ± 0%  -59.83%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/Explore-16                     28.3kB ± 0%    12.7kB ± 0%  -55.07%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/ExecBuild-16                   29.8kB ± 0%    14.2kB ± 0%  -52.32%  (p=0.008 n=5+5)

name                                                                    old allocs/op  new allocs/op  delta
Phases/single-col-histogram-range/Simple/Parse-16                           20.0 ± 0%      20.0 ± 0%     ~     (all equal)
Phases/single-col-histogram-range/Simple/OptBuildNoNorm-16                   608 ± 0%       282 ± 0%  -53.62%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/OptBuildNorm-16                   1.15k ± 0%     0.50k ± 0%  -56.55%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/Explore-16                        1.16k ± 0%     0.51k ± 0%  -56.06%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Simple/ExecBuild-16                      1.17k ± 0%     0.52k ± 0%  -55.63%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNoNorm-16       564 ± 0%       238 ± 0%  -57.80%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNorm-16         564 ± 0%       238 ± 0%  -57.80%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/Explore-16                        573 ± 0%       247 ± 0%  -56.89%  (p=0.008 n=5+5)
Phases/single-col-histogram-range/Prepared/ExecBuild-16                      582 ± 0%       256 ± 0%  -56.01%  (p=0.008 n=5+5)
```

Fixes cockroachdb#89982

Epic: None

Release note: None
@mgartner mgartner force-pushed the histogram-reduce-allocation-try-3 branch from 84ee141 to 2ae6439 Compare December 16, 2022 16:11
Copy link
Copy Markdown
Contributor Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @michae2, and @rytaft)


pkg/sql/opt/props/histogram.go line 338 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

This doesn't work on all types, for example, decimals can't be handled at all and strings can only make the start bound inclusive. Is that handled by the following logic? If so, a comment would be helpful.

PreferInclusive uses Datum.Next just like getNextLowerBound did, so I don't think anything has changed here. I've added a comment noting that this is best-effort. Let me know if you think it deserves more commentary.


pkg/sql/opt/props/histogram.go line 603 at r1 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Right, do we want to be calling Prev here instead of Next to make it inclusive?

The iter.desc=true cases are very confusing and it took me a while to understand them. But I do believe the behavior is the same as before.

hi.ub is only nil if iter.desc=true which means we're iterating over the buckets in reverse and the upper bounds are actually less than the lower bounds. For example, if the current bucket covers the range (/10 - /20] and iter.desc=true then hi.lb will be 20 and hi.eub will be 10. getNextLowerBound(10) is 11, so hi.ub becomes 11, correctly representing an inclusive range [/20 - /11].

I've added a handful of comments to better explain things.

Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DrewKimball and @michae2)


pkg/sql/opt/props/histogram.go line 603 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

The iter.desc=true cases are very confusing and it took me a while to understand them. But I do believe the behavior is the same as before.

hi.ub is only nil if iter.desc=true which means we're iterating over the buckets in reverse and the upper bounds are actually less than the lower bounds. For example, if the current bucket covers the range (/10 - /20] and iter.desc=true then hi.lb will be 20 and hi.eub will be 10. getNextLowerBound(10) is 11, so hi.ub becomes 11, correctly representing an inclusive range [/20 - /11].

I've added a handful of comments to better explain things.

thanks for the explanation!

Copy link
Copy Markdown
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @mgartner and @michae2)


pkg/sql/opt/props/histogram.go line 338 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

PreferInclusive uses Datum.Next just like getNextLowerBound did, so I don't think anything has changed here. I've added a comment noting that this is best-effort. Let me know if you think it deserves more commentary.

Thanks!

@mgartner
Copy link
Copy Markdown
Contributor Author

TFTRs!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 16, 2022

Build failed (retrying...):

Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Very nice!!

Reviewed 1 of 2 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 3 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @rytaft)

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Dec 16, 2022

Build succeeded:

@craig craig bot merged commit cc63f9f into cockroachdb:master Dec 16, 2022
@mgartner mgartner deleted the histogram-reduce-allocation-try-3 branch December 16, 2022 23:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql/opt: excessive allocations in histogram interactions

5 participants