opt: reduce allocations in Histogram.DistinctValuesCount#93482
opt: reduce allocations in Histogram.DistinctValuesCount#93482craig[bot] merged 4 commits intocockroachdb:masterfrom
Conversation
6a3e95a to
aa832e1
Compare
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r1, 1 of 1 files at r2, 1 of 1 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rharding6373)
pkg/sql/opt/props/histogram.go line 188 at r4 (raw file):
// possible to determine a finite value (which is the case for all types other // than integers and dates). func maxDistinctValuesInRange(lowerBound, upperBound tree.Datum) (n float64, ok bool) {
Not directly related to this PR, but should we be using tree.MaxDistinctCount here?
aa832e1 to
feed550
Compare
|
Previously, DrewKimball (Drew Kimball) wrote…
Ya we probably should. I've left a TODO and will look into in a follow-up PR if I get a chance. |
rharding6373
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r1, 1 of 1 files at r2, 1 of 1 files at r3, 1 of 1 files at r5, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball)
Setting the default session settings makes the micro-benchmarks more representative of real-world workloads. Epic: None Release note: None
Histograms and histogram types are now cached in the test catalog. This prevents computation and allocations made while building these histograms from dominating profiles in optimizer micro-benchmarks. Epic: None Release note: None
Epic: None Release note: None
This commit reduces allocations in `Histogram.DistinctValuesCount`. When calculating the maximum number of distinct values in a histogram bucket, we no longer allocate a datum representing the lower bound and then calculate the number of distinct values in the range [lowerBound, upperBound). Instead, we use the previous bucket's upper bound and calculate the number of distinct values in the range [previousUpperBound, upperBound) and subtract one to exclude the previous bucket's upper bound. This should significantly reduce allocations for some workloads, including YCSB/E. The benefits can be observed in one of the optimizer micro-benchmarks: ``` name old time/op new time/op delta Phases/single-col-histogram-range/Simple/Parse-16 10.7µs ± 6% 10.3µs ± 2% -4.00% (p=0.032 n=5+5) Phases/single-col-histogram-range/Simple/OptBuildNoNorm-16 87.9µs ± 5% 75.6µs ± 2% -13.99% (p=0.008 n=5+5) Phases/single-col-histogram-range/Simple/OptBuildNorm-16 138µs ± 7% 115µs ± 6% -16.65% (p=0.008 n=5+5) Phases/single-col-histogram-range/Simple/Explore-16 146µs ± 6% 125µs ± 8% -14.76% (p=0.008 n=5+5) Phases/single-col-histogram-range/Simple/ExecBuild-16 154µs ± 5% 143µs ± 4% -7.57% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNoNorm-16 56.4µs ± 2% 47.7µs ± 2% -15.41% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNorm-16 55.7µs ± 1% 47.0µs ± 1% -15.59% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/Explore-16 63.5µs ± 2% 53.7µs ± 1% -15.43% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/ExecBuild-16 69.1µs ± 1% 60.7µs ± 3% -12.23% (p=0.008 n=5+5) name old alloc/op new alloc/op delta Phases/single-col-histogram-range/Simple/Parse-16 3.25kB ± 0% 3.25kB ± 0% ~ (all equal) Phases/single-col-histogram-range/Simple/OptBuildNoNorm-16 48.0kB ± 0% 33.5kB ± 0% -30.24% (p=0.008 n=5+5) Phases/single-col-histogram-range/Simple/OptBuildNorm-16 86.4kB ± 0% 57.4kB ± 0% -33.62% (p=0.008 n=5+5) Phases/single-col-histogram-range/Simple/Explore-16 89.2kB ± 0% 60.2kB ± 0% -32.55% (p=0.008 n=5+5) Phases/single-col-histogram-range/Simple/ExecBuild-16 90.8kB ± 0% 61.7kB ± 0% -32.02% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNoNorm-16 40.6kB ± 0% 26.0kB ± 0% -35.78% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNorm-16 40.6kB ± 0% 26.0kB ± 0% -35.78% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/Explore-16 42.8kB ± 0% 28.3kB ± 0% -33.90% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/ExecBuild-16 44.3kB ± 0% 29.8kB ± 0% -32.76% (p=0.008 n=5+5) name old allocs/op new allocs/op delta Phases/single-col-histogram-range/Simple/Parse-16 20.0 ± 0% 20.0 ± 0% ~ (all equal) Phases/single-col-histogram-range/Simple/OptBuildNoNorm-16 911 ± 0% 608 ± 0% -33.26% (p=0.008 n=5+5) Phases/single-col-histogram-range/Simple/OptBuildNorm-16 1.76k ± 0% 1.15k ± 0% -34.45% (p=0.008 n=5+5) Phases/single-col-histogram-range/Simple/Explore-16 1.77k ± 0% 1.16k ± 0% -34.26% (p=0.008 n=5+5) Phases/single-col-histogram-range/Simple/ExecBuild-16 1.78k ± 0% 1.17k ± 0% -34.08% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNoNorm-16 867 ± 0% 564 ± 0% -34.95% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/AssignPlaceholdersNorm-16 867 ± 0% 564 ± 0% -34.95% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/Explore-16 876 ± 0% 573 ± 0% -34.59% (p=0.008 n=5+5) Phases/single-col-histogram-range/Prepared/ExecBuild-16 885 ± 0% 582 ± 0% -34.24% (p=0.008 n=5+5) ``` Informs cockroachdb#89982 Epic: None Release note: None
feed550 to
753b327
Compare
|
bors r+ |
|
Build succeeded: |
opt: set default session settings in micro-benchmarks
Setting the default session settings makes the micro-benchmarks more
representative of real-world workloads.
Epic: None
Release note: None
opt: cache histograms in test catalog
Histograms and histogram types are now cached in the test catalog. This
prevents computation and allocations made while building these
histograms from dominating profiles in optimizer micro-benchmarks.
Epic: None
Release note: None
opt: add benchmark for single-column table with histogram
Epic: None
Release note: None
opt: reduce allocations in Histogram.DistinctValuesCount
This commit reduces allocations in
Histogram.DistinctValuesCount.When calculating the maximum number of distinct values in a histogram
bucket, we no longer allocate a datum representing the lower bound and
then calculate the number of distinct values in the range
[lowerBound, upperBound). Instead, we use the previous bucket's upper
bound and calculate the number of distinct values in the range
[previousUpperBound, upperBound) and subtract one to exclude the
previous bucket's upper bound.
This should significantly reduce allocations for some workloads,
including YCSB/E. The benefits can be observed in one of the optimizer
micro-benchmarks:
Informs #89982
Epic: None
Release note: None