util: optimize FastIntSet#86053
Conversation
Release note: None
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @nvanbenschoten)
pkg/util/fast_int_set.go line 103 at r2 (raw file):
// added to both the large and small sets. func (s *FastIntSet) Add(i int) { withinSmallBounds := i >= 0 && i < smallCutoff
[nit] This condition can be inlined now.
pkg/util/fast_int_set.go line 170 at r2 (raw file):
// Len returns the number of the elements in the set. func (s FastIntSet) Len() int { len := s.small.OnesCount()
[nit] feels weird to use len as a variable name; could you use length instead?
`util.FastIntSet` now stores values in the range `[0, 128)` only in its
`small` bitmap, and never in its `large *intsets.Sparse`. This better
utilizes memory allocated for `large` and can significantly reduce
allocations in some cases. Prior to this change, the set `{0, 200, 300}`
would require two allocations. After this change, only a single
allocation is required. See the comments added in the code for more
details.
This change also reduces the complexity of `fitsInSmall` which could
account for up to 16% of CPU-time in some optimizer benchmarks. This
function previously used `intsets.Sparse.Min` and `intsets.Sparse.Max`
whenever `large` was allocated, which accounted for the majority of
CPU-time spent in `fitsInSmall`. Now, `fitsInSmall` can simply check
whether `large` is empty, which is a cheaper operation.
Release note: None
Previously, `FastIntSet.Copy` and `FastIntSet.CopyFrom` would always allocate new `large` sets if the set being copied had a non-nil `large` set. However, this set could be non-nil and empty in cases were elements were removed from a set, e.g., with `Remove` or `IntersectionWith`. This caused unnecessary allocations for `large` sets that ended up being empty. Now, the `large` set is only allocated if necessary. Release note: None
`FastIntSet.toLarge` is now so simple that it doesn't provide any significant benefit to the readability of the code. It has been removed. `FastIntSet.DifferenceWith` now avoids a potential allocation if `rhs` has a nil `large` set. Release note: None
24aa613 to
e22a528
Compare
mgartner
left a comment
There was a problem hiding this comment.
Have you run the slow queries benchmark to see if there's any difference there?
They show some improvement:
name old time/op new time/op delta
SlowQueries/slow-query-1-16 35.6ms ± 4% 31.9ms ± 9% -10.38% (p=0.008 n=5+5)
SlowQueries/slow-query-2-16 597ms ± 5% 552ms ± 2% -7.56% (p=0.008 n=5+5)
name old alloc/op new alloc/op delta
SlowQueries/slow-query-1-16 13.3MB ± 0% 12.9MB ± 0% -2.71% (p=0.008 n=5+5)
SlowQueries/slow-query-2-16 181MB ± 0% 181MB ± 0% ~ (p=0.310 n=5+5)
name old allocs/op new allocs/op delta
SlowQueries/slow-query-1-16 147k ± 0% 141k ± 0% -3.84% (p=0.008 n=5+5)
SlowQueries/slow-query-2-16 1.65M ± 0% 1.65M ± 0% -0.01% (p=0.008 n=5+5)
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @michae2, and @nvanbenschoten)
pkg/util/fast_int_set.go line 103 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] This condition can be inlined now.
Done.
pkg/util/fast_int_set.go line 170 at r2 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
[nit] feels weird to use
lenas a variable name; could you uselengthinstead?
Heh, oops. Changed to l.
|
bors r+ |
|
Build failed: |
|
bors r+ |
|
Build succeeded: |
This PR contains several commits which speed up
util.FastIntSetand reduce theallocations it makes in some cases. See the individual commit messages for
details.
The speedup and reduction in allocations can be observed in the optimizer's
BenchmarkPhases: