
[experiment] sql: increase size of FastIntSet's stack-allocated segment #72733

@nvb

Description


Currently, FastIntSet can store integer values from 0 to 63 in its "small" stack-allocated segment without allocating. Once a value exceeds this limit, the set switches to its "large" heap-allocated segment. The structure is used throughout the SQL optimizer to store column IDs.
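To make the small/large split concrete, here is a minimal sketch of a set with a 64-bit inline bitmap and a heap-allocated fallback. The type `hybridSet`, its map-based large segment, and the `UsesHeap` helper are assumptions made for illustration; they are not the actual FastIntSet implementation.

```go
package main

import "fmt"

// smallCutoff is the largest value that fits in the inline 64-bit bitmap.
const smallCutoff = 63

// hybridSet is a hypothetical, simplified stand-in for FastIntSet.
type hybridSet struct {
	small uint64           // bit i set => value i is in the set
	large map[int]struct{} // nil until a value outside 0..smallCutoff is added
}

// Add inserts v. The first value that does not fit in the bitmap
// forces a switch to the heap-allocated map.
func (s *hybridSet) Add(v int) {
	if s.large == nil && v >= 0 && v <= smallCutoff {
		s.small |= 1 << uint(v)
		return
	}
	if s.large == nil {
		s.toLarge()
	}
	s.large[v] = struct{}{}
}

// toLarge allocates the map and moves every bitmap entry into it.
func (s *hybridSet) toLarge() {
	s.large = make(map[int]struct{})
	for i := 0; i <= smallCutoff; i++ {
		if s.small&(1<<uint(i)) != 0 {
			s.large[i] = struct{}{}
		}
	}
}

// Contains reports whether v is in the set.
func (s *hybridSet) Contains(v int) bool {
	if s.large != nil {
		_, ok := s.large[v]
		return ok
	}
	return v >= 0 && v <= smallCutoff && s.small&(1<<uint(v)) != 0
}

// UsesHeap reports whether the set has left the inline regime.
func (s *hybridSet) UsesHeap() bool { return s.large != nil }

func main() {
	var s hybridSet
	s.Add(5)
	s.Add(63)
	fmt.Println("heap after 63:", s.UsesHeap())
	s.Add(64)
	fmt.Println("heap after 64:", s.UsesHeap())
}
```

In this sketch a single column ID above 63 is enough to move the whole set onto the heap, which is the regime described above.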

In TPC-E, a large share of queries use more than 63 columns, so they hit this allocating regime. Worse, once a query is in this regime, many intermediate FastIntSet operations begin allocating as well, because the API is designed with value semantics and treats its inputs as immutable. This is very pronounced in CPU and heap profiles. For instance, in an alloc_objects sample from a heap profile, over 8.5% of all heap allocations in CRDB are due to FastIntSet manipulation, making (*FastIntSet).toLarge and (FastIntSet).Copy the two largest sources of heap allocations in the workload:

----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------
                                         524033092 73.04% |   github.com/cockroachdb/cockroach/pkg/util.(*FastIntSet).Add /go/src/github.com/cockroachdb/cockroach/pkg/util/fast_int_set.go:75
                                         136724537 19.06% |   github.com/cockroachdb/cockroach/pkg/util.(*FastIntSet).UnionWith /go/src/github.com/cockroachdb/cockroach/pkg/util/fast_int_set.go:221
                                          56659299  7.90% |   github.com/cockroachdb/cockroach/pkg/util.(*FastIntSet).DifferenceWith /go/src/github.com/cockroachdb/cockroach/pkg/util/fast_int_set.go:278
 717416928  4.40%  4.40%  717416928  4.40%                | github.com/cockroachdb/cockroach/pkg/util.(*FastIntSet).toLarge /go/src/github.com/cockroachdb/cockroach/pkg/util/fast_int_set.go:50
----------------------------------------------------------+-------------
                                         414941755 61.21% |   github.com/cockroachdb/cockroach/pkg/sql/opt.ColSet.Copy /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/colset.go:58
                                          98228029 14.49% |   github.com/cockroachdb/cockroach/pkg/util.FastIntSet.Union /go/src/github.com/cockroachdb/cockroach/pkg/util/fast_int_set.go:234
                                          95368850 14.07% |   github.com/cockroachdb/cockroach/pkg/util.FastIntSet.Intersection /go/src/github.com/cockroachdb/cockroach/pkg/util/fast_int_set.go:255
                                          69324899 10.23% |   github.com/cockroachdb/cockroach/pkg/util.FastIntSet.Difference /go/src/github.com/cockroachdb/cockroach/pkg/util/fast_int_set.go:283
 677863533  4.16%  8.56%  677863533  4.16%                | github.com/cockroachdb/cockroach/pkg/util.FastIntSet.Copy /go/src/github.com/cockroachdb/cockroach/pkg/util/fast_int_set.go:190
----------------------------------------------------------+-------------
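The call edges in the profile (Union, Intersection, and Difference calling Copy) follow from value semantics: a non-mutating operation has to clone its receiver before modifying the clone. The sketch below illustrates this with a hypothetical `wordSet` type and a `heapCopies` counter added purely as instrumentation; neither exists in the real code.

```go
package main

import "fmt"

// heapCopies counts how often Copy cloned heap storage (illustrative only).
var heapCopies int

// wordSet is a hypothetical, simplified stand-in for FastIntSet.
// It assumes all values are non-negative.
type wordSet struct {
	small uint64   // values 0..63 while large is nil
	large []uint64 // once non-nil, word i holds values 64*i .. 64*i+63
}

// Add inserts v, growing into heap-allocated words when v > 63.
func (s *wordSet) Add(v int) {
	if s.large == nil && v < 64 {
		s.small |= 1 << uint(v)
		return
	}
	if s.large == nil {
		s.large = []uint64{s.small}
	}
	for len(s.large) <= v/64 {
		s.large = append(s.large, 0)
	}
	s.large[v/64] |= 1 << uint(v%64)
}

// Copy returns an independent set; it allocates only in the large regime.
func (s wordSet) Copy() wordSet {
	if s.large != nil {
		heapCopies++
		s.large = append([]uint64(nil), s.large...)
	}
	return s
}

// UnionWith mutates the receiver in place.
func (s *wordSet) UnionWith(o wordSet) {
	if s.large == nil && o.large == nil {
		s.small |= o.small
		return
	}
	if s.large == nil {
		s.large = []uint64{s.small}
	}
	words := o.large
	if words == nil {
		words = []uint64{o.small}
	}
	for i, w := range words {
		for len(s.large) <= i {
			s.large = append(s.large, 0)
		}
		s.large[i] |= w
	}
}

// Union leaves both inputs untouched, so it must copy first.
func (s wordSet) Union(o wordSet) wordSet {
	r := s.Copy()
	r.UnionWith(o)
	return r
}

// Contains reports whether v is in the set.
func (s wordSet) Contains(v int) bool {
	if s.large != nil {
		return v/64 < len(s.large) && s.large[v/64]&(1<<uint(v%64)) != 0
	}
	return v < 64 && s.small&(1<<uint(v)) != 0
}

func main() {
	var a, b wordSet
	a.Add(1)
	a.Add(70) // pushes a into the large regime
	b.Add(2)
	u := a.Union(b)
	fmt.Println(u.Contains(70), a.Contains(2), heapCopies)
}
```

With two small sets, Union in this sketch is a plain struct copy and never touches the heap; once either receiver is large, every non-mutating call pays for a clone.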


Experiment

We should experiment with increasing the size of the stack-allocated segment of FastIntSet. Replacing the uint64 with a uint128.Uint128 would allow the set to store column IDs up to 127 without allocating. We could then run TPC-E to see how many queries still exceed this limit. I'd guess very few, so we could reasonably expect this change to reduce total heap allocations in the workload by roughly 8%.
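A rough sketch of what a 128-bit inline segment could look like follows. It uses two plain uint64 words rather than the uint128.Uint128 type mentioned above, and the type `small128` and its methods are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// small128 is a hypothetical 128-bit inline bitmap: lo holds values
// 0..63 and hi holds values 64..127. It never allocates.
type small128 struct {
	lo, hi uint64
}

// Add inserts v and reports whether it fit. A false result is the
// point at which a real set would fall back to its heap segment.
func (s *small128) Add(v int) bool {
	switch {
	case v >= 0 && v < 64:
		s.lo |= 1 << uint(v)
	case v >= 64 && v < 128:
		s.hi |= 1 << uint(v-64)
	default:
		return false
	}
	return true
}

// Contains reports whether v is in the set.
func (s small128) Contains(v int) bool {
	switch {
	case v >= 0 && v < 64:
		return s.lo&(1<<uint(v)) != 0
	case v >= 64 && v < 128:
		return s.hi&(1<<uint(v-64)) != 0
	}
	return false
}

// Union returns a new value; with value semantics this is a
// 16-byte struct copy plus two ORs, with no heap allocation.
func (s small128) Union(o small128) small128 {
	return small128{lo: s.lo | o.lo, hi: s.hi | o.hi}
}

// Len returns the number of elements in the set.
func (s small128) Len() int {
	return bits.OnesCount64(s.lo) + bits.OnesCount64(s.hi)
}

func main() {
	var a, b small128
	a.Add(3)
	a.Add(100)
	b.Add(127)
	u := a.Union(b)
	fmt.Println(u.Len(), u.Contains(100), b.Add(128))
}
```

The trade-off is that every FastIntSet value grows by eight bytes, so the experiment would also need to check that the larger struct does not offset the savings from fewer allocations.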

Metadata


Assignees

Labels

A-sql-optimizer: SQL logical planning and optimizations.
C-performance: Perf of queries or internals. Solution not expected to change functional behavior.
T-sql-queries: SQL Queries Team

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
