sql/opt/invertedexpr: avoid per-span allocations in GeoUnionKeySpansToSpanExpr by nvb · Pull Request #53215 · cockroachdb/cockroach

nvb · 2020-08-21T17:30:56Z

This commit optimizes a few heap allocations in GeoUnionKeySpansToSpanExpr and
in GeoRPKeyExprToSpanExpr to avoid per-span/per-expression heap allocations.

Before this change, these heap allocations accounted for 27.52% of all heap
allocations (by object, alloc_objects) in a geospatial query of interest:

Type: alloc_objects
Time: Aug 21, 2020 at 4:59pm (UTC)
Active filters:
   focus=geoKeyToEncInvertedVal
Showing nodes accounting for 61277094, 27.52% of 222698492 total
----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------
                                          15270121 99.79% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46
                                             32768  0.21% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:45
  15302889  6.87%  6.87%   15302889  6.87%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:31
----------------------------------------------------------+-------------
                                          16187639 53.52% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46
                                          14057686 46.48% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:45
         0     0%  6.87%   30245325 13.58%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:32
                                          30245325   100% |   github.com/cockroachdb/cockroach/pkg/sql/sqlbase.EncodeTableKey /go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/column_type_encoding.go:75
----------------------------------------------------------+-------------
                                          15728880   100% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46
         0     0%  6.87%   15728880  7.06%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:38
                                          15728880   100% |   github.com/cockroachdb/cockroach/pkg/roachpb.Key.PrefixEnd /go/src/github.com/cockroachdb/cockroach/pkg/roachpb/data.go:188
----------------------------------------------------------+-------------

These allocations were largely avoidable. This commit removes some of
them entirely and reworks the code structure in order to amortize the
unavoidable allocations over the set of keys being encoded. The result
is a large reduction in allocations. On the same workload, allocations
under geoKeyToEncInvertedVal dropped from 27.52% of the workload
to 2.23% of the workload:

Type: alloc_objects
Time: Aug 21, 2020 at 5:04pm (UTC)
Active filters:
   focus=geoKeyToEncInvertedVal
Showing nodes accounting for 3632555, 2.23% of 162651628 total
----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------
                                           2014518 55.46% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:53
                                           1618037 44.54% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:54
         0     0%     0%    3632555  2.23%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:44
                                           3632555   100% |   github.com/cockroachdb/cockroach/pkg/util/encoding.EncodeUvarintAscending /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/encoding.go:375
----------------------------------------------------------+-------------

When applied on top of #53181, this results in a 1.71% speedup on
the following geospatial query:

SELECT Count(census.blkid), Count(subways.name)
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subways
ON ST_Intersects(subways.geom, census.geom);

name                                old ms    new ms    delta
Test/postgis_geometry_tutorial/nyc  136 ±12%  134 ±11%  -1.71%  (p=0.000 n=976+977)

In making this change, we also fix a bug where geoindex.Key (a uint64) was
being cast to a tree.DInt (an int64) during encoding. This could result in
overflow and could cause inverted key spans to be generated.

cockroach-teamcity · 2020-08-21T17:31:04Z

This change is

sumeerbhola

Reviewed 2 of 2 files at r1, 6 of 7 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)

pkg/sql/opt/invertedexpr/geo_expression.go, line 65 at r1 (raw file):

	}
	spans := make([]InvertedSpan, len(ukSpans))
	var b []byte // avoid per-span heap allocations

this is an interesting pattern to amortize allocation.

pkg/sql/opt/invertedexpr/geo_expression.go, line 95 at r1 (raw file):

		case geoindex.Key:
			var span InvertedSpan
			span, b = geoToSpan(geoindex.KeySpan{Start: e, End: e}, b)

I can't see a good reason why I wrote 2 loops instead of folding the earlier loop into this case. We don't need to generate the same span twice.
(this won't change your specific benchmark since it is not using an RPKeyExpr, but will help elsewhere)

pkg/sql/opt/xform/testdata/external/postgis-tutorial-idx, line 42 at r2 (raw file):

      │              ├── inverted expression: /6
      │              │    ├── tight: false
      │              │    └── union spans: ["\x87\xff", "\x87\xff"]

Looks like we had some cellids that looked like negative values close to zero.

pkg/sql/opt/xform/testdata/rules/join, line 3394 at r2 (raw file):

      │    │    │         ├── inverted expression: /18
      │    │    │         │    ├── tight: false
      │    │    │         │    └── union spans: ["\x89", "\xfd \x00\x00\x00\x00\x00\x00\x00")

didn't notice these spaces in these strings before -- must be because we are using strconv.Quote(string(span.Start)) in expression.go formatSpan. Are roachpb.Spans typically formatted purely as hex? That would be easier to read.

pkg/sql/opt/xform/testdata/rules/limit, line 1178 at r2 (raw file):

 │         │         ├── ["\xfd\x10\x00\x00\x00\x00\x00\x00\x00", "\xfd\x10\x00\x00\x00\x00\x00\x00\x00"]
 │         │         ├── ["\xfd\x14\x00\x00\x00\x00\x00\x00\x00", "\xfd\x14\x00\x00\x00\x00\x00\x00\x00"]
 │         │         └── ["\xfd\x14\x00\x00\x00\x00\x00\x00\x01", "\xfd\x16\x00\x00\x00\x00\x00\x00\x00")

I don't understand this change
With a "\xfd" prefix I think this is the IntMax path of EncodeUvarintAscending. So these must have been positive integers even after casting to int64. So why would anything change. Is this the effect of not using PrefixEnd -- I don't see how?

nvb

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)

pkg/sql/opt/invertedexpr/geo_expression.go, line 95 at r1 (raw file):

Previously, sumeerbhola wrote…

I can't see a good reason why I wrote 2 loops instead of folding the earlier loop into this case. We don't need to generate the same span twice.
(this won't change your specific benchmark since it is not using an RPKeyExpr, but will help elsewhere)

Mind if I leave that as a follow-up PR for you?

pkg/sql/opt/xform/testdata/external/postgis-tutorial-idx, line 42 at r2 (raw file):

Previously, sumeerbhola wrote…

Looks like we had some cellids that looked like negative values close to zero.

That's interesting. \x87 means that the values was in the range [-0xff, 0) when encoded. So this must have been overflowing an int64 when cast from a uint64.

pkg/sql/opt/xform/testdata/rules/join, line 3394 at r2 (raw file):

Previously, sumeerbhola wrote…

didn't notice these spaces in these strings before -- must be because we are using strconv.Quote(string(span.Start)) in expression.go formatSpan. Are roachpb.Spans typically formatted purely as hex? That would be easier to read.

Yes, typically spans are formatted using their String method, which attempts to pretty-print where possible, but falls back to hex when not possible. We might want to look into that here.

pkg/sql/opt/xform/testdata/rules/limit, line 1178 at r2 (raw file):

Is this the effect of not using PrefixEnd

Yes, I think it is. \xfd\x16 is not a valid varint encoding. However, it is the PrefixNext() of EncodeUvarintAscending(1657324662872342527) = \xfd\x15\xff\xff\xff\xff\xff\xff\xff. The value we see here is the result of EncodeUvarintAscending(1657324662872342528). So the difference here is due to the switch away from PrefixEnd. Since there are no valid keys between these two possible encodings, the difference is immaterial and either is a valid end key.

This commit updates batchedInvertedExprEvaluator to recycle the `[]invertedSpan` that it constructs in its init method, such that the same slice can be reused for each batch of rows in the `invertedJoiner`. Before this change, this heap allocations accounted for 5.51% of all heap allocations (by size, `alloc_space`) in a geospatial query of interest: ``` ----------------------------------------------------------+------------- 804MB 100% | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*invertedJoiner).readInput /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_joiner.go:418 804MB 5.51% 36.69% 804MB 5.51% | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*batchedInvertedExprEvaluator).init /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_expr_evaluator.go:407 ----------------------------------------------------------+------------- ``` When applied on top of cockroachdb#53181 and cockroachdb#53215, this results in a 0.4% speedup on the following geospatial query: ```sql SELECT Count(census.blkid), Count(subways.name) FROM nyc_census_blocks AS census JOIN nyc_subway_stations AS subways ON ST_Intersects(subways.geom, census.geom); ``` ``` name old ms new ms delta Test/postgis_geometry_tutorial/nyc 134 ±12% 133 ±12% -0.40% (p=0.006 n=976+979) ```

53237: sql/rowexec: recycle coveringSpans in batchedInvertedExprEvaluator r=nvanbenschoten a=nvanbenschoten This commit updates batchedInvertedExprEvaluator to recycle the []invertedSpan that it constructs in its init method, such that the same slice can be reused for each batch of rows in the `invertedJoiner`. Before this change, this heap allocations accounted for 5.51% of all heap allocations (by size, `alloc_space`) in a geospatial query of interest: ``` ----------------------------------------------------------+------------- 804MB 100% | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*invertedJoiner).readInput /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_joiner.go:418 804MB 5.51% 36.69% 804MB 5.51% | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*batchedInvertedExprEvaluator).init /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_expr_evaluator.go:407 ----------------------------------------------------------+------------- ``` When applied on top of #53181 and #53215, this results in a **0.4%** speedup on the following geospatial query: ```sql SELECT Count(census.blkid), Count(subways.name) FROM nyc_census_blocks AS census JOIN nyc_subway_stations AS subways ON ST_Intersects(subways.geom, census.geom); ``` ``` name old ms new ms delta Test/postgis_geometry_tutorial/nyc 134 ±12% 133 ±12% -0.40% (p=0.006 n=976+979) ``` Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>

sumeerbhola

Thanks for doing this!

Reviewed 1 of 7 files at r2, 5 of 5 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)

pkg/sql/opt/invertedexpr/geo_expression.go, line 95 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Mind if I leave that as a follow-up PR for you?

Yep, I'll do it.

pkg/sql/opt/invertedexpr/geo_expression.go, line 44 at r3 (raw file):

	}
	prev := len(b)
	b = encoding.EncodeUvarintAscending(b, uint64(k))

can you add a comment like
// Set the capacity so that the caller appending to enc does not corrupt later keys.

pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):

}

func encodeGeoKeys(inKey []byte, geoKeys []geoindex.Key) (keys [][]byte, err error) {

looking at this again, just to completely satisfy my curiosity
Whenever b is not large enough it will need to be reallocated, so say it grows as m, 2m, 4m, 8m, 16m, where m is the length added in each loop iteration. Each time it will incur a copy, so m + 2m + 4m + 8m =15m bytes will be copied by the time it grows to 16m. So aside from the copying that is performed in the append(b, inKey...), with p loop iterations and total final length n this will incur n bytes copied and log p allocations. Correct? The old code also incurred n bytes copied since it did not allocate enough in the outKey := make([]byte, len(inKey)), and incurred p allocations. We could have changed the old code to incur no copies by over-allocating based on the worst case encoded length of a DInt (would break abstraction a bit).

I assume your experience is that the copying cost of the pattern you have changed this to, to not be much of a detractor, yes?

pkg/sql/rowenc/index_encoding.go, line 856 at r3 (raw file):

		prev := len(b)
		b = append(b, inKey...)
		b = encoding.EncodeUvarintAscending(b, uint64(k))

can you add a comment like
// Set the capacity so that the caller appending to a key in keys does not corrupt later keys.

@otan

52972: cli: command to deserialize descriptors r=spaskob a=spaskob We currently export the descriptors and job payloads in debug zips with hex encoding. It is often very valuable to decode these protos into JSON for inspection. Fixes #52063. Release note (cli change): The new debug command `decode-proto` reads descriptor from stdin in hex or base64 format (auto-detected) and a flag --schema=\<fully qualified name to decode\> with default value `cockroach.sql.sqlbase.Descriptor` and outputs to stdout the deserialized proto in JSON format. 53226: colexec: make hash table more dynamic and fix hash join in some cases r=yuzefovich a=yuzefovich **colexec: make hash table more dynamic** This commit replaces all fixed allocations of constant size in the hash table in favor of allocating the slices dynamically which makes the hash table more dynamic as well. This allows us to place the logic of clearing up those slices for the next iteration in a single place which simplifies the code a bit. Additionally it moves `buckets` slice that is only used by the hash joiner out of the hash table. Release note: None **colexec: fix LEFT ANTI hash join when right eq cols are key** Release note (bug fix): Previously, CockroachDB could return incorrect results when performing LEFT ANTI hash join when right equality columns form a key when it was executed via the vectorized engine, and now this has been fixed. **colexec: fix set-op hash joins and optimize them a bit** This commit fixes several problems with set-operation hash joins: 1. similar to how LEFT ANTI hash joins are handled, INTERSECT ALL and EXCEPT ALL hash joins rely on `headID` to be populated. That step is skipped if the right side is distinct (meaning right equality columns form a key). This would result in incorrect output, and we now override `rightDistinct` flag to get the desired behavior (probably sub-optimal but correct). 2. actually, before we would get an incorrect result, we would hit an out of bounds error because `hashTable.visited` would not have been created previously. That slice is used by set-op joins to track which tuples from the right side have already been "deleted" (i.e. they were used for a match). This is now also fixed. However, currently the information whether the right equality columns form a key is not propagated for set-operation joins, so the bug would not have occurred in the non-test environment. Additionally, this commit optimizes the handling of LEFT ANTI and EXCEPT ALL joins by not allocating `same` slice because those two join types have separate `collectAnti*` methods that don't use that slice. Release note: None 53248: release: fix adding libgeos files to .tar.gz and compress .zip r=jlinder a=otan * Fix issue where only the first file was added to a .tar.gz. * Compress files better when adding to zips. Release note: None 53252: sql/kv: optimize heap allocations for inverted joins and scans with many spans r=nvanbenschoten a=nvanbenschoten This PR contains a collection of optimizations that build on #53181, #53215, and #53237 to reduce the number and size of heap allocations performed by inverted joins. Combined, these changes results in a **1.46%** speedup on the following geospatial query: ```sql SELECT Count(census.blkid), Count(subways.name) FROM nyc_census_blocks AS census JOIN nyc_subway_stations AS subways ON ST_Intersects(subways.geom, census.geom); ``` ``` name old ms new ms delta Test/postgis_geometry_tutorial/nyc 144 ±15% 142 ±16% -1.46% (p=0.000 n=972+974) ``` ---- ### kv: reserve lock spans in SpanSet ahead of time In #52610, we merged the latch and lock span declared by requests. Doing so is the long term goal in this area, but it doesn't seem like that's going to be possible for v20.2. Meanwhile, the collection of lock spans is currently less efficient than the collection of latch spans. This is because we have a heuristic to eagerly allocate lock span slices ahead of time, instead of letting them grow as they are appended to. This optimization does not exist for lock spans. We can see this by looking at an alloc_space heap profile while running a geospatial query of interest. We see that the lock spans account for **9.76%** of all allocated space while the latch spans account for only **2.65%** of all allocated space. ``` ----------------------------------------------------------+------------- 646.37MB 100% | github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval.DefaultDeclareIsolatedKeys /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval/declare.go:90 0 0% 100% 646.37MB 9.76% | github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset.(*SpanSet).AddNonMVCC /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset/spanset.go:127 646.37MB 100% | github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset.(*SpanSet).AddMVCC /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset/spanset.go:139 ----------------------------------------------------------+------------- 175.37MB 100% | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).collectSpans /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/replica_send.go:691 175.37MB 2.65% 61.51% 175.37MB 2.65% | github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset.(*SpanSet).Reserve /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset/spanset.go:120 ----------------------------------------------------------+------------- ``` To improve this, we can do the same ahead of time slice resizing for lock spans by reserving an estimate for the number and type of lock spans we expect a request to declare. This guess may be a little less accurate than our guess for latch spans, because there are a few request types that declare latch spans but not lock spans, but these are rare requests that are much less important to optimize than the requests that do declare a latch and a lock span. With this change, we shave off about **6.57%** of total allocated space in this workload. We no longer see allocations in `SpanSet.AddNonMVCC` and the amount of heap memory allocated by `SpanSet.Reserve` roughly doubles, as we would expect. ``` ----------------------------------------------------------+------------- 190.99MB 53.34% | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).collectSpans /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/replica_send.go:692 167.09MB 46.66% | github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).collectSpans /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/replica_send.go:693 358.08MB 5.84% 36.53% 358.08MB 5.84% | github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset.(*SpanSet).Reserve /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset/spanset.go:120 ----------------------------------------------------------+------------- ``` ### sql/row: batch allocate RequestUnion wrapper structs in txnKVFetcher.fetch A Request's approach towards representing a union type results in double indirection. For instance, a ScanRequest is represented like: ``` Request{Value: &RequestUnion_Scan{Scan: &ScanRequest{}}} ``` The code in txnKVFetcher.fetch was batch allocating the inner struct in this structure across all scans in a batch, but was not batch allocating the outer "union" struct. The effect of this is that the outer struct's heap allocation showed up as 4.78% of the alloc_objects heap profile of an interesting geospatial query: ``` ----------------------------------------------------------+------------- 2621460 100% | github.com/cockroachdb/cockroach/pkg/sql/row.(*txnKVFetcher).fetch /go/src/github.com/cockroachdb/cockroach/pkg/sql/row/kv_batch_fetcher.go:297 0 0% 100% 2621460 4.78% | github.com/cockroachdb/cockroach/pkg/roachpb.(*RequestUnion).MustSetInner /go/src/github.com/cockroachdb/cockroach/pkg/roachpb/api.go:604 2621460 100% | github.com/cockroachdb/cockroach/pkg/roachpb.(*RequestUnion).SetInner /go/src/github.com/cockroachdb/cockroach/pkg/roachpb/batch_generated.go:354 ----------------------------------------------------------+------------- ``` This commit avoids this source of per-request allocations by batch allocating these union structs across all requests in a batch just like the code was previously batch allocating the inner structs. ### sql/rowexec: avoid allocating slice header on sort in fragmentPendingSpans batchedInvertedExprEvaluator's fragmentPendingSpans performs a `sort.Slice` to sort a list of pending spans by their end key. `sort.Slice` is convenient, but it has some trade-offs. For one, it uses reflection to mimic parameterization over all slice types. Second, it accepts the slice as an interface, which causes the slice header to escape to the heap. The effect of this is that the call was showing up as 4.35% of the alloc_objects heap profile of an interesting geospatial query: ``` ----------------------------------------------------------+------------- 2387773 100% | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*batchedInvertedExprEvaluator).init /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_expr_evaluator.go:406 2162754 3.94% 40.58% 2387773 4.35% | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*batchedInvertedExprEvaluator).fragmentPendingSpans /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_expr_evaluator.go:284 225019 9.42% | sort.Slice /usr/local/go/src/sort/slice.go:15 ----------------------------------------------------------+------------- ``` This commit avoids this heap allocation by assigning the slice to a field on the batchedInvertedExprEvaluator (which was already on the heap) and using the old `sort.Sort` function. ---- @otan an area I didn't touch but wanted to talk to you about was the memory representation of `DGeography` and `DGeometry`. Specifically, I'm wondering why we're embedding pointers instead of values inside of these datums, effectively creating a double-boxing situation. I see that most of the geo operators work off pointer and that the "canonical" representation of the objects exported by `pkg/geo` is a pointer (e.g. the package exports "New..." methods instead of "Make..." methods). I don't understand why this is though. Both `Geography` and `Geometry` are only 40 bytes large, which is well within reason to pass on the stack. I ask because this makes working with these datums more expensive than strictly necessary. We make an effort to batch allocate datum objects themselves (see `DatumAlloc`), but don't do anything special with internal heap allocations. I suspect that there would be a moderate win here if we eliminated the double-boxing on any query that scans a large number of rows containing one of these data types, as I'm seeing the heap allocations in DecodeUntaggedDatum fairly prominently in profiles. Co-authored-by: Spas Bojanov <spas@cockroachlabs.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Oliver Tan <otan@cockroachlabs.com> Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>

nvb

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)

pkg/sql/opt/invertedexpr/geo_expression.go, line 44 at r3 (raw file):

Previously, sumeerbhola wrote…

can you add a comment like
// Set the capacity so that the caller appending to enc does not corrupt later keys.

Done.

pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):
That's a very interesting point. My experience is that the copying cost is far outweighed by the allocator cost in these kinds of situations.

You could also make an argument that this approach is over-allocating by about 100%, in addition to the copying cost. But I think the previous approach would also average out to over-allocating by 100%. So I think that's actually a wash.

We could have changed the old code to incur no copies by over-allocating based on the worst case encoded length of a DInt (would break abstraction a bit).

The worse-case encoded length of a DInt is 9 bytes. Do you have an idea of the distribution of values for geoindex.Key? If the distribution is uniform then we'd expect to use 9 bytes 50% of the time, 8 bytes 25% of the time, etc. That would mean that over-allocating upfront would almost certainly be worth it.

The same argument could be made for geoKeyToEncInvertedVal and its callers.

pkg/sql/rowenc/index_encoding.go, line 856 at r3 (raw file):

Previously, sumeerbhola wrote…

can you add a comment like
// Set the capacity so that the caller appending to a key in keys does not corrupt later keys.

Done.

…oSpanExpr This commit optimizes a few heap allocations in `GeoUnionKeySpansToSpanExpr` and in `GeoRPKeyExprToSpanExpr` to avoid per-span/per-expression heap allocations. Before this change, these heap allocations accounted for 27.52% of all heap allocations (by object, `alloc_objects`) in a geospatial query of interest: ``` Type: alloc_objects Time: Aug 21, 2020 at 4:59pm (UTC) Active filters: focus=geoKeyToEncInvertedVal Showing nodes accounting for 61277094, 27.52% of 222698492 total ----------------------------------------------------------+------------- flat flat% sum% cum cum% calls calls% + context ----------------------------------------------------------+------------- 15270121 99.79% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46 32768 0.21% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:45 15302889 6.87% 6.87% 15302889 6.87% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:31 ----------------------------------------------------------+------------- 16187639 53.52% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46 14057686 46.48% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:45 0 0% 6.87% 30245325 13.58% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:32 30245325 100% | github.com/cockroachdb/cockroach/pkg/sql/sqlbase.EncodeTableKey /go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/column_type_encoding.go:75 ----------------------------------------------------------+------------- 15728880 100% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46 0 0% 6.87% 15728880 7.06% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:38 15728880 100% | github.com/cockroachdb/cockroach/pkg/roachpb.Key.PrefixEnd /go/src/github.com/cockroachdb/cockroach/pkg/roachpb/data.go:188 ----------------------------------------------------------+------------- ``` These allocations were largely avoidable. This commit removes some of them entirely and reworks the code structure in order to amortize the unavoidable allocations over the set of keys being encoded. The result is a large reduction in allocations. On the same workload, allocations under `geoKeyToEncInvertedVal` dropped from **27.52%** of the workload to **2.23%** of the workload: ``` Type: alloc_objects Time: Aug 21, 2020 at 5:04pm (UTC) Active filters: focus=geoKeyToEncInvertedVal Showing nodes accounting for 3632555, 2.23% of 162651628 total ----------------------------------------------------------+------------- flat flat% sum% cum cum% calls calls% + context ----------------------------------------------------------+------------- 2014518 55.46% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:53 1618037 44.54% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:54 0 0% 0% 3632555 2.23% | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:44 3632555 100% | github.com/cockroachdb/cockroach/pkg/util/encoding.EncodeUvarintAscending /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/encoding.go:375 ----------------------------------------------------------+------------- ``` When applied on top of cockroachdb#53181, this results in a **1.71%** speedup on the following geospatial query: ```sql SELECT Count(census.blkid), Count(subways.name) FROM nyc_census_blocks AS census JOIN nyc_subway_stations AS subways ON ST_Intersects(subways.geom, census.geom); ``` ``` name old ms new ms delta Test/postgis_geometry_tutorial/nyc 136 ±12% 134 ±11% -1.71% (p=0.000 n=976+977) ``` In making this change, we also fix a bug where `geoindex.Key` (a uint64) was being cast to a `tree.DInt` (an int64) during encoding. This could result in overflow and could cause inverted key spans to be generated.

sumeerbhola

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)

pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):

If the distribution is uniform then we'd expect to use 9 bytes 50% of the time, 8 bytes 25% of the time, etc. That would mean that over-allocating upfront would almost certainly be worth it.

The distribution should be uniform. Geometry (not geography) will have 00 as the leading 2 bits, but that is still close enough.

…panExpr The distribution of `geoindex.Key` values is expected to be uniform, which means that the number of bytes used for the varint encoding is expected to be at or near the maximum length with high probability. This commit capitalizes on this to create reasonable estimates for the total buffer size needed in encodeGeoKeys and GeoUnionKeySpansToSpanExpr to avoid needing to resize the slices in these functions.

nvb

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola)

pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):

Previously, sumeerbhola wrote…

If the distribution is uniform then we'd expect to use 9 bytes 50% of the time, 8 bytes 25% of the time, etc. That would mean that over-allocating upfront would almost certainly be worth it.

The distribution should be uniform. Geometry (not geography) will have 00 as the leading 2 bits, but that is still close enough.

Done (in a second commit). This should avoid the need to ever resize slices encodeGeoKeys or GeoUnionKeySpansToSpanExpr.

I don't have a benchmark environment set up right now. Are you ok merging this as is or would you like me to get numbers on this change first?

sumeerbhola

Reviewed 6 of 6 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)

pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Done (in a second commit). This should avoid the need to ever resize slices encodeGeoKeys or GeoUnionKeySpansToSpanExpr.

I don't have a benchmark environment set up right now. Are you ok merging this as is or would you like me to get numbers on this change first?

Merging is fine.

nvb · 2020-08-24T22:08:40Z

TFTR!

bors r+

craig · 2020-08-24T23:33:18Z

Build failed (retrying...):

GitHub CI (Cockroach)

craig · 2020-08-25T01:34:04Z

Build succeeded:

GitHub CI (Cockroach)

nvb requested a review from sumeerbhola August 21, 2020 17:30

nvb requested a review from a team as a code owner August 21, 2020 17:30

nvb force-pushed the nvanbenschoten/geoKeyToEncInvertedValOpt branch from deff603 to e4fbed2 Compare August 21, 2020 18:50

sumeerbhola requested changes Aug 21, 2020

View reviewed changes

nvb commented Aug 21, 2020

View reviewed changes

nvb mentioned this pull request Aug 21, 2020

sql/rowexec: recycle coveringSpans in batchedInvertedExprEvaluator #53237

Merged

nvb mentioned this pull request Aug 22, 2020

sql/kv: optimize heap allocations for inverted joins and scans with many spans #53252

Merged

nvb force-pushed the nvanbenschoten/geoKeyToEncInvertedValOpt branch 3 times, most recently from d976ffa to f53a295 Compare August 23, 2020 19:30

sumeerbhola approved these changes Aug 24, 2020

View reviewed changes

nvb commented Aug 24, 2020

View reviewed changes

sumeerbhola approved these changes Aug 24, 2020

View reviewed changes

nvb force-pushed the nvanbenschoten/geoKeyToEncInvertedValOpt branch from f53a295 to 6ae5c97 Compare August 24, 2020 19:47

nvb commented Aug 24, 2020

View reviewed changes

sumeerbhola approved these changes Aug 24, 2020

View reviewed changes

craig bot merged commit 72cb52d into cockroachdb:master Aug 25, 2020

nvb deleted the nvanbenschoten/geoKeyToEncInvertedValOpt branch August 29, 2020 21:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql/opt/invertedexpr: avoid per-span allocations in GeoUnionKeySpansToSpanExpr#53215

sql/opt/invertedexpr: avoid per-span allocations in GeoUnionKeySpansToSpanExpr#53215
craig[bot] merged 2 commits intocockroachdb:masterfrom
nvb:nvanbenschoten/geoKeyToEncInvertedValOpt

nvb commented Aug 21, 2020

Uh oh!

cockroach-teamcity commented Aug 21, 2020

Uh oh!

sumeerbhola left a comment

Uh oh!

nvb left a comment

Uh oh!

sumeerbhola left a comment

Uh oh!

nvb left a comment

Uh oh!

sumeerbhola left a comment

Uh oh!

nvb left a comment

Uh oh!

sumeerbhola left a comment

Uh oh!

nvb commented Aug 24, 2020

Uh oh!

craig bot commented Aug 24, 2020

Uh oh!

craig bot commented Aug 25, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

nvb commented Aug 21, 2020

Uh oh!

cockroach-teamcity commented Aug 21, 2020

Uh oh!

sumeerbhola left a comment

Choose a reason for hiding this comment

Uh oh!

nvb left a comment

Choose a reason for hiding this comment

Uh oh!

sumeerbhola left a comment

Choose a reason for hiding this comment

Uh oh!

nvb left a comment

Choose a reason for hiding this comment

Uh oh!

sumeerbhola left a comment

Choose a reason for hiding this comment

Uh oh!

nvb left a comment

Choose a reason for hiding this comment

Uh oh!

sumeerbhola left a comment

Choose a reason for hiding this comment

Uh oh!

nvb commented Aug 24, 2020

Uh oh!

craig bot commented Aug 24, 2020

Uh oh!

craig bot commented Aug 25, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants