Skip to content

sql/opt/invertedexpr: avoid per-span allocations in GeoUnionKeySpansToSpanExpr#53215

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
nvb:nvanbenschoten/geoKeyToEncInvertedValOpt
Aug 25, 2020
Merged

sql/opt/invertedexpr: avoid per-span allocations in GeoUnionKeySpansToSpanExpr#53215
craig[bot] merged 2 commits intocockroachdb:masterfrom
nvb:nvanbenschoten/geoKeyToEncInvertedValOpt

Conversation

@nvb
Copy link
Copy Markdown
Contributor

@nvb nvb commented Aug 21, 2020

This commit optimizes a few heap allocations in GeoUnionKeySpansToSpanExpr and
in GeoRPKeyExprToSpanExpr to avoid per-span/per-expression heap allocations.

Before this change, these heap allocations accounted for 27.52% of all heap
allocations (by object, alloc_objects) in a geospatial query of interest:

Type: alloc_objects
Time: Aug 21, 2020 at 4:59pm (UTC)
Active filters:
   focus=geoKeyToEncInvertedVal
Showing nodes accounting for 61277094, 27.52% of 222698492 total
----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------
                                          15270121 99.79% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46
                                             32768  0.21% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:45
  15302889  6.87%  6.87%   15302889  6.87%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:31
----------------------------------------------------------+-------------
                                          16187639 53.52% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46
                                          14057686 46.48% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:45
         0     0%  6.87%   30245325 13.58%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:32
                                          30245325   100% |   github.com/cockroachdb/cockroach/pkg/sql/sqlbase.EncodeTableKey /go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/column_type_encoding.go:75
----------------------------------------------------------+-------------
                                          15728880   100% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46
         0     0%  6.87%   15728880  7.06%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:38
                                          15728880   100% |   github.com/cockroachdb/cockroach/pkg/roachpb.Key.PrefixEnd /go/src/github.com/cockroachdb/cockroach/pkg/roachpb/data.go:188
----------------------------------------------------------+-------------

These allocations were largely avoidable. This commit removes some of
them entirely and reworks the code structure in order to amortize the
unavoidable allocations over the set of keys being encoded. The result
is a large reduction in allocations. On the same workload, allocations
under geoKeyToEncInvertedVal dropped from 27.52% of the workload
to 2.23% of the workload:

Type: alloc_objects
Time: Aug 21, 2020 at 5:04pm (UTC)
Active filters:
   focus=geoKeyToEncInvertedVal
Showing nodes accounting for 3632555, 2.23% of 162651628 total
----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------
                                           2014518 55.46% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:53
                                           1618037 44.54% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:54
         0     0%     0%    3632555  2.23%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:44
                                           3632555   100% |   github.com/cockroachdb/cockroach/pkg/util/encoding.EncodeUvarintAscending /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/encoding.go:375
----------------------------------------------------------+-------------

When applied on top of #53181, this results in a 1.71% speedup on
the following geospatial query:

SELECT Count(census.blkid), Count(subways.name)
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subways
ON ST_Intersects(subways.geom, census.geom);
name                                old ms    new ms    delta
Test/postgis_geometry_tutorial/nyc  136 ±12%  134 ±11%  -1.71%  (p=0.000 n=976+977)

In making this change, we also fix a bug where geoindex.Key (a uint64) was
being cast to a tree.DInt (an int64) during encoding. This could result in
overflow and could cause inverted key spans to be generated.

@nvb nvb requested a review from sumeerbhola August 21, 2020 17:30
@nvb nvb requested a review from a team as a code owner August 21, 2020 17:30
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@nvb nvb force-pushed the nvanbenschoten/geoKeyToEncInvertedValOpt branch from deff603 to e4fbed2 Compare August 21, 2020 18:50
Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1, 6 of 7 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)


pkg/sql/opt/invertedexpr/geo_expression.go, line 65 at r1 (raw file):

	}
	spans := make([]InvertedSpan, len(ukSpans))
	var b []byte // avoid per-span heap allocations

this is an interesting pattern to amortize allocation.


pkg/sql/opt/invertedexpr/geo_expression.go, line 95 at r1 (raw file):

		case geoindex.Key:
			var span InvertedSpan
			span, b = geoToSpan(geoindex.KeySpan{Start: e, End: e}, b)

I can't see a good reason why I wrote 2 loops instead of folding the earlier loop into this case. We don't need to generate the same span twice.
(this won't change your specific benchmark since it is not using an RPKeyExpr, but will help elsewhere)


pkg/sql/opt/xform/testdata/external/postgis-tutorial-idx, line 42 at r2 (raw file):

      │              ├── inverted expression: /6
      │              │    ├── tight: false
      │              │    └── union spans: ["\x87\xff", "\x87\xff"]

Looks like we had some cellids that looked like negative values close to zero.


pkg/sql/opt/xform/testdata/rules/join, line 3394 at r2 (raw file):

      │    │    │         ├── inverted expression: /18
      │    │    │         │    ├── tight: false
      │    │    │         │    └── union spans: ["\x89", "\xfd \x00\x00\x00\x00\x00\x00\x00")

didn't notice these spaces in these strings before -- must be because we are using strconv.Quote(string(span.Start)) in expression.go formatSpan. Are roachpb.Spans typically formatted purely as hex? That would be easier to read.


pkg/sql/opt/xform/testdata/rules/limit, line 1178 at r2 (raw file):

 │         │         ├── ["\xfd\x10\x00\x00\x00\x00\x00\x00\x00", "\xfd\x10\x00\x00\x00\x00\x00\x00\x00"]
 │         │         ├── ["\xfd\x14\x00\x00\x00\x00\x00\x00\x00", "\xfd\x14\x00\x00\x00\x00\x00\x00\x00"]
 │         │         └── ["\xfd\x14\x00\x00\x00\x00\x00\x00\x01", "\xfd\x16\x00\x00\x00\x00\x00\x00\x00")

I don't understand this change
With a "\xfd" prefix I think this is the IntMax path of EncodeUvarintAscending. So these must have been positive integers even after casting to int64. So why would anything change. Is this the effect of not using PrefixEnd -- I don't see how?

Copy link
Copy Markdown
Contributor Author

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/sql/opt/invertedexpr/geo_expression.go, line 95 at r1 (raw file):

Previously, sumeerbhola wrote…

I can't see a good reason why I wrote 2 loops instead of folding the earlier loop into this case. We don't need to generate the same span twice.
(this won't change your specific benchmark since it is not using an RPKeyExpr, but will help elsewhere)

Mind if I leave that as a follow-up PR for you?


pkg/sql/opt/xform/testdata/external/postgis-tutorial-idx, line 42 at r2 (raw file):

Previously, sumeerbhola wrote…

Looks like we had some cellids that looked like negative values close to zero.

That's interesting. \x87 means that the values was in the range [-0xff, 0) when encoded. So this must have been overflowing an int64 when cast from a uint64.


pkg/sql/opt/xform/testdata/rules/join, line 3394 at r2 (raw file):

Previously, sumeerbhola wrote…

didn't notice these spaces in these strings before -- must be because we are using strconv.Quote(string(span.Start)) in expression.go formatSpan. Are roachpb.Spans typically formatted purely as hex? That would be easier to read.

Yes, typically spans are formatted using their String method, which attempts to pretty-print where possible, but falls back to hex when not possible. We might want to look into that here.


pkg/sql/opt/xform/testdata/rules/limit, line 1178 at r2 (raw file):

Is this the effect of not using PrefixEnd

Yes, I think it is. \xfd\x16 is not a valid varint encoding. However, it is the PrefixNext() of EncodeUvarintAscending(1657324662872342527) = \xfd\x15\xff\xff\xff\xff\xff\xff\xff. The value we see here is the result of EncodeUvarintAscending(1657324662872342528). So the difference here is due to the switch away from PrefixEnd. Since there are no valid keys between these two possible encodings, the difference is immaterial and either is a valid end key.

nvb added a commit to nvb/cockroach that referenced this pull request Aug 21, 2020
This commit updates batchedInvertedExprEvaluator to recycle the `[]invertedSpan`
that it constructs in its init method, such that the same slice can be reused
for each batch of rows in the `invertedJoiner`.

Before this change, this heap allocations accounted for 5.51% of all heap
allocations (by size, `alloc_space`) in a geospatial query of interest:

```
----------------------------------------------------------+-------------
                                             804MB   100% |   github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*invertedJoiner).readInput /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_joiner.go:418
     804MB  5.51% 36.69%      804MB  5.51%                | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*batchedInvertedExprEvaluator).init /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_expr_evaluator.go:407
----------------------------------------------------------+-------------
```

When applied on top of cockroachdb#53181 and cockroachdb#53215, this results in a 0.4% speedup on
the following geospatial query:

```sql
SELECT Count(census.blkid), Count(subways.name)
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subways
ON ST_Intersects(subways.geom, census.geom);
```

```
name                                old ms    new ms    delta
Test/postgis_geometry_tutorial/nyc  134 ±12%  133 ±12%  -0.40%  (p=0.006 n=976+979)
```
nvb added a commit to nvb/cockroach that referenced this pull request Aug 22, 2020
This commit updates batchedInvertedExprEvaluator to recycle the `[]invertedSpan`
that it constructs in its init method, such that the same slice can be reused
for each batch of rows in the `invertedJoiner`.

Before this change, this heap allocations accounted for 5.51% of all heap
allocations (by size, `alloc_space`) in a geospatial query of interest:

```
----------------------------------------------------------+-------------
                                             804MB   100% |   github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*invertedJoiner).readInput /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_joiner.go:418
     804MB  5.51% 36.69%      804MB  5.51%                | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*batchedInvertedExprEvaluator).init /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_expr_evaluator.go:407
----------------------------------------------------------+-------------
```

When applied on top of cockroachdb#53181 and cockroachdb#53215, this results in a 0.4% speedup on
the following geospatial query:

```sql
SELECT Count(census.blkid), Count(subways.name)
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subways
ON ST_Intersects(subways.geom, census.geom);
```

```
name                                old ms    new ms    delta
Test/postgis_geometry_tutorial/nyc  134 ±12%  133 ±12%  -0.40%  (p=0.006 n=976+979)
```
craig bot pushed a commit that referenced this pull request Aug 22, 2020
53237: sql/rowexec: recycle coveringSpans in batchedInvertedExprEvaluator r=nvanbenschoten a=nvanbenschoten

This commit updates batchedInvertedExprEvaluator to recycle the []invertedSpan
that it constructs in its init method, such that the same slice can be reused
for each batch of rows in the `invertedJoiner`.

Before this change, this heap allocations accounted for 5.51% of all heap
allocations (by size, `alloc_space`) in a geospatial query of interest:

```
----------------------------------------------------------+-------------
                                             804MB   100% |   github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*invertedJoiner).readInput /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_joiner.go:418
     804MB  5.51% 36.69%      804MB  5.51%                | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*batchedInvertedExprEvaluator).init /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_expr_evaluator.go:407
----------------------------------------------------------+-------------
```

When applied on top of #53181 and #53215, this results in a **0.4%** speedup on
the following geospatial query:

```sql
SELECT Count(census.blkid), Count(subways.name)
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subways
ON ST_Intersects(subways.geom, census.geom);
```

```
name                                old ms    new ms    delta
Test/postgis_geometry_tutorial/nyc  134 ±12%  133 ±12%  -0.40%  (p=0.006 n=976+979)
```

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@nvb nvb force-pushed the nvanbenschoten/geoKeyToEncInvertedValOpt branch 3 times, most recently from d976ffa to f53a295 Compare August 23, 2020 19:30
Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this!
:lgtm:

Reviewed 1 of 7 files at r2, 5 of 5 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)


pkg/sql/opt/invertedexpr/geo_expression.go, line 95 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Mind if I leave that as a follow-up PR for you?

Yep, I'll do it.


pkg/sql/opt/invertedexpr/geo_expression.go, line 44 at r3 (raw file):

	}
	prev := len(b)
	b = encoding.EncodeUvarintAscending(b, uint64(k))

can you add a comment like
// Set the capacity so that the caller appending to enc does not corrupt later keys.


pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):

}

func encodeGeoKeys(inKey []byte, geoKeys []geoindex.Key) (keys [][]byte, err error) {

looking at this again, just to completely satisfy my curiosity
Whenever b is not large enough it will need to be reallocated, so say it grows as m, 2m, 4m, 8m, 16m, where m is the length added in each loop iteration. Each time it will incur a copy, so m + 2m + 4m + 8m =15m bytes will be copied by the time it grows to 16m. So aside from the copying that is performed in the append(b, inKey...), with p loop iterations and total final length n this will incur n bytes copied and log p allocations. Correct? The old code also incurred n bytes copied since it did not allocate enough in the outKey := make([]byte, len(inKey)), and incurred p allocations. We could have changed the old code to incur no copies by over-allocating based on the worst case encoded length of a DInt (would break abstraction a bit).

I assume your experience is that the copying cost of the pattern you have changed this to, to not be much of a detractor, yes?


pkg/sql/rowenc/index_encoding.go, line 856 at r3 (raw file):

		prev := len(b)
		b = append(b, inKey...)
		b = encoding.EncodeUvarintAscending(b, uint64(k))

can you add a comment like
// Set the capacity so that the caller appending to a key in keys does not corrupt later keys.

craig bot pushed a commit that referenced this pull request Aug 24, 2020
52972: cli: command to deserialize descriptors r=spaskob a=spaskob

We currently export the descriptors and job payloads in debug zips with
hex encoding.  It is often very valuable to decode these protos into
JSON for inspection.

Fixes #52063.

Release note (cli change):
The new debug command `decode-proto` reads descriptor from stdin in hex
or base64 format (auto-detected) and a flag --schema=\<fully qualified name to decode\>
with default value `cockroach.sql.sqlbase.Descriptor` and outputs to stdout the
deserialized proto in JSON format.

53226: colexec: make hash table more dynamic and fix hash join in some cases r=yuzefovich a=yuzefovich

**colexec: make hash table more dynamic**

This commit replaces all fixed allocations of constant size in the hash
table in favor of allocating the slices dynamically which makes the hash
table more dynamic as well. This allows us to place the logic of
clearing up those slices for the next iteration in a single place which
simplifies the code a bit.

Additionally it moves `buckets` slice that is only used by the hash
joiner out of the hash table.

Release note: None

**colexec: fix LEFT ANTI hash join when right eq cols are key**

Release note (bug fix): Previously, CockroachDB could return incorrect
results when performing LEFT ANTI hash join when right equality columns
form a key when it was executed via the vectorized engine, and now this
has been fixed.

**colexec: fix set-op hash joins and optimize them a bit**

This commit fixes several problems with set-operation hash joins:
1. similar to how LEFT ANTI hash joins are handled, INTERSECT ALL and
EXCEPT ALL hash joins rely on `headID` to be populated. That step is
skipped if the right side is distinct (meaning right equality columns
form a key). This would result in incorrect output, and we now override
`rightDistinct` flag to get the desired behavior (probably sub-optimal
but correct).
2. actually, before we would get an incorrect result, we would hit an
out of bounds error because `hashTable.visited` would not have been
created previously. That slice is used by set-op joins to track which
tuples from the right side have already been "deleted" (i.e. they were
used for a match). This is now also fixed.

However, currently the information whether the right equality columns
form a key is not propagated for set-operation joins, so the bug would
not have occurred in the non-test environment.

Additionally, this commit optimizes the handling of LEFT ANTI and EXCEPT
ALL joins by not allocating `same` slice because those two join types
have separate `collectAnti*` methods that don't use that slice.

Release note: None

53248: release: fix adding libgeos files to .tar.gz and compress .zip r=jlinder a=otan

* Fix issue where only the first file was added to a .tar.gz.
* Compress files better when adding to zips.

Release note: None

53252: sql/kv: optimize heap allocations for inverted joins and scans with many spans r=nvanbenschoten a=nvanbenschoten

This PR contains a collection of optimizations that build on #53181,
#53215, and #53237 to reduce the number and size of heap allocations
performed by inverted joins.

Combined, these changes results in a **1.46%** speedup on the following
geospatial query:
```sql
SELECT Count(census.blkid), Count(subways.name)
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subways
ON ST_Intersects(subways.geom, census.geom);
```
```
name                                old ms    new ms    delta
Test/postgis_geometry_tutorial/nyc  144 ±15%  142 ±16%  -1.46%  (p=0.000 n=972+974)
```

----

### kv: reserve lock spans in SpanSet ahead of time

In #52610, we merged the latch and lock span declared by requests. Doing
so is the long term goal in this area, but it doesn't seem like that's
going to be possible for v20.2. Meanwhile, the collection of lock spans
is currently less efficient than the collection of latch spans. This is
because we have a heuristic to eagerly allocate lock span slices ahead
of time, instead of letting them grow as they are appended to. This
optimization does not exist for lock spans.

We can see this by looking at an alloc_space heap profile while running
a geospatial query of interest. We see that the lock spans account for
**9.76%** of all allocated space while the latch spans account for only
**2.65%** of all allocated space.

```
----------------------------------------------------------+-------------
                                          646.37MB   100% |   github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval.DefaultDeclareIsolatedKeys /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval/declare.go:90
         0     0%   100%   646.37MB  9.76%                | github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset.(*SpanSet).AddNonMVCC /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset/spanset.go:127
                                          646.37MB   100% |   github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset.(*SpanSet).AddMVCC /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset/spanset.go:139
----------------------------------------------------------+-------------
                                          175.37MB   100% |   github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).collectSpans /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/replica_send.go:691
  175.37MB  2.65% 61.51%   175.37MB  2.65%                | github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset.(*SpanSet).Reserve /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset/spanset.go:120
----------------------------------------------------------+-------------
```

To improve this, we can do the same ahead of time slice resizing for
lock spans by reserving an estimate for the number and type of lock
spans we expect a request to declare. This guess may be a little less
accurate than our guess for latch spans, because there are a few request
types that declare latch spans but not lock spans, but these are rare
requests that are much less important to optimize than the requests that
do declare a latch and a lock span.

With this change, we shave off about **6.57%** of total allocated space
in this workload. We no longer see allocations in `SpanSet.AddNonMVCC`
and the amount of heap memory allocated by `SpanSet.Reserve` roughly
doubles, as we would expect.

```
----------------------------------------------------------+-------------
                                          190.99MB 53.34% |   github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).collectSpans /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/replica_send.go:692
                                          167.09MB 46.66% |   github.com/cockroachdb/cockroach/pkg/kv/kvserver.(*Replica).collectSpans /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/replica_send.go:693
  358.08MB  5.84% 36.53%   358.08MB  5.84%                | github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset.(*SpanSet).Reserve /go/src/github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset/spanset.go:120
----------------------------------------------------------+-------------
```

### sql/row: batch allocate RequestUnion wrapper structs in txnKVFetcher.fetch

A Request's approach towards representing a union type results in double
indirection. For instance, a ScanRequest is represented like:
```
Request{Value: &RequestUnion_Scan{Scan: &ScanRequest{}}}
```

The code in txnKVFetcher.fetch was batch allocating the inner struct in
this structure across all scans in a batch, but was not batch allocating
the outer "union" struct. The effect of this is that the outer struct's
heap allocation showed up as 4.78% of the alloc_objects heap profile of
an interesting geospatial query:

```
----------------------------------------------------------+-------------
                                           2621460   100% |   github.com/cockroachdb/cockroach/pkg/sql/row.(*txnKVFetcher).fetch /go/src/github.com/cockroachdb/cockroach/pkg/sql/row/kv_batch_fetcher.go:297
         0     0%   100%    2621460  4.78%                | github.com/cockroachdb/cockroach/pkg/roachpb.(*RequestUnion).MustSetInner /go/src/github.com/cockroachdb/cockroach/pkg/roachpb/api.go:604
                                           2621460   100% |   github.com/cockroachdb/cockroach/pkg/roachpb.(*RequestUnion).SetInner /go/src/github.com/cockroachdb/cockroach/pkg/roachpb/batch_generated.go:354
----------------------------------------------------------+-------------
```

This commit avoids this source of per-request allocations by batch
allocating these union structs across all requests in a batch just
like the code was previously batch allocating the inner structs.

### sql/rowexec: avoid allocating slice header on sort in fragmentPendingSpans

batchedInvertedExprEvaluator's fragmentPendingSpans performs a
`sort.Slice` to sort a list of pending spans by their end key.
`sort.Slice` is convenient, but it has some trade-offs. For one, it uses
reflection to mimic parameterization over all slice types. Second, it
accepts the slice as an interface, which causes the slice header to
escape to the heap.

The effect of this is that the call was showing up as 4.35% of the
alloc_objects heap profile of an interesting geospatial query:

```
----------------------------------------------------------+-------------
                                           2387773   100% |   github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*batchedInvertedExprEvaluator).init /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_expr_evaluator.go:406
   2162754  3.94% 40.58%    2387773  4.35%                | github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*batchedInvertedExprEvaluator).fragmentPendingSpans /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/inverted_expr_evaluator.go:284
                                            225019  9.42% |   sort.Slice /usr/local/go/src/sort/slice.go:15
----------------------------------------------------------+-------------
```

This commit avoids this heap allocation by assigning the slice to a
field on the batchedInvertedExprEvaluator (which was already on the
heap) and using the old `sort.Sort` function.

----

@otan an area I didn't touch but wanted to talk to you about was the
memory representation of `DGeography` and `DGeometry`. Specifically, I'm
wondering why we're embedding pointers instead of values inside of these
datums, effectively creating a double-boxing situation. I see that most
of the geo operators work off pointer and that the "canonical"
representation of the objects exported by `pkg/geo` is a pointer (e.g.
the package exports "New..." methods instead of "Make..." methods). I
don't understand why this is though. Both `Geography` and `Geometry` are
only 40 bytes large, which is well within reason to pass on the stack.

I ask because this makes working with these datums more expensive than
strictly necessary. We make an effort to batch allocate datum objects
themselves (see `DatumAlloc`), but don't do anything special with
internal heap allocations. I suspect that there would be a moderate win
here if we eliminated the double-boxing on any query that scans a large
number of rows containing one of these data types, as I'm seeing the
heap allocations in DecodeUntaggedDatum fairly prominently in profiles.

Co-authored-by: Spas Bojanov <spas@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Oliver Tan <otan@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Copy link
Copy Markdown
Contributor Author

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/sql/opt/invertedexpr/geo_expression.go, line 44 at r3 (raw file):

Previously, sumeerbhola wrote…

can you add a comment like
// Set the capacity so that the caller appending to enc does not corrupt later keys.

Done.


pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):
That's a very interesting point. My experience is that the copying cost is far outweighed by the allocator cost in these kinds of situations.

You could also make an argument that this approach is over-allocating by about 100%, in addition to the copying cost. But I think the previous approach would also average out to over-allocating by 100%. So I think that's actually a wash.

We could have changed the old code to incur no copies by over-allocating based on the worst case encoded length of a DInt (would break abstraction a bit).

The worse-case encoded length of a DInt is 9 bytes. Do you have an idea of the distribution of values for geoindex.Key? If the distribution is uniform then we'd expect to use 9 bytes 50% of the time, 8 bytes 25% of the time, etc. That would mean that over-allocating upfront would almost certainly be worth it.

The same argument could be made for geoKeyToEncInvertedVal and its callers.


pkg/sql/rowenc/index_encoding.go, line 856 at r3 (raw file):

Previously, sumeerbhola wrote…

can you add a comment like
// Set the capacity so that the caller appending to a key in keys does not corrupt later keys.

Done.

…oSpanExpr

This commit optimizes a few heap allocations in `GeoUnionKeySpansToSpanExpr` and
in `GeoRPKeyExprToSpanExpr` to avoid per-span/per-expression heap allocations.

Before this change, these heap allocations accounted for 27.52% of all heap
allocations (by object, `alloc_objects`) in a geospatial query of interest:

```
Type: alloc_objects
Time: Aug 21, 2020 at 4:59pm (UTC)
Active filters:
   focus=geoKeyToEncInvertedVal
Showing nodes accounting for 61277094, 27.52% of 222698492 total
----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------
                                          15270121 99.79% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46
                                             32768  0.21% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:45
  15302889  6.87%  6.87%   15302889  6.87%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:31
----------------------------------------------------------+-------------
                                          16187639 53.52% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46
                                          14057686 46.48% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:45
         0     0%  6.87%   30245325 13.58%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:32
                                          30245325   100% |   github.com/cockroachdb/cockroach/pkg/sql/sqlbase.EncodeTableKey /go/src/github.com/cockroachdb/cockroach/pkg/sql/sqlbase/column_type_encoding.go:75
----------------------------------------------------------+-------------
                                          15728880   100% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:46
         0     0%  6.87%   15728880  7.06%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:38
                                          15728880   100% |   github.com/cockroachdb/cockroach/pkg/roachpb.Key.PrefixEnd /go/src/github.com/cockroachdb/cockroach/pkg/roachpb/data.go:188
----------------------------------------------------------+-------------
```

These allocations were largely avoidable. This commit removes some of
them entirely and reworks the code structure in order to amortize the
unavoidable allocations over the set of keys being encoded. The result
is a large reduction in allocations. On the same workload, allocations
under `geoKeyToEncInvertedVal` dropped from **27.52%** of the workload
to **2.23%** of the workload:

```
Type: alloc_objects
Time: Aug 21, 2020 at 5:04pm (UTC)
Active filters:
   focus=geoKeyToEncInvertedVal
Showing nodes accounting for 3632555, 2.23% of 162651628 total
----------------------------------------------------------+-------------
      flat  flat%   sum%        cum   cum%   calls calls% + context
----------------------------------------------------------+-------------
                                           2014518 55.46% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:53
                                           1618037 44.54% |   github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoToSpan /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:54
         0     0%     0%    3632555  2.23%                | github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr.geoKeyToEncInvertedVal /go/src/github.com/cockroachdb/cockroach/pkg/sql/opt/invertedexpr/geo_expression.go:44
                                           3632555   100% |   github.com/cockroachdb/cockroach/pkg/util/encoding.EncodeUvarintAscending /go/src/github.com/cockroachdb/cockroach/pkg/util/encoding/encoding.go:375
----------------------------------------------------------+-------------
```

When applied on top of cockroachdb#53181, this results in a **1.71%** speedup on
the following geospatial query:

```sql
SELECT Count(census.blkid), Count(subways.name)
FROM nyc_census_blocks AS census
JOIN nyc_subway_stations AS subways
ON ST_Intersects(subways.geom, census.geom);
```

```
name                                old ms    new ms    delta
Test/postgis_geometry_tutorial/nyc  136 ±12%  134 ±11%  -1.71%  (p=0.000 n=976+977)
```

In making this change, we also fix a bug where `geoindex.Key` (a uint64) was
being cast to a `tree.DInt` (an int64) during encoding. This could result in
overflow and could cause inverted key spans to be generated.
Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)


pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):

If the distribution is uniform then we'd expect to use 9 bytes 50% of the time, 8 bytes 25% of the time, etc. That would mean that over-allocating upfront would almost certainly be worth it.

The distribution should be uniform. Geometry (not geography) will have 00 as the leading 2 bits, but that is still close enough.

…panExpr

The distribution of `geoindex.Key` values is expected to be uniform,
which means that the number of bytes used for the varint encoding is
expected to be at or near the maximum length with high probability.

This commit capitalizes on this to create reasonable estimates for
the total buffer size needed in encodeGeoKeys and GeoUnionKeySpansToSpanExpr
to avoid needing to resize the slices in these functions.
@nvb nvb force-pushed the nvanbenschoten/geoKeyToEncInvertedValOpt branch from f53a295 to 6ae5c97 Compare August 24, 2020 19:47
Copy link
Copy Markdown
Contributor Author

@nvb nvb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola)


pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):

Previously, sumeerbhola wrote…

If the distribution is uniform then we'd expect to use 9 bytes 50% of the time, 8 bytes 25% of the time, etc. That would mean that over-allocating upfront would almost certainly be worth it.

The distribution should be uniform. Geometry (not geography) will have 00 as the leading 2 bits, but that is still close enough.

Done (in a second commit). This should avoid the need to ever resize slices encodeGeoKeys or GeoUnionKeySpansToSpanExpr.

I don't have a benchmark environment set up right now. Are you ok merging this as is or would you like me to get numbers on this change first?

Copy link
Copy Markdown
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 6 of 6 files at r5.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten and @sumeerbhola)


pkg/sql/rowenc/index_encoding.go, line 850 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Done (in a second commit). This should avoid the need to ever resize slices encodeGeoKeys or GeoUnionKeySpansToSpanExpr.

I don't have a benchmark environment set up right now. Are you ok merging this as is or would you like me to get numbers on this change first?

Merging is fine.

@nvb
Copy link
Copy Markdown
Contributor Author

nvb commented Aug 24, 2020

TFTR!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 24, 2020

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 25, 2020

Build succeeded:

@craig craig bot merged commit 72cb52d into cockroachdb:master Aug 25, 2020
@nvb nvb deleted the nvanbenschoten/geoKeyToEncInvertedValOpt branch August 29, 2020 21:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants