sql: support geospatial index for index recommendation indexrec#82293
sql: support geospatial index for index recommendation indexrec#82293craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
michae2
left a comment
There was a problem hiding this comment.
Nice job! This is looking great! I love all the testcases! 💯 🚀
When we're getting close to done with this PR, I'll ask you to squash all of the commits into a single git commit. (You can do that now or later, your choice.)
Reviewed 1 of 2 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @wenyihu6)
pkg/sql/opt/indexrec/index_candidate_set.go line 384 at r4 (raw file):
} // Note: Indexes on spatial columns are now added for index recommendation.
nit: I think you can remove this comment. (But I'm glad it is true! 🙂)
pkg/sql/opt/indexrec/index_candidate_set.go line 410 at r4 (raw file):
func (ics *indexCandidateSet) addGeoSpatialIndexes(expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn) { switch expr := expr.(type) { case *memo.FunctionExpr:
From looking at GetGeoIndexRelationship in pkg/sql/opt/invertedidx/geo.go:49 it looks like there are two other expression types that can also be index-accelerated by a geospatial inverted index: *memo.BBoxCoversExpr and *memo.BBoxIntersectsExpr. Let's add those expression types to this function as well as categorizeIndexCandidates. (Sorry I didn't notice this sooner!)
52d8d9c to
60a52d7
Compare
c4ad290 to
d65fccf
Compare
624ccd3 to
2b07bb4
Compare
rytaft
left a comment
There was a problem hiding this comment.
Awesome! I agree that the test cases are great! Just a few nits below in addition to @michae2's comments
Reviewed 2 of 2 files at r5, 2 of 2 files at r6, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @wenyihu6)
-- commits line 8 at r5:
nit: change the commit message to look more like the PR message
nit: add a release note, e.g., Release note (sql change): index recommendations are now supported for spatial inverted indexes.
pkg/sql/opt/indexrec/index_candidate_set.go line 415 at r5 (raw file):
// AddGeoSpatialIndexes is used to add single-column GIN indexes to inverted candidates for // spatial functions that can be index-accelerated.
nit: comments should be <= 80 columns wide
nit: we tend to use the word "inverted" instead of "GIN" throughout the codebase. GIN is only supported for compatibility with Postgres.
nit: this function isn't specifically adding to invertedCandidates -- it adds to whatever indexCandidates map was passed in.
pkg/sql/opt/indexrec/index_candidate_set.go line 423 at r5 (raw file):
if ok { // Add arguments of the spatial function to inverted indexes for i := range expr.Args {
nit: since you're using the Child method below, I think it would be better to also use ChildCount here. For example:
for i, n := 0, expr.Args.ChildCount(); i < n; i++ {
pkg/sql/opt/indexrec/index_candidate_set.go line 417 at r7 (raw file):
switch expr := expr.(type) { case *memo.FunctionExpr: // Ensure that the function is a spatial function AND can be index-accelerated
nit: end comments with a period. (here and in other comments below)
9e7986b to
c72ac5d
Compare
c72ac5d to
becd458
Compare
rytaft
left a comment
There was a problem hiding this comment.
once you address @michae2's feedback and update the commit message. Nice work!
Reviewed 1 of 1 files at r10, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @wenyihu6)
-- commits line 4 at r10:
You can just make the commit message match the PR message exactly. I think the docs team also relies on having the release note included in the commit message
becd458 to
e4b3ba0
Compare
michae2
left a comment
There was a problem hiding this comment.
This is looking great.
I tried playing with it a bit, and couldn't get any of the examples from the test to actually show a recommendation. If you add a few index-recommendations testcases to match the index-candidates testcases, do any of them show recommendations? 🤔
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @wenyihu6)
-- commits line 11 at r11:
nit: I think you can leave off "and box2d indexes".
pkg/sql/opt/indexrec/index_candidate_set.go line 413 at r11 (raw file):
// AddGeoSpatialIndexes is used to add single-column indexes to indexCandidates // for spatial functions that can be index-accelerated. func (ics *indexCandidateSet) addGeoSpatialIndexes(expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn) {
Looks like a linter wants you to wrap this function signature. Here's the output from TeamCity:
------- Stdout: -------
=== RUN TestLint/TestCrlfmt
=== PAUSE TestLint/TestCrlfmt
=== CONT TestLint/TestCrlfmt
=== CONT TestLint/TestCrlfmt
lint_test.go:1372:
diff -u old/sql/opt/indexrec/index_candidate_set.go new/sql/opt/indexrec/index_candidate_set.go
--- old/sql/opt/indexrec/index_candidate_set.go 2022-06-02 23:22:26.562030586 +0000
+++ new/sql/opt/indexrec/index_candidate_set.go 2022-06-02 23:22:26.562030586 +0000
@@ -410,7 +410,9 @@
// AddGeoSpatialIndexes is used to add single-column indexes to indexCandidates
// for spatial functions that can be index-accelerated.
-func (ics *indexCandidateSet) addGeoSpatialIndexes(expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn) {
+func (ics *indexCandidateSet) addGeoSpatialIndexes(
+ expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn,
+) {
switch expr := expr.(type) {
case *memo.FunctionExpr:
// Ensure that the function is a spatial function AND can be index-accelerated.
lint_test.go:1387: run the following to fix your formatting:
bin/crlfmt "-fast" "-ignore" "\\.(pb(\\.gw)?)|(\\.[eo]g)\\.go|/testdata/|^sql/parser/sql\\.go$|_generated(_test)?\\.go$" "-tab" "2" "-w" "/go/src/github.com/cockroachdb/cockroach/pkg"
Don't forget to add amend the result to the correct commits.
--- FAIL: TestLint/TestCrlfmt (6.48s)
michae2
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @wenyihu6)
pkg/sql/opt/indexrec/hypothetical_index.go line 209 at r12 (raw file):
// GeoConfig is part of the cat.Index interface. // TODO(nehageorge): Add support for spatial index recommendations.
Nice catch!
nit: You can remove this TODO comment. 🙂
pkg/sql/opt/indexrec/hypothetical_table.go line 181 at r12 (raw file):
ht.ColumnCount(), tree.Name(string(invertedSourceCol.ColName())+"_inverted_key"), invertedSourceCol.DatumType(),
From looking at InvertedColumnKeyType in pkg/sql/catalog/descpb/index.go it seems like this should be types.EncodedKey rather than types.Bytes.
dd59837 to
98d8a90
Compare
mgartner
left a comment
There was a problem hiding this comment.
Reviewed 1 of 2 files at r12, 3 of 3 files at r13, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2 and @wenyihu6)
Previously, michae2 (Michael Erickson) wrote…
nit: I think you can leave off "and box2d indexes".
Also, we aim to wrap git commit messages at 72 characters: https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/73072807/Git+Commit+Messages#GitCommitMessages-Commitdescription
-- commits line 15 at r14:
nit: remove these extra titles from the commit message - they aren't necessary.
pkg/sql/opt/indexrec/index_candidate_set.go line 413 at r14 (raw file):
// AddGeoSpatialIndexes is used to add single-column indexes to indexCandidates // for spatial functions that can be index-accelerated. func (ics *indexCandidateSet) addGeoSpatialIndexes(expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn) {
The caller of this function has already asserted the type of the expr as *memo.FunctionExpr, so you can change the type of expr in the function signature to *memo.FunctionExpr. Then you can remove the switch statement inside this function.
pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 804 at r14 (raw file):
index recommendations: 1 1. type: index creation SQL command: CREATE INDEX ON t2 (bbox1) STORING (k, i, s, geom1, geog1, bbox2, geom2, geog2, inet1);
Maybe you should add a comment above this test to explain that the recommendation is a non-inverted index because an inverted index on geom1 cannot be used because the left-hand side of the ~ operator is not a constant?
pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 961 at r14 (raw file):
└── inet1:10 && inet1:10 [outer=(10), immutable] # Part 3: For spatial functions that do not use index accelerated geospatial functions, do not recommend geospatial index.
nit: wrap comments to 80 characters
pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 1081 at r14 (raw file):
index-recommendations SELECT * FROM t2 WHERE st_overlaps(geom1, geom2) AND k = 2
If you use a constant for one argument of st_overlaps you should get a more interesting recommendation like, CREATE INVERTED INDEX ON t2 (k, geom1) ..., which I think is what you are trying to test here and in the test casts below.
98d8a90 to
733b357
Compare
1d56bbc to
5c2da11
Compare
michae2
left a comment
There was a problem hiding this comment.
I have one question below, but it doesn't need to block merging. This looks good!
Reviewed 2 of 3 files at r13, 2 of 2 files at r15, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @wenyihu6)
pkg/sql/opt/indexrec/hypothetical_table.go line 181 at r12 (raw file):
Previously, michae2 (Michael Erickson) wrote…
From looking at
InvertedColumnKeyTypeinpkg/sql/catalog/descpb/index.goit seems like this should betypes.EncodedKeyrather thantypes.Bytes.
@wenyihu6 I'm a little confused, could you explain why this change was necessary? It looks like in the test catalog we use the column type but in the real catalog we always use types.EncodedKey. Without knowing more, I'd be inclined to use types.EncodedKey. But maybe I just don't understand what is going on here.
|
Previously, michae2 (Michael Erickson) wrote…
I think the test catalog is wrong, though I'm curious why that hasn't caused issues.
I think in this case we should use |
There was a problem hiding this comment.
I have one question below, but it doesn't need to block merging. This looks good!
Reviewed 2 of 3 files at r13, 2 of 2 files at r15, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @wenyihu6)
pkg/sql/opt/indexrec/hypothetical_table.goline 181 at r12 (raw file):Previously, michae2 (Michael Erickson) wrote…
@wenyihu6 I'm a little confused, could you explain why this change was necessary? It looks like in the test catalog we use the column type but in the real catalog we always usetypes.EncodedKey. Without knowing more, I'd be inclined to usetypes.EncodedKey. But maybe I just don't understand what is going on here.
I was reading the code in pkg/sql/opt/testutils/testcat/create_table.go, and it was using the data type of the inverted column instead. If inverted column should be types.EncodedKey, I will also need to add another statement in GeoConfig to find the type of the table column referenced by the inverted index.
|
cockroach/pkg/sql/opt_catalog.go Line 902 in 32622e1 cockroach/pkg/sql/catalog/table_elements.go Lines 181 to 186 in 49b6ca8 So that confirms that it should be
To get the type of the inverted column's source column, you can do something like this: srcCol := hi.tab.Column(hi.InvertedColumn().InvertedSourceColumnOrdinal())
srcCol.DatumType() |
5c2da11 to
8ce9c32
Compare
|
Previously, mgartner (Marcus Gartner) wrote…
FYI I fixed the problem in the test catalog in a separate PR: #82547 |
mgartner
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r15, 2 of 2 files at r16, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @michae2 and @wenyihu6)
pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 236 at r16 (raw file):
(geom2) # Note: geom1 or geom2 is not chosen for the query plan. When both arguments of
nit: An index on geom1 or geom2...
pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 802 at r16 (raw file):
# Part 2: Geospatial indexes should be added for Box2d operators and functions. # Note: geom1 is not chosen for the query plan. When both arguments of functions
nit: An index on geom1...
pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 832 at r16 (raw file):
└── bbox1:6 ~ geom1:4 [outer=(4,6), immutable, constraints=(/4: (/NULL - ]; /6: (/NULL - ])] # Note: geom1 is chosen for the query plan here. If one of the arguments is a
nit: An index on geom1...
8ce9c32 to
c33c76b
Compare
mgartner
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r17, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @michae2 and @wenyihu6)
Previously, we did not consider geospatial inverted index or box2d index for index recommendation. But for queries that use index-accelerated geospatial functions or box2d operations, the spatial index should be recommended. This PR added spatial index to index candidates and index recommendation engine. Fixes: cockroachdb#80934 Release note (sql change): index recommendations are now supported for spatial indexes
c33c76b to
e1fd17b
Compare
|
bors r+ |
|
Build succeeded: |
Previously, we did not consider geospatial inverted index or box2d index for
index recommendation. But for queries that use index-accelerated geospatial
functions or box2d operations, the spatial index should be recommended. This PR
added spatial index to index candidates.
Fixes: #80934
Release note (sql change): index recommendations are now supported for spatial indexes