Skip to content

sql: support geospatial index for index recommendation indexrec#82293

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
wenyihu6:whu-indexrec-geospatial
Jun 8, 2022
Merged

sql: support geospatial index for index recommendation indexrec#82293
craig[bot] merged 1 commit intocockroachdb:masterfrom
wenyihu6:whu-indexrec-geospatial

Conversation

@wenyihu6
Copy link
Copy Markdown
Contributor

@wenyihu6 wenyihu6 commented Jun 1, 2022

Previously, we did not consider geospatial inverted index or box2d index for
index recommendation. But for queries that use index-accelerated geospatial
functions or box2d operations, the spatial index should be recommended. This PR
added spatial index to index candidates.

Fixes: #80934

Release note (sql change): index recommendations are now supported for spatial indexes

@wenyihu6 wenyihu6 added the T-sql-queries SQL Queries Team label Jun 1, 2022
@wenyihu6 wenyihu6 requested a review from michae2 June 1, 2022 16:41
@wenyihu6 wenyihu6 requested a review from a team as a code owner June 1, 2022 16:41
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@wenyihu6 wenyihu6 removed the request for review from a team June 1, 2022 16:55
Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! This is looking great! I love all the testcases! 💯 🚀

When we're getting close to done with this PR, I'll ask you to squash all of the commits into a single git commit. (You can do that now or later, your choice.)

Reviewed 1 of 2 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @wenyihu6)


pkg/sql/opt/indexrec/index_candidate_set.go line 384 at r4 (raw file):

	}

	// Note: Indexes on spatial columns are now added for index recommendation.

nit: I think you can remove this comment. (But I'm glad it is true! 🙂)


pkg/sql/opt/indexrec/index_candidate_set.go line 410 at r4 (raw file):

func (ics *indexCandidateSet) addGeoSpatialIndexes(expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn) {
	switch expr := expr.(type) {
	case *memo.FunctionExpr:

From looking at GetGeoIndexRelationship in pkg/sql/opt/invertedidx/geo.go:49 it looks like there are two other expression types that can also be index-accelerated by a geospatial inverted index: *memo.BBoxCoversExpr and *memo.BBoxIntersectsExpr. Let's add those expression types to this function as well as categorizeIndexCandidates. (Sorry I didn't notice this sooner!)

@michae2 michae2 requested review from a team and mgartner June 1, 2022 17:18
@michae2 michae2 added the first-pr Use to mark the first PR sent by a contributor / team member. Reviewers should be mindful of this. label Jun 1, 2022
@wenyihu6 wenyihu6 closed this Jun 1, 2022
@wenyihu6 wenyihu6 reopened this Jun 1, 2022
@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch from 52d8d9c to 60a52d7 Compare June 1, 2022 21:09
@wenyihu6 wenyihu6 closed this Jun 2, 2022
@wenyihu6 wenyihu6 reopened this Jun 2, 2022
@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch 2 times, most recently from c4ad290 to d65fccf Compare June 2, 2022 07:50
@wenyihu6 wenyihu6 closed this Jun 2, 2022
@wenyihu6 wenyihu6 reopened this Jun 2, 2022
@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch from 624ccd3 to 2b07bb4 Compare June 2, 2022 21:38
@wenyihu6 wenyihu6 requested a review from michae2 June 2, 2022 21:50
Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! I agree that the test cases are great! Just a few nits below in addition to @michae2's comments

Reviewed 2 of 2 files at r5, 2 of 2 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @wenyihu6)


-- commits line 8 at r5:
nit: change the commit message to look more like the PR message

nit: add a release note, e.g., Release note (sql change): index recommendations are now supported for spatial inverted indexes.


pkg/sql/opt/indexrec/index_candidate_set.go line 415 at r5 (raw file):

// AddGeoSpatialIndexes is used to add single-column GIN indexes to inverted candidates for
// spatial functions that can be index-accelerated.

nit: comments should be <= 80 columns wide

nit: we tend to use the word "inverted" instead of "GIN" throughout the codebase. GIN is only supported for compatibility with Postgres.

nit: this function isn't specifically adding to invertedCandidates -- it adds to whatever indexCandidates map was passed in.


pkg/sql/opt/indexrec/index_candidate_set.go line 423 at r5 (raw file):

		if ok {
			// Add arguments of the spatial function to inverted indexes
			for i := range expr.Args {

nit: since you're using the Child method below, I think it would be better to also use ChildCount here. For example:
for i, n := 0, expr.Args.ChildCount(); i < n; i++ {


pkg/sql/opt/indexrec/index_candidate_set.go line 417 at r7 (raw file):

	switch expr := expr.(type) {
	case *memo.FunctionExpr:
		// Ensure that the function is a spatial function AND can be index-accelerated

nit: end comments with a period. (here and in other comments below)

@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch 3 times, most recently from 9e7986b to c72ac5d Compare June 2, 2022 22:36
@wenyihu6 wenyihu6 requested a review from rytaft June 2, 2022 22:36
@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch from c72ac5d to becd458 Compare June 2, 2022 22:38
Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: once you address @michae2's feedback and update the commit message. Nice work!

Reviewed 1 of 1 files at r10, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @wenyihu6)


-- commits line 4 at r10:
You can just make the commit message match the PR message exactly. I think the docs team also relies on having the release note included in the commit message

@wenyihu6 wenyihu6 changed the title sql: added geospatial inverted index for spatial functions to indexrec sql: support geospatial and box2d index for index recommendation indexrec Jun 2, 2022
@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch from becd458 to e4b3ba0 Compare June 2, 2022 22:56
Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great.

I tried playing with it a bit, and couldn't get any of the examples from the test to actually show a recommendation. If you add a few index-recommendations testcases to match the index-candidates testcases, do any of them show recommendations? 🤔

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @wenyihu6)


-- commits line 11 at r11:
nit: I think you can leave off "and box2d indexes".


pkg/sql/opt/indexrec/index_candidate_set.go line 413 at r11 (raw file):

// AddGeoSpatialIndexes is used to add single-column indexes to indexCandidates
// for spatial functions that can be index-accelerated.
func (ics *indexCandidateSet) addGeoSpatialIndexes(expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn) {

Looks like a linter wants you to wrap this function signature. Here's the output from TeamCity:

------- Stdout: -------
=== RUN   TestLint/TestCrlfmt
=== PAUSE TestLint/TestCrlfmt
=== CONT  TestLint/TestCrlfmt
=== CONT  TestLint/TestCrlfmt
    lint_test.go:1372:
        diff -u old/sql/opt/indexrec/index_candidate_set.go new/sql/opt/indexrec/index_candidate_set.go
        --- old/sql/opt/indexrec/index_candidate_set.go  2022-06-02 23:22:26.562030586 +0000
        +++ new/sql/opt/indexrec/index_candidate_set.go  2022-06-02 23:22:26.562030586 +0000
        @@ -410,7 +410,9 @@
         // AddGeoSpatialIndexes is used to add single-column indexes to indexCandidates
         // for spatial functions that can be index-accelerated.
        -func (ics *indexCandidateSet) addGeoSpatialIndexes(expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn) {
        +func (ics *indexCandidateSet) addGeoSpatialIndexes(
        +  expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn,
        +) {
           switch expr := expr.(type) {
           case *memo.FunctionExpr:
             // Ensure that the function is a spatial function AND can be index-accelerated.
    lint_test.go:1387: run the following to fix your formatting:
        bin/crlfmt "-fast" "-ignore" "\\.(pb(\\.gw)?)|(\\.[eo]g)\\.go|/testdata/|^sql/parser/sql\\.go$|_generated(_test)?\\.go$" "-tab" "2" "-w" "/go/src/github.com/cockroachdb/cockroach/pkg"
        Don't forget to add amend the result to the correct commits.
    --- FAIL: TestLint/TestCrlfmt (6.48s)

@wenyihu6 wenyihu6 closed this Jun 6, 2022
@wenyihu6 wenyihu6 reopened this Jun 6, 2022
Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @wenyihu6)


pkg/sql/opt/indexrec/hypothetical_index.go line 209 at r12 (raw file):

// GeoConfig is part of the cat.Index interface.
// TODO(nehageorge): Add support for spatial index recommendations.

Nice catch!

nit: You can remove this TODO comment. 🙂


pkg/sql/opt/indexrec/hypothetical_table.go line 181 at r12 (raw file):

		ht.ColumnCount(),
		tree.Name(string(invertedSourceCol.ColName())+"_inverted_key"),
		invertedSourceCol.DatumType(),

From looking at InvertedColumnKeyType in pkg/sql/catalog/descpb/index.go it seems like this should be types.EncodedKey rather than types.Bytes.

@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch from dd59837 to 98d8a90 Compare June 7, 2022 13:53
Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Great job!

Reviewed 1 of 2 files at r12, 3 of 3 files at r13, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2 and @wenyihu6)


-- commits line 11 at r11:

Previously, michae2 (Michael Erickson) wrote…

nit: I think you can leave off "and box2d indexes".

Also, we aim to wrap git commit messages at 72 characters: https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/73072807/Git+Commit+Messages#GitCommitMessages-Commitdescription


-- commits line 15 at r14:
nit: remove these extra titles from the commit message - they aren't necessary.


pkg/sql/opt/indexrec/index_candidate_set.go line 413 at r14 (raw file):

// AddGeoSpatialIndexes is used to add single-column indexes to indexCandidates
// for spatial functions that can be index-accelerated.
func (ics *indexCandidateSet) addGeoSpatialIndexes(expr opt.Expr, indexCandidates map[cat.Table][][]cat.IndexColumn) {

The caller of this function has already asserted the type of the expr as *memo.FunctionExpr, so you can change the type of expr in the function signature to *memo.FunctionExpr. Then you can remove the switch statement inside this function.


pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 804 at r14 (raw file):

index recommendations: 1
1. type: index creation
   SQL command: CREATE INDEX ON t2 (bbox1) STORING (k, i, s, geom1, geog1, bbox2, geom2, geog2, inet1);

Maybe you should add a comment above this test to explain that the recommendation is a non-inverted index because an inverted index on geom1 cannot be used because the left-hand side of the ~ operator is not a constant?


pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 961 at r14 (raw file):

      └── inet1:10 && inet1:10 [outer=(10), immutable]

# Part 3: For spatial functions that do not use index accelerated geospatial functions, do not recommend geospatial index.

nit: wrap comments to 80 characters


pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 1081 at r14 (raw file):


index-recommendations
SELECT * FROM t2 WHERE st_overlaps(geom1, geom2) AND k = 2

If you use a constant for one argument of st_overlaps you should get a more interesting recommendation like, CREATE INVERTED INDEX ON t2 (k, geom1) ..., which I think is what you are trying to test here and in the test casts below.

@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch from 98d8a90 to 733b357 Compare June 7, 2022 14:27
@wenyihu6 wenyihu6 changed the title sql: support geospatial and box2d index for index recommendation indexrec sql: support geospatial index for index recommendation indexrec Jun 7, 2022
@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch 3 times, most recently from 1d56bbc to 5c2da11 Compare June 7, 2022 15:39
Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong: Nice work! 🎉

I have one question below, but it doesn't need to block merging. This looks good!

Reviewed 2 of 3 files at r13, 2 of 2 files at r15, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @wenyihu6)


pkg/sql/opt/indexrec/hypothetical_table.go line 181 at r12 (raw file):

Previously, michae2 (Michael Erickson) wrote…

From looking at InvertedColumnKeyType in pkg/sql/catalog/descpb/index.go it seems like this should be types.EncodedKey rather than types.Bytes.

@wenyihu6 I'm a little confused, could you explain why this change was necessary? It looks like in the test catalog we use the column type but in the real catalog we always use types.EncodedKey. Without knowing more, I'd be inclined to use types.EncodedKey. But maybe I just don't understand what is going on here.

@mgartner
Copy link
Copy Markdown
Contributor

mgartner commented Jun 7, 2022

pkg/sql/opt/indexrec/hypothetical_table.go line 181 at r12 (raw file):

Previously, michae2 (Michael Erickson) wrote…

@wenyihu6 I'm a little confused, could you explain why this change was necessary? It looks like in the test catalog we use the column type but in the real catalog we always use types.EncodedKey. Without knowing more, I'd be inclined to use types.EncodedKey. But maybe I just don't understand what is going on here.

I think the test catalog is wrong, though I'm curious why that hasn't caused issues.

type.EncodedKey is a recent addition to better describe the type of inverted columns. We previously used type.Bytes, but that was less than ideal. There's some more context here: #77501

I think in this case we should use types.EncodedKey.

Copy link
Copy Markdown
Contributor Author

@wenyihu6 wenyihu6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong: Nice work! 🎉

I have one question below, but it doesn't need to block merging. This looks good!

Reviewed 2 of 3 files at r13, 2 of 2 files at r15, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @wenyihu6)

pkg/sql/opt/indexrec/hypothetical_table.go line 181 at r12 (raw file):

Previously, michae2 (Michael Erickson) wrote…
@wenyihu6 I'm a little confused, could you explain why this change was necessary? It looks like in the test catalog we use the column type but in the real catalog we always use types.EncodedKey. Without knowing more, I'd be inclined to use types.EncodedKey. But maybe I just don't understand what is going on here.

I was reading the code in pkg/sql/opt/testutils/testcat/create_table.go, and it was using the data type of the inverted column instead. If inverted column should be types.EncodedKey, I will also need to add another statement in GeoConfig to find the type of the table column referenced by the inverted index.

@mgartner
Copy link
Copy Markdown
Contributor

mgartner commented Jun 7, 2022

pkg/sql/opt/indexrec/hypothetical_table.go line 181 at r12 (raw file):
In the opt catalog (the real catalog), we set the inverted column type in an index to type.EncodedKey:

invertedColumnType,

// InvertedColumnKeyType returns the type of the data element that is encoded
// as the inverted index key. This is currently always EncodedKey.
//
// Panics if the index is not inverted.
InvertedColumnKeyType() *types.T

So that confirms that it should be EncodedKey here.

I will need to add another statement in GeoConfig to find the type of the column on the table referenced by the inverted index.

To get the type of the inverted column's source column, you can do something like this:

	srcCol := hi.tab.Column(hi.InvertedColumn().InvertedSourceColumnOrdinal())
	srcCol.DatumType()

@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch from 5c2da11 to 8ce9c32 Compare June 7, 2022 20:40
@mgartner
Copy link
Copy Markdown
Contributor

mgartner commented Jun 7, 2022

pkg/sql/opt/indexrec/hypothetical_table.go line 181 at r12 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

In the opt catalog (the real catalog), we set the inverted column type in an index to type.EncodedKey:

invertedColumnType,

// InvertedColumnKeyType returns the type of the data element that is encoded
// as the inverted index key. This is currently always EncodedKey.
//
// Panics if the index is not inverted.
InvertedColumnKeyType() *types.T

So that confirms that it should be EncodedKey here.

I will need to add another statement in GeoConfig to find the type of the column on the table referenced by the inverted index.

To get the type of the inverted column's source column, you can do something like this:

	srcCol := hi.tab.Column(hi.InvertedColumn().InvertedSourceColumnOrdinal())
	srcCol.DatumType()

FYI I fixed the problem in the test catalog in a separate PR: #82547

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r15, 2 of 2 files at r16, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @michae2 and @wenyihu6)


pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 236 at r16 (raw file):

 (geom2)

# Note: geom1 or geom2 is not chosen for the query plan. When both arguments of

nit: An index on geom1 or geom2...


pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 802 at r16 (raw file):

# Part 2: Geospatial indexes should be added for Box2d operators and functions.

# Note: geom1 is not chosen for the query plan. When both arguments of functions

nit: An index on geom1...


pkg/sql/opt/indexrec/testdata/geospatial-index-candidates-recommendations line 832 at r16 (raw file):

      └── bbox1:6 ~ geom1:4 [outer=(4,6), immutable, constraints=(/4: (/NULL - ]; /6: (/NULL - ])]

# Note: geom1 is chosen for the query plan here. If one of the arguments is a

nit: An index on geom1...

@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch from 8ce9c32 to c33c76b Compare June 7, 2022 21:10
Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r17, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @michae2 and @wenyihu6)

Previously, we did not consider geospatial inverted index or box2d index for
index recommendation. But for queries that use index-accelerated geospatial
functions or box2d operations, the spatial index should be recommended. This PR
added spatial index to index candidates and index recommendation engine.

Fixes: cockroachdb#80934

Release note (sql change): index recommendations are now supported for spatial indexes
@wenyihu6 wenyihu6 force-pushed the whu-indexrec-geospatial branch from c33c76b to e1fd17b Compare June 8, 2022 19:55
@cockroachdb cockroachdb deleted a comment from craig bot Jun 8, 2022
@wenyihu6
Copy link
Copy Markdown
Contributor Author

wenyihu6 commented Jun 8, 2022

bors r+

@craig craig bot merged commit 7d365bc into cockroachdb:master Jun 8, 2022
@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jun 8, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

first-pr Use to mark the first PR sent by a contributor / team member. Reviewers should be mindful of this. T-sql-queries SQL Queries Team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

opt: add geospatial inverted index recommendations

5 participants