Skip to content

sql: permit forward indexes on arrays#91762

Merged
craig[bot] merged 2 commits intocockroachdb:masterfrom
jordanlewis:forward-index-arrays
Nov 19, 2022
Merged

sql: permit forward indexes on arrays#91762
craig[bot] merged 2 commits intocockroachdb:masterfrom
jordanlewis:forward-index-arrays

Conversation

@jordanlewis
Copy link
Copy Markdown
Member

Closes #17154

This functionality has been unlocked since we added trigram indexes - the blocker before was the ability to support both forward and inverted indexes on a given type.

This commit simply permits forward indexes on arrays, where they were previously not supported at all.

Release note (sql change): enable forward indexes on arrays

@jordanlewis jordanlewis requested a review from a team November 11, 2022 18:59
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@jordanlewis jordanlewis marked this pull request as draft November 11, 2022 19:10
@jordanlewis
Copy link
Copy Markdown
Member Author

This needs to take into account enum values in partitioning columns: #17154 (comment)

@ajwerner
Copy link
Copy Markdown
Contributor

One reasonable way to sidestep the array enums would be to disallow partitioning by arrays altogether. It's not clear to me that the partitioning code knows how to think about arrays. We'd want to add testing to cover both the simpler case of primitive arrays and then the more complex case of udts in arrays.

Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool!

Would be nice to have some test cases in pkg/sql/logictest/testdata/logic_test/distsql_stats creating stats on an indexed array column. Maybe a case where there is both a forward and an inverted index on the same array column.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@jordanlewis
Copy link
Copy Markdown
Member Author

jordanlewis commented Nov 11, 2022

TFTR! Added a bunch of tests.

There's a bit of an annoying problem: the index recommendation engine doesn't properly handle types that are both inverted and forward indexable.

In particular, if a type is forward indexable, an inverted index will never be recommended even if it's possible and optimal to create one.

For now, I'm going to propose that we accept the regression (array inverted indexes will no longer be recommended). We already do not recommend trigram inverted indexes for this same reason.

I've opened #91777 to track this problem. It's a little bit annoying and will require some light refactoring of the index recommender, which currently uses data structures that do not permit adding the required extra bit (inverted or not) without changing the datatype (which is used in a bunch of different places).

@jordanlewis
Copy link
Copy Markdown
Member Author

Also, I added a commit that disables partitioning tables and indexes by array columns to avoid the problem that @ajwerner was talking about.

PTAL.

@jordanlewis jordanlewis marked this pull request as ready for review November 11, 2022 22:03
@jordanlewis jordanlewis requested review from a team as code owners November 11, 2022 22:03
If partitioning on array columns were supported, we'd need to make sure
to check the partition columns for enum types when enums are edited -
this is currently not implemented.

It's not clear if users want to be able to partition tables that are
indexed on arrays, so let's just not support it for the time being to
avoid the trouble.

Release note: None
Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 5 files at r2, 8 of 8 files at r4, 7 of 7 files at r5, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis)


pkg/sql/randgen/schema.go line 572 at r4 (raw file):

	// TODO(harding): Allow partitioning the primary index. This will require
	// massaging the syntax.
	if !isMultiRegion && !isPrimaryIndex && !partitioningNotSupported && len(prefix) > 0 && rng.Intn(10) == 0 {

Thank you for preemptively fixing randgen!


pkg/sql/logictest/testdata/logic_test/alter_table line 2674 at r5 (raw file):


statement ok
CREATE TABLE t1_non_indexable (n INT8);

nit: the name "t1_non_indexable" doesn't make sense any more 🙂 maybe "t1_indexable"?


pkg/sql/logictest/testdata/logic_test/array line 1461 at r5 (raw file):

CREATE INDEX i ON t(x DESC)

# Add "t@i" index annotation once #50659 is fixed.

nit: We should get rid of this todo (and probably do it, right? I guess we do this below, so maybe it's not necessary here).


pkg/sql/logictest/testdata/logic_test/array line 1469 at r5 (raw file):

{NULL}

# Add "t@i" index annotation once #50659 is fixed.

This one as well.


pkg/sql/logictest/testdata/logic_test/distsql_stats line 1946 at r5 (raw file):

ORDER BY statistics_name, column_names::STRING
----
NULL  {a}      10000  true

Should there be two rows for {a} here, one with the forward histogram and one with the inverted histogram?

Hmm, now that I'm trying this with strings it looks like we only get the forward histogram when both indexes exist. I remember in #84927 you fixed statistics_builder to handle coexistence of both types of histogram, so I assumed ANALYZE would create both, but maybe that's not the case?

Well, this doesn't seem like a blocker anyway. We can fix this in another PR.

Copy link
Copy Markdown
Member Author

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2)


pkg/sql/randgen/schema.go line 572 at r4 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Thank you for preemptively fixing randgen!

Thankfully, this was actually caught by some randgen tests! No way I would have remembered otherwise.


pkg/sql/logictest/testdata/logic_test/alter_table line 2674 at r5 (raw file):

Previously, michae2 (Michael Erickson) wrote…

nit: the name "t1_non_indexable" doesn't make sense any more 🙂 maybe "t1_indexable"?

Fixed!


pkg/sql/logictest/testdata/logic_test/array line 1461 at r5 (raw file):

Previously, michae2 (Michael Erickson) wrote…

nit: We should get rid of this todo (and probably do it, right? I guess we do this below, so maybe it's not necessary here).

Thanks for noticing this. Removed the TODOs and added the hints. I guess it's a bit redundant but these also test range scans a little bit so I'll leave these cases.


pkg/sql/logictest/testdata/logic_test/array line 1469 at r5 (raw file):

Previously, michae2 (Michael Erickson) wrote…

This one as well.

Done.


pkg/sql/logictest/testdata/logic_test/distsql_stats line 1946 at r5 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Should there be two rows for {a} here, one with the forward histogram and one with the inverted histogram?

Hmm, now that I'm trying this with strings it looks like we only get the forward histogram when both indexes exist. I remember in #84927 you fixed statistics_builder to handle coexistence of both types of histogram, so I assumed ANALYZE would create both, but maybe that's not the case?

Well, this doesn't seem like a blocker anyway. We can fix this in another PR.

Yeah, this appears to be broken.

Since it's not a regression, I'll file a new issue.

Copy link
Copy Markdown
Member Author

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2)


pkg/sql/logictest/testdata/logic_test/distsql_stats line 1946 at r5 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

Yeah, this appears to be broken.

Since it's not a regression, I'll file a new issue.

#92036

Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis)


pkg/sql/logictest/testdata/logic_test/distsql_stats line 1946 at r5 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

#92036

Thank you!

@jordanlewis
Copy link
Copy Markdown
Member Author

Thanks for the review!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 17, 2022

Build failed:

This functionality has been unlocked since we added trigram indexes -
the blocker before was the ability to support both forward and inverted
indexes on a given type.

This commit simply permits forward indexes on arrays, where they were
previously not supported at all.

Release note (sql change): enable forward indexes on arrays
@jordanlewis
Copy link
Copy Markdown
Member Author

A randomized test failed. I needed to teach the random schema generator that arrays with types that are only inverted-indexable are themselves only inverted-indexable.

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 19, 2022

Build succeeded:

@craig craig bot merged commit b0daa5f into cockroachdb:master Nov 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: allow ARRAYs in forward indexes

4 participants