Skip to content

sql: notices for NotVisible Indexes#85354

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
wenyihu6:4-add-optimizer-logic&test-new
Aug 31, 2022
Merged

sql: notices for NotVisible Indexes#85354
craig[bot] merged 1 commit intocockroachdb:masterfrom
wenyihu6:4-add-optimizer-logic&test-new

Conversation

@wenyihu6
Copy link
Copy Markdown
Contributor

@wenyihu6 wenyihu6 commented Jul 30, 2022

Optimizer now supports creating invisible indexes after this
PR. An important use case
for not visible indexes is to test the behaviour of dropping an index by marking
the index invisible. However, there are certain cases where users cannot expect
dropping an index to behave exactly the same as marking an index invisible. More
specifically, NotVisible indexes may still be used to police unique or foreign
key constraint check behind the scene. In those cases, dropping the index might
behave different from marking the index invisible. Prior to this commit, users
do not know about this without reading the documentation. This commit adds some
user-friendly notices when users are dropping or changing a not visible index
that might be helpful for constraint check.

There are two cases where we are giving this notice: 1. if this index is unique.
2. if this index is on child table and may help with FK check.

More details on how this decision was made in
docs/RFCS/20220628_invisible_index.md.

Assists: #72576

See also: #85794

Release justification: low risk to the existing functionality; this commit just
adds notices.

Release note: none

@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch from 598a6bd to ee32fd4 Compare July 30, 2022 00:21
wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Jul 30, 2022
This commit adds a flag to ScanFlags to indicate when the invisible index
feature should be enabled in buildScan. The invisible index feature should be
enabled by default but temporarily disabled during any unique or foreign key
constraint check. More details on how the decision was made and which flag to
pass in `docs/RFCS/20220628_invisible_index.md`. Note that no logic has been
added to the optimizer yet. This commit just passes the correct flag to
buildScan and shows the effect of the flag in the output of test data.

Assists: cockroachdb#72576

See also: cockroachdb#85354

Release note: none
@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch from ee32fd4 to dabf081 Compare July 30, 2022 01:02
@wenyihu6 wenyihu6 marked this pull request as ready for review July 30, 2022 01:03
@wenyihu6 wenyihu6 requested a review from a team as a code owner July 30, 2022 01:03
@wenyihu6 wenyihu6 requested review from a team and rhu713 July 30, 2022 01:03
@wenyihu6
Copy link
Copy Markdown
Contributor Author

The first commit comes from #84912. The second commit comes from #85239.

@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch 2 times, most recently from 00f42fc to 4240038 Compare August 1, 2022 10:19
@wenyihu6 wenyihu6 requested review from mgartner and michae2 August 1, 2022 14:06
@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch from 4240038 to d6cdc75 Compare August 2, 2022 13:30
@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch 2 times, most recently from 7b009ce to 6ecef23 Compare August 2, 2022 13:45
@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch 3 times, most recently from b8f2b1b to 804563c Compare August 2, 2022 16:30
@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch from 804563c to 259a7a1 Compare August 2, 2022 16:55
@wenyihu6
Copy link
Copy Markdown
Contributor Author

wenyihu6 commented Aug 9, 2022

pkg/sql/create_index.go line 745 at r12 (raw file):

Previously, michae2 (Michael Erickson) wrote…

But based on the code, it is checking if originColIDs is the prefix of the index's KeyColumnIDs. I'm not sure if that's what we want. Consider the following case: a, b might not be a, c, b's prefix, but it's still useful for constraint check that reference column a, b.

Great example. Yes, looks like IsValidOriginIndex will miss some useful indexes. (I suspect it was written back when FK checks were more picky about the indexes they used.)

If originColIDs is not an ordered list but just in the same order that the user specified, this might cause problems mentioned in the screenshot.

Yes, I think it's just the order the user specified.

I'm not exactly sure how IsValidOriginIndex and HasPrefix are being used right now; I took a quick look at it, but I need more time to understand better.

Looks like IsValidOriginIndex is currently only used in one place, this check in DROP INDEX:

// Check for foreign key mutations referencing this index.
for _, m := range tableDesc.Mutations {
if c := m.GetConstraint(); c != nil &&
c.ConstraintType == descpb.ConstraintToUpdate_FOREIGN_KEY &&
// If the index being deleted could be used as a index for this outbound
// foreign key mutation, then make sure that we have another index that
// could be used for this mutation.
idx.IsValidOriginIndex(c.ForeignKey.OriginColumnIDs) &&
!indexHasReplacementCandidate(func(idx catalog.Index) bool {
return idx.IsValidOriginIndex(c.ForeignKey.OriginColumnIDs)
}) {
return pgerror.Newf(pgcode.ObjectNotInPrerequisiteState,
"referencing constraint %q in the middle of being added, try again later", c.ForeignKey.Name)
}
}

I'm not sure if this is supposed to happen. When the table gets really large, will it ever favour the the secondary index and perform a look up join?

Sometimes it will. Here's an example: https://gist.github.com/michae2/2bf255b183b529bea28f25aa82b5f561 which results in this for me:

                                                 info
------------------------------------------------------------------------------------------------------
  distribution: local
  vectorized: true

  • root
  │
  ├── • update
  │   │ table: p
  │   │ set: y
  │   │
  │   └── • buffer
  │       │ label: buffer 1
  │       │
  │       └── • render
  │           │
  │           └── • filter
  │               │ estimated row count: 1
  │               │ filter: y = 3
  │               │
  │               └── • scan
  │                     estimated row count: 100 (100% of the table; stats collected 20 minutes ago)
  │                     table: p@p_pkey
  │                     spans: FULL SCAN
  │
  └── • constraint-check
      │
      └── • error if rows
          │
          └── • lookup join (semi)
              │ estimated row count: 1
              │ table: t@t_pkey
              │ equality: (rowid) = (rowid)
              │ equality cols are key
              │ pred: x = a
              │
              └── • lookup join
                  │ estimated row count: 1
                  │ table: t@t_b_idx
                  │ equality: (y) = (b)
                  │
                  └── • except
                      │ estimated row count: 1
                      │
                      ├── • scan buffer
                      │     estimated row count: 1
                      │     label: buffer 1
                      │
                      └── • scan buffer
                            estimated row count: 1
                            label: buffer 1

So I don't think we need KeyColumnIDs to cover originColIDs. An intersection will work, as you've done. Or checking that the prefix of KeyColumnIDs is in originColIDs would also work, as @mgartner pointed out.

Sorry if I am making things really confusing here.

Not confusing at all! This is great! 🙂

Thanks a lot for the example! That cleared things up. The rest sounds good to me as well. We can discuss this in more details during our meeting tomorrow.

@wenyihu6
Copy link
Copy Markdown
Contributor Author

wenyihu6 commented Aug 9, 2022

I left a few comments on the notice-related logic that might be helpful, but we should talk about it IRL.

Great work on this! Nice to see everything coming together!

All the NOTICE-related changes would be a good candidate to break into a separate commit or PR. It's up to you whether you want to split them up now, but something to keep in mind for future PRs.

I will keep in mind. Sure! I can split them up; I will keep this PR for notice related changes and open a new PR with changes in the optimizer.

@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch from 8549ad2 to c4b1d96 Compare August 9, 2022 02:05
@wenyihu6 wenyihu6 changed the title sql: add not visible index to optimizer sql: notices for NotVisible Indexes Aug 9, 2022
@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch from c4b1d96 to 0e667cb Compare August 9, 2022 02:11
@wenyihu6 wenyihu6 closed this Aug 9, 2022
@wenyihu6 wenyihu6 deleted the 4-add-optimizer-logic&test-new branch August 9, 2022 02:13
@wenyihu6 wenyihu6 restored the 4-add-optimizer-logic&test-new branch August 9, 2022 02:32
@wenyihu6 wenyihu6 reopened this Aug 9, 2022
@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch from 0e667cb to 6c6eaab Compare August 9, 2022 02:46
@wenyihu6 wenyihu6 marked this pull request as draft August 12, 2022 11:55
@wenyihu6 wenyihu6 force-pushed the 4-add-optimizer-logic&test-new branch from 6c6eaab to 5fcecc7 Compare August 22, 2022 17:04
@wenyihu6
Copy link
Copy Markdown
Contributor Author

wenyihu6 commented Aug 22, 2022

I updated the logic for notices. I'm not sure if we want this for this release. But I'm happy to close this PR, and you can decide if this is needed later on. @mgartner @michae2

@michae2
Copy link
Copy Markdown
Collaborator

michae2 commented Aug 22, 2022

I'm in favor of getting this into the release, WDYT @mgartner?

@wenyihu6
Copy link
Copy Markdown
Contributor Author

pkg/sql/catalog/descpb/index.go line 73 at r14 (raw file):

Previously, wenyihu6 (Wenyi Hu) wrote…

That's a good idea. We should also look into where HasPrefix are used. Their use cases have been really similar to the logic in IsValidOriginIndex .

I ended up leaving IsValidOriginIndex just as the way it is because I was worried about breaking the logic now. But we can update this for future release if needed. I will create an issue for this.

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of getting this into the release, WDYT @mgartner?

Yes, let's try to get it in.

Reviewed 1 of 37 files at r8, 16 of 26 files at r9, 1 of 10 files at r10, 1 of 9 files at r12, 46 of 56 files at r13, 2 of 11 files at r16, 12 of 12 files at r17, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @rhu713, and @wenyihu6)


pkg/sql/catalog/descpb/index.go line 73 at r14 (raw file):

Previously, wenyihu6 (Wenyi Hu) wrote…

That's a good idea. We should also look into where HasPrefix are used. Their use cases have been really similar to the logic in IsValidOriginIndex .

The error returned in that DROP INDEX case is also confusing to me. I'm questioning


pkg/sql/catalog/descpb/index.go line 67 at r17 (raw file):

}

// IsHelpfulOriginIndex returns whether the index may become a helpful index for

nit: "become" => "be"


pkg/sql/catalog/descpb/index.go line 72 at r17 (raw file):

// column covered by the index is present in FK constraint columns.
func (desc *IndexDescriptor) IsHelpfulOriginIndex(originColIDs ColumnIDs) bool {
	return !desc.IsPartial() && ColumnIDs(desc.KeyColumnIDs).HasFirstElementIn(originColIDs)

nit: HasFirstElementIn is so simple I don't think a helper function is necessary. You can replace with len(desc.KeyColumnIDs) > 0 && originColIDs.Contains(desc.KeyColumnIDs[0]).


pkg/sql/catalog/tabledesc/validate.go line 780 at r17 (raw file):

	index catalog.Index, tableDesc catalog.TableDescriptor,
) pgnotice.Notice {
	noticeStr := "not visible indexes may still used to police constraint check"

nit: police => enforce

Can we have more specific notices for two cases. For example:

  1. "some queries may read from a unique, not visible index to enforce uniqueness of the columns"
  2. "some queries may read from a not visible index to enforce foreign key constraints"

pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 1390 at r17 (raw file):


query T noticetrace
CREATE UNIQUE INDEX idx_invisible_unique ON t1(p) VISIBLE

A statement ok for the CREATE TABLE and CREATE UNIQUE INDEX makes more sense here since you're not testing the notices of them. You can also combine them into the same statement (a single CREATE TABLE) or statement ok block.


pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 1403 at r17 (raw file):

DROP INDEX idx_invisible_unique
----
NOTICE: not visible indexes may still used to police constraint check

This notice seems out of place - it doesn't make sense in this context. A dropped index cannot be used to enforce a constraint.


pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 1422 at r17 (raw file):

DROP TABLE t1

query T noticetrace

statement ok


pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 1426 at r17 (raw file):

----

query T noticetrace

statement ok


pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 1427 at r17 (raw file):


query T noticetrace
CREATE TABLE child (c INT PRIMARY KEY, p INT NOT NULL REFERENCES parent(p), INDEX c_idx_invisible_helpful (p, c) NOT VISIBLE, INDEX c_idx_invisible_not_helpful (c, p) NOT VISIBLE)

Can you break this up into a multi-line statement to make it easier to read?


pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 1440 at r17 (raw file):

DROP INDEX c_idx_invisible_helpful
----
NOTICE: not visible indexes may still used to police constraint check

This notice is confusing in this context.

@wenyihu6
Copy link
Copy Markdown
Contributor Author

pkg/sql/catalog/tabledesc/validate.go line 780 at r17 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: police => enforce

Can we have more specific notices for two cases. For example:

  1. "some queries may read from a unique, not visible index to enforce uniqueness of the columns"
  2. "some queries may read from a not visible index to enforce foreign key constraints"

These two cases might not be exclusive since a unique index on a parent table might be useful for both uniqueness or foreign key check. The new changes include three types notices:

  1. unique index on a parent table that can be helpful for FK check and uniqueness check
  2. other unique indexes that can only be helpful for uniqueness check
  3. other indexes on a child table that can be helpful for FK check

A thing to note --- we are giving out these notices when users are changing an existing index to not visible or dropping an invisible index but not when users execute things like alter table ... add constraint. It is possible that the user adds a foreign key constraint later on. These invisible indexes might not be useful now but may become useful later.

@wenyihu6
Copy link
Copy Markdown
Contributor Author

pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 300 at r14 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Is this supposed to be NOT VISIBLE?

Done.

@wenyihu6
Copy link
Copy Markdown
Contributor Author

pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 1440 at r17 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

This notice is confusing in this context.

Are they better now?

@wenyihu6
Copy link
Copy Markdown
Contributor Author

pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 724 at r14 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

If you use regular EXPLAIN without (OPT) it'll print information about the cascade which would be useful for these tests.

Done.

@wenyihu6
Copy link
Copy Markdown
Contributor Author

pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 302 at r14 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I think this test is supposed to check that the partial NOT VISIBLE index is not used.

Done.

Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should wait for @mgartner before checking this in. My opinions are:

  • I find the notices on DROP confusing. It seems like at the time of dropping it's already too late for a notice. My preference would be to only have these notices on ALTER.
  • I think the notices are a bit too verbose. I'd prefer it if we got rid of the "dropping this index may be different from marking this index not visible;" part.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @rhu713, and @wenyihu6)

@wenyihu6
Copy link
Copy Markdown
Contributor Author

Thanks for the reply! Yeah. We should wait for Marcus's stamp; he mentioned that he wants to have another look before merging it in our last meeting.

@wenyihu6
Copy link
Copy Markdown
Contributor Author

wenyihu6 commented Aug 26, 2022

Possible notice messages:

We need to decide when we want to give notices

  1. when ALTER INDEX... NOT VISIBLE
  2. when DROP INDEX...
  3. Both

Also, we need to decide what notices to give
There are three cases where invisible indexes might be different from dropping the index.

  1. Unique index to enforce uniqueness
  2. Index on child table to enforce FK
  3. Unique index on parent table to enforce uniqueness or FK

We can give out the same message for different cases:

  1. queries may still use not visible indexes to enforce constraints check
    Or give different messages for different cases:
  2. queries may still use not visible indexes to enforce uniqueness of columns
    queries may still use not visible indexes to enforce foreign key constraints
    queries may still use not visible indexes to enforce unique or foreign key constraints

In addition to the message above, we need to decide if we want to add the following message in front:

  1. dropping this index may be different from marking this index not visible
  2. Or not

My preference:

  1. Only give notices for ALTER INDEX since if they want to test the behavior of dropping an index, they will need to alter the index visible before dropping it. And this notice is already too late when they are dropping the index.
  2. If we are giving notices only for ALTER INDEX, I prefer "queries may still use not visible indexes to enforce constraints check"
  3. If we are giving notices for both ALTER INDEX and DROP INDEX, I prefer
    "dropping this index may be different from marking this index not visible because queries may still use not visible indexes to enforce constraints check"
    Giving out different messages for different cases is nice, but getting different messages executing the same SQL statement might lead to confusion. By the time the users are executing ALTER INDEX …, they might not remember if that index was unique or on parent or on child table any more. We can write “please read documentation[insert link] for more details” as well.

@mgartner
Copy link
Copy Markdown
Contributor

I agree with you both. My preference is a simple notice for ALTER INDEX ... NOT VISIBLE only with a message like "queries may still use not visible indexes to enforce unique and foreign key constraints".

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! :lgtm:

Reviewed 1 of 5 files at r18.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @rhu713, and @wenyihu6)

Optimizer now supports creating invisible indexes after this
[PR](cockroachdb#85794). An important use case
for not visible indexes is to test the behaviour of dropping an index by marking
the index invisible. However, there are certain cases where users cannot expect
dropping an index to behave exactly the same as marking an index invisible. More
specifically, NotVisible indexes may still be used to police unique or foreign
key constraint check behind the scene. In those cases, dropping the index might
behave different from marking the index invisible. Prior to this commit, users
do not know about this without reading the documentation. This commit adds some
user-friendly notices when users are dropping or changing a not visible index
that might be helpful for constraint check.

There are two cases where we are giving this notice: 1. if this index is unique.
2. if this index is on child table and may help with FK check.

More details on how this decision was made in
docs/RFCS/20220628_invisible_index.md.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release justification: low risk to the existing functionality; this commit just
adds notices.

Release note: none
@wenyihu6
Copy link
Copy Markdown
Contributor Author

TFTRs!

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 31, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants