Skip to content

sql: add not visible index to optimizer#85794

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
wenyihu6:4-add-optimizer
Aug 13, 2022
Merged

sql: add not visible index to optimizer#85794
craig[bot] merged 1 commit intocockroachdb:masterfrom
wenyihu6:4-add-optimizer

Conversation

@wenyihu6
Copy link
Copy Markdown
Contributor

@wenyihu6 wenyihu6 commented Aug 9, 2022

This commit adds the logic of the invisible index feature to the optimizer.
After this commit has been merged, the invisible index feature should be fully
functional with CREATE INDEX and CREATE TABLE.

Tested with TPC-H: marking indexes invisible produced identical query plans as dropping indexes.

Screen Shot 2022-09-09 at 1 24 03 PM

Assists: #72576

See also: #85239

Release note (sql change): creating a not visible index using
CREATE TABLE …(INDEX … NOT VISIBLE) or
CREATE INDEX … NOT VISIBLE is now supported.

@wenyihu6 wenyihu6 requested a review from a team August 9, 2022 01:43
@wenyihu6 wenyihu6 requested a review from a team as a code owner August 9, 2022 01:43
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

@wenyihu6 wenyihu6 requested review from mgartner and michae2 August 9, 2022 01:43
@wenyihu6
Copy link
Copy Markdown
Contributor Author

wenyihu6 commented Aug 9, 2022

I split the other PR #85354. This PR contains changes in the optimizer, and the other PR only has notice related code.

wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Aug 9, 2022
Optimizer now supports creating invisible indexes after this [PR](cockroachdb#85794). An important
use case for not visible indexes is to test the behaviour of dropping an index;
users can mark the index as invisible first, wait for a few weeks to measure the
impact, and then drop the index if no drop in performance is observed. However,
there are certain cases where users cannot expect dropping an index to behave
exactly the same as marking an index invisible. More specifically, NotVisible
indexes may still be used to police unique or foreign key constraint check
behind the scene. In those cases, dropping the index might behave different from
marking the index invisible. Prior to this commit, users do not know about this
without reading the documentation. This commit adds some user-friendly notices
when users are dropping a not visible index that might be helpful for constraint
check.

There are two cases where we are giving this notice:

More details on how this decision was made in
docs/RFCS/20220628_invisible_index.md.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release note (sql change): when users drop a not visible index that might be
used for constraint check, there will be a user-friendly notice.
Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2 and @wenyihu6)


pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 192 at r1 (raw file):

# After making idx_v_invisible invisible, SELECT ignores idx_v_invisible.
query T
EXPLAIN SELECT v FROM t1 WHERE v = 2

nit: add a test just like this with the index forced - you already have a similar tests for partial indexes below, but I think it'd be nice to have one for non-partial indexes here.


pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 391 at r1 (raw file):

#################################################################################
# These tests check for some other cases called from buildDataSource (buildJoin,
# buildInputForUpdate, buildFromWithLateral).

I'm not sure I understand the motivation for testing based on the sources. These functions should end up calling buildScan in the optbuilder phase. A more interesting test might be testing lookup join, since they also iterate over seconary indexes. For example:

query T
EXPLAIN (VERBOSE) SELECT * FROM abc JOIN def ON f = b
----
distribution: local
vectorized: true
·
• lookup join (inner)
│ columns: (a, b, c, d, e, f)
│ estimated row count: 99
│ table: def@def_pkey
│ equality: (b) = (f)
└── • scan
columns: (a, b, c)
estimated row count: 100 (100% of the table; stats collected <hidden> ago)
table: abc@abc_pkey
spans: FULL SCAN

Generation of zigzag joins and merge joins also iterate over secondary indexes, so those could be interesting tests too.


pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 793 at r1 (raw file):

# c_update_idx_invisible is used for constraint check (delete on a parent table requires FK check).
query T
EXPLAIN DELETE FROM parent where p = 2

There's a delete cascade fast path that we should test here. Notice the slight difference in the two query plans below. The first's fk-cascade does not scan buffer 1:

defaultdb> create table p (p int primary key, i int);
CREATE TABLE

defaultdb> create table c (c int primary key, p int references
p(p) on delete cascade);
CREATE TABLE

defaultdb> explain delete from p where p = 1;
           info
--------------------------
  distribution: local
  vectorized: true

  • root
  │
  ├── • delete range
  │     from: p
  │     spans: [/1 - /1]
  │
  └── • fk-cascade
        fk: c_p_fkey
(11 rows)

defaultdb> explain delete from p where p = 1 and i = 2;
                   info
------------------------------------------
  distribution: local
  vectorized: true

  • root
  │
  ├── • delete
  │   │ from: p
  │   │
  │   └── • buffer
  │       │ label: buffer 1
  │       │
  │       └── • filter
  │           │ filter: i = 2
  │           │
  │           └── • scan
  │                 missing stats
  │                 table: p@p_pkey
  │                 spans: [/1 - /1]
  │
  └── • fk-cascade
        fk: c_p_fkey
        input: buffer 1

I think you'll need to run this delete before adding the child_update table to get the delete fast path to trigger.


pkg/sql/opt/xform/scan_index_iter.go line 230 at r1 (raw file):

		} else {
			// If we are not forcing any specific index and not visible index feature is
			// enabled here, ignore not visible indexes only.

nit: remove "only"

@wenyihu6
Copy link
Copy Markdown
Contributor Author

pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 192 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: add a test just like this with the index forced - you already have a similar tests for partial indexes below, but I think it'd be nice to have one for non-partial indexes here.

I think there is a case like this below (the first case under force index)

EXPLAIN SELECT v FROM t1@idx_v_invisible WHERE v = 2

@wenyihu6
Copy link
Copy Markdown
Contributor Author

wenyihu6 commented Aug 11, 2022

pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 391 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I'm not sure I understand the motivation for testing based on the sources. These functions should end up calling buildScan in the optbuilder phase. A more interesting test might be testing lookup join, since they also iterate over seconary indexes. For example:

query T
EXPLAIN (VERBOSE) SELECT * FROM abc JOIN def ON f = b
----
distribution: local
vectorized: true
·
• lookup join (inner)
│ columns: (a, b, c, d, e, f)
│ estimated row count: 99
│ table: def@def_pkey
│ equality: (b) = (f)
└── • scan
columns: (a, b, c)
estimated row count: 100 (100% of the table; stats collected <hidden> ago)
table: abc@abc_pkey
spans: FULL SCAN

Generation of zigzag joins and merge joins also iterate over secondary indexes, so those could be interesting tests too.

My initial intention was that we have covered most of places where buildScan was called now. buildDataSource was the only function where some cases were not covered like (buildJoin, buildFromWithLateral). I've removed these cases and added more cases to cover join (merge join, lookup join, cross join, inverted join) and one case for lateral subquery

@wenyihu6
Copy link
Copy Markdown
Contributor Author

pkg/sql/opt/exec/execbuilder/testdata/not_visible_index line 793 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

There's a delete cascade fast path that we should test here. Notice the slight difference in the two query plans below. The first's fk-cascade does not scan buffer 1:

defaultdb> create table p (p int primary key, i int);
CREATE TABLE

defaultdb> create table c (c int primary key, p int references
p(p) on delete cascade);
CREATE TABLE

defaultdb> explain delete from p where p = 1;
           info
--------------------------
  distribution: local
  vectorized: true

  • root
  │
  ├── • delete range
  │     from: p
  │     spans: [/1 - /1]
  │
  └── • fk-cascade
        fk: c_p_fkey
(11 rows)

defaultdb> explain delete from p where p = 1 and i = 2;
                   info
------------------------------------------
  distribution: local
  vectorized: true

  • root
  │
  ├── • delete
  │   │ from: p
  │   │
  │   └── • buffer
  │       │ label: buffer 1
  │       │
  │       └── • filter
  │           │ filter: i = 2
  │           │
  │           └── • scan
  │                 missing stats
  │                 table: p@p_pkey
  │                 spans: [/1 - /1]
  │
  └── • fk-cascade
        fk: c_p_fkey
        input: buffer 1

I think you'll need to run this delete before adding the child_update table to get the delete fast path to trigger.

I think this case is triggering fast path because the filter on p got transferred to the cascade constraint. I added another case to cover when fast path is not triggered.

Copy link
Copy Markdown
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong: 💯 Nice work, Wenyi!

Reviewed 2 of 4 files at r1, 1 of 1 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @wenyihu6)

This commit adds the logic of the invisible index feature to the optimizer.
After this commit has been merged, the invisible index feature should be fully
functional with `CREATE INDEX` and `CREATE TABLE`.

Assists: cockroachdb#72576

See also: cockroachdb#85239

Release note (sql change): creating a not visible index using
`CREATE TABLE …(INDEX … NOT VISIBLE)` or
`CREATE INDEX … NOT VISIBLE` is now supported.
wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Aug 12, 2022
This commit adds parsing support for ALTER INDEX … [VISIBLE | NOT VISIBLE].
Executing the command returns an `unimplemented error`.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release note (sql change): Parser now supports altering an index to visible or
not visible. But no implementation has done yet, and executing it returns an
“unimplemented” error immediately.

# Conflicts:
#	pkg/sql/sem/tree/stmt.go
@wenyihu6
Copy link
Copy Markdown
Contributor Author

wenyihu6 commented Aug 12, 2022

TFTRs!

bors r+

wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Aug 12, 2022
This commit adds parsing support for ALTER INDEX … [VISIBLE | NOT VISIBLE].
Executing the command returns an `unimplemented error`.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release note (sql change): Parser now supports altering an index to visible or
not visible. But no implementation has done yet, and executing it returns an
“unimplemented” error immediately.

# Conflicts:
#	pkg/sql/sem/tree/stmt.go
wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Aug 12, 2022
This commit adds parsing support for ALTER INDEX … [VISIBLE | NOT VISIBLE].
Executing the command returns an `unimplemented error`.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release note (sql change): Parser now supports altering an index to visible or
not visible. But no implementation has done yet, and executing it returns an
“unimplemented” error immediately.

# Conflicts:
#	pkg/sql/sem/tree/stmt.go
@wenyihu6
Copy link
Copy Markdown
Contributor Author

retrying ...

bors r+

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 13, 2022

Build failed (retrying...):

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Aug 13, 2022

Build succeeded:

@craig craig bot merged commit 0dd438d into cockroachdb:master Aug 13, 2022
wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Aug 13, 2022
This commit adds parsing support for ALTER INDEX … [VISIBLE | NOT VISIBLE].
Executing the command returns an `unimplemented error`.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release note (sql change): Parser now supports altering an index to visible or
not visible. But no implementation has done yet, and executing it returns an
“unimplemented” error immediately.

# Conflicts:
#	pkg/sql/sem/tree/stmt.go
craig bot pushed a commit that referenced this pull request Aug 13, 2022
85473: sql: add ALTER INDEX … NOT VISIBLE to parser r=wenyihu6 a=wenyihu6

This commit adds parsing support for ALTER INDEX … [VISIBLE | NOT VISIBLE].
Executing the command returns an `unimplemented error`.

Assists: #72576

See also: #85794

Release note (sql change): Parser now supports altering an index to visible or
not visible. But no implementation has done yet, and executing it returns an
“unimplemented” error immediately.

Co-authored-by: wenyihu3 <wenyi.hu@cockroachlabs.com>
wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Aug 22, 2022
Optimizer now supports creating invisible indexes after this
[PR](cockroachdb#85794). An important use case
for not visible indexes is to test the behaviour of dropping an index by marking
the index invisible. However, there are certain cases where users cannot expect
dropping an index to behave exactly the same as marking an index invisible. More
specifically, NotVisible indexes may still be used to police unique or foreign
key constraint check behind the scene. In those cases, dropping the index might
behave different from marking the index invisible. Prior to this commit, users
do not know about this without reading the documentation. This commit adds some
user-friendly notices when users are dropping or changing a not visible index
that might be helpful for constraint check.

There are two cases where we are giving this notice: 1. if this index is unique.
2. if this index is on child table and may help with FK check.

More details on how this decision was made in
docs/RFCS/20220628_invisible_index.md.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release justification: low risk to the existing functionality; this commit just
adds notices.

Release note: none
wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Aug 24, 2022
Optimizer now supports creating invisible indexes after this
[PR](cockroachdb#85794). An important use case
for not visible indexes is to test the behaviour of dropping an index by marking
the index invisible. However, there are certain cases where users cannot expect
dropping an index to behave exactly the same as marking an index invisible. More
specifically, NotVisible indexes may still be used to police unique or foreign
key constraint check behind the scene. In those cases, dropping the index might
behave different from marking the index invisible. Prior to this commit, users
do not know about this without reading the documentation. This commit adds some
user-friendly notices when users are dropping or changing a not visible index
that might be helpful for constraint check.

There are two cases where we are giving this notice: 1. if this index is unique.
2. if this index is on child table and may help with FK check.

More details on how this decision was made in
docs/RFCS/20220628_invisible_index.md.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release justification: low risk to the existing functionality; this commit just
adds notices.

Release note: none
wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Aug 30, 2022
Optimizer now supports creating invisible indexes after this
[PR](cockroachdb#85794). An important use case
for not visible indexes is to test the behaviour of dropping an index by marking
the index invisible. However, there are certain cases where users cannot expect
dropping an index to behave exactly the same as marking an index invisible. More
specifically, NotVisible indexes may still be used to police unique or foreign
key constraint check behind the scene. In those cases, dropping the index might
behave different from marking the index invisible. Prior to this commit, users
do not know about this without reading the documentation. This commit adds some
user-friendly notices when users are dropping or changing a not visible index
that might be helpful for constraint check.

There are two cases where we are giving this notice: 1. if this index is unique.
2. if this index is on child table and may help with FK check.

More details on how this decision was made in
docs/RFCS/20220628_invisible_index.md.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release justification: low risk to the existing functionality; this commit just
adds notices.

Release note: none
wenyihu6 added a commit to wenyihu6/cockroach that referenced this pull request Aug 31, 2022
Optimizer now supports creating invisible indexes after this
[PR](cockroachdb#85794). An important use case
for not visible indexes is to test the behaviour of dropping an index by marking
the index invisible. However, there are certain cases where users cannot expect
dropping an index to behave exactly the same as marking an index invisible. More
specifically, NotVisible indexes may still be used to police unique or foreign
key constraint check behind the scene. In those cases, dropping the index might
behave different from marking the index invisible. Prior to this commit, users
do not know about this without reading the documentation. This commit adds some
user-friendly notices when users are dropping or changing a not visible index
that might be helpful for constraint check.

There are two cases where we are giving this notice: 1. if this index is unique.
2. if this index is on child table and may help with FK check.

More details on how this decision was made in
docs/RFCS/20220628_invisible_index.md.

Assists: cockroachdb#72576

See also: cockroachdb#85794

Release justification: low risk to the existing functionality; this commit just
adds notices.

Release note: none
craig bot pushed a commit that referenced this pull request Aug 31, 2022
…87158

85354: sql: notices for NotVisible Indexes r=wenyihu6 a=wenyihu6

Optimizer now supports creating invisible indexes after this
[PR](#85794). An important use case
for not visible indexes is to test the behaviour of dropping an index by marking
the index invisible. However, there are certain cases where users cannot expect
dropping an index to behave exactly the same as marking an index invisible. More
specifically, NotVisible indexes may still be used to police unique or foreign
key constraint check behind the scene. In those cases, dropping the index might
behave different from marking the index invisible. Prior to this commit, users
do not know about this without reading the documentation. This commit adds some
user-friendly notices when users are dropping or changing a not visible index
that might be helpful for constraint check.

There are two cases where we are giving this notice: 1. if this index is unique.
2. if this index is on child table and may help with FK check.

More details on how this decision was made in
docs/RFCS/20220628_invisible_index.md.

Assists: #72576

See also: #85794

Release justification: low risk to the existing functionality; this commit just
adds notices.

Release note: none

86592: kvserver: rework memory allocation in replicastats r=kvoli a=kvoli

This patch removes some unused fields within the replica stats object.
It also opts to allocate all the memory needed upfront for a replica
stats object for better cache locality and less GC overhead.

This patch also removes locality tracking for the other throughput trackers
to reduce per-replica memory footprint.

resolves #85112

Release justification: low risk, lowers memory footprint to avoid oom.
Release note: None

87024: sql: Prevent primary region being same as secondary region r=rafiss a=e-mbrown

fixes #86879

We found that the primary region could be assigned the same region as the secondary region. This commit adds an error to prevent that.

Release justification: Low risk high benefit change to existing functionality
Release note: None

87110: ui: fixes to high contention copy in insight workload pages r=ericharmeling a=ericharmeling

Previously, the High Contention insight type was labeled
"High Contention Time", and the waiting transactions list
was labeled in the incorrect tense. This commit fixes those
typos.

Release justification: bug fix
Release note: None

87135: build: remove newly-added node_modules/ trees in ui-maintainer-clean r=rickystewart a=sjbarag

A few recent features [1, 2] introduced new node_modules/ trees for
dependencies, but didn't update the ui-maintainer-clean Make target to
remove them. This allowed those directories to leak between TeamCity
builds with Docker user permissions, preventing a `yarn install` in
those packages from properly laying out a `node_modules/.bin` directory
for executables like `tsc`. Remove the recently-introduced
`node_modules/` directories as part of `make ui-maintainer-clean`, to
restore a clean state between jobs.

[1] d28c072 (ui: add eslint-plugin-crdb package with custom eslint rules, 2022-05-27)
[2] c58279d (ui: reintroduce end-to-end UI tests with cypress, 2022-08-12)

Release justification: Non-production code changes

87149: sql: clean up physical planning for system tenant r=yuzefovich a=yuzefovich

This commit audits a couple of methods around the health and version of
DistSQL nodes that are used only for the system tenant to make that more
explicit. Additionally, it unexports `NodeStatuses` map from the
planning context as well as removes some unnecessary short-circuiting
behavior around checking the node health and version (it was unnecessary
because we already short-circuit in
`checkInstanceHealthAndVersionSystem`).

Release justification: low-risk cleanup.

Release note: None

87153: ui: ux improvements on stmt details page r=maryliag a=maryliag

This commit adds a few improvements and bug fixes:

- Handles the case where we hit a
timeout on statement details, so it doesn't crash
anymore and you can still see the time picker to
be able to select a new time interval.

- Updates the error message, to
clarify it was a timeout error and increase the
timeout from 30s to 30m on the details endpoint.
Fixes #78979

- Updates the last error for statement
details with the proper value, which previously
was using the error for all statements endpoint,
instead of the specific for that fingerprint id.

- Adds a message when page takes longer to load.

- Uses a proper count formatting for
execution count.

Release justification: bug fixes and smaller improvements
Release note (ui change): Proper formatting of execution count
under Statement Details page.
Increase timeout for Statement Details page and shows
proper timeout error when it happens, no longer
crashing the page.

87155: github-post: allow for finding the test in a parent directory of the pkg r=srosenberg,rail a=rickystewart

In some cases the Bazel test runner "incorrectly" reports the package
path for tests. For example, we have [issues](#85376) where the name of
the test is reported as `pkg/.../package/package_test` rather than
`pkg/.../package` as we might expect. I suspect this is confusing
`github-post` when it tries to find tests in the `package_test`
directory rather than the `package` directory.

We address this by allowing `github-post` to search up the directory
tree for the test rather than expecting it to be in one particular
directory.

Also update a repro command to use `dev test` rather than
`make stressrace`.

Closes #85420.

Release justification: Non-production code changes
Release note: None

87156: ci: disable sharding in random syntax tests r=srosenberg a=rickystewart

The different shards were trampling each other's test.json.txt,
preventing failures from being reported accurately.

Release justification: Non-production code changes
Release note: None

87158: sql: clean up node dialer fields r=yuzefovich a=yuzefovich

This commit removes no longer used `nodeDialer` field (for SQL - KV
communication) as well as renames some of the similarly named fields to
`podNodeDialer` to indicate that its only a SQL - SQL dialer.

Release justification: low-risk cleanup.

Release note: None

Co-authored-by: wenyihu3 <wenyi.hu@cockroachlabs.com>
Co-authored-by: Austen McClernon <austen@cockroachlabs.com>
Co-authored-by: e-mbrown <ebsonari@gmail.com>
Co-authored-by: Eric Harmeling <eric.harmeling@cockroachlabs.com>
Co-authored-by: Sean Barag <barag@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: Marylia Gutierrez <marylia@cockroachlabs.com>
Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
@wenyihu6 wenyihu6 deleted the 4-add-optimizer branch September 1, 2022 20:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants