sql: use inaccessible columns for expression indexes#66620
Conversation
RaduBerinde
left a comment
There was a problem hiding this comment.
I was playing around with the change but something isn't working:
root@127.67.216.62:26257/defaultdb> set experimental_enable_expression_indexes = true;
SET
Time: 0ms total (execution 0ms / network 0ms)
root@127.67.216.62:26257/defaultdb> create table t (a int, b int, index((a+b)));
ERROR: unimplemented: only simple columns are supported as index elements
SQLSTATE: 0A000
HINT: You have attempted to use a feature that is not yet implemented.
See: https://go.crdb.dev/issue-v/9682/v21.2
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner)
pkg/sql/opt_catalog.go, line 678 at r2 (raw file):
if col.IsInaccessible() { visibility = cat.Inaccessible }
[nit] maybe swap the order and use else if
pkg/sql/logictest/testdata/logic_test/expression_index, line 59 at r4 (raw file):
CREATE INDEX err ON t ((comp + 10)) statement error type of index element NULL is ambiguous.*\nHINT: consider adding a type cast to the expression
We should add at least one test with one of these errors when the index is defined directly in CREATE TABLE.
pkg/sql/logictest/testdata/logic_test/expression_index, line 75 at r4 (raw file):
statement error index element a \+ b of type int is not allowed as the last column in an inverted index.*\nHINT: see the documentation for more information about inverted indexes CREATE INVERTED INDEX err ON t (a, (a + b));
We should add a test with a tuple. ON (a, ((a+b))) or ON (a, (row (a,b))) might do it
RaduBerinde
left a comment
There was a problem hiding this comment.
Ah, maybe we don't yet support them in CREATE TABLE statements?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner)
fd68a86 to
db67862
Compare
mgartner
left a comment
There was a problem hiding this comment.
Ah, maybe we don't yet support them in CREATE TABLE statements?
Correct. I'll add that in #65703. I wanted to bring the CREATE INDEX behavior up to spec before reworking that PR.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde)
pkg/sql/opt_catalog.go, line 678 at r2 (raw file):
Previously, RaduBerinde wrote…
[nit] maybe swap the order and use
else if
Done.
pkg/sql/logictest/testdata/logic_test/expression_index, line 59 at r4 (raw file):
Previously, RaduBerinde wrote…
We should add at least one test with one of these errors when the index is defined directly in CREATE TABLE.
Absolutely. Will do in #65703.
pkg/sql/logictest/testdata/logic_test/expression_index, line 75 at r4 (raw file):
Previously, RaduBerinde wrote…
We should add a test with a tuple.
ON (a, ((a+b)))orON (a, (row (a,b)))might do it
Done.
RaduBerinde
left a comment
There was a problem hiding this comment.
Reviewed 9 of 15 files at r1, 2 of 22 files at r5, 10 of 10 files at r6, 8 of 10 files at r7, 5 of 5 files at r8.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
pkg/sql/exec_util.go, line 369 at r8 (raw file):
var experimentalExpressionIndexesMode = settings.RegisterBoolSetting( "sql.defaults.experimental_expression_indexes.enabled",
We should add the old setting to retiredSettings in settings package.
db67862 to
d897347
Compare
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)
pkg/sql/exec_util.go, line 369 at r8 (raw file):
Previously, RaduBerinde wrote…
We should add the old setting to
retiredSettingsinsettingspackage.
Done. Looks like we forgot to do this for other retired experimental settings like the settings for partial indexes and partitioned inverted index 😬
d897347 to
42914ae
Compare
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 9 of 15 files at r1, 2 of 22 files at r5, 24 of 24 files at r9, 11 of 11 files at r10, 11 of 11 files at r11, 6 of 6 files at r12.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
pkg/sql/create_index.go, line 396 at r12 (raw file):
typ.Name(), ), "see the documentation for more information about inverted indexes",
is there a specific docs page we can link to?
pkg/sql/show_create.go, line 87 at r10 (raw file):
for i, col := range desc.PublicColumns() { // Inaccessible columns are not displayed in SHOW CREATE TABLE. if col.IsInaccessible() {
Is there a way to make these columns not get returned by desc.PublicColumns()? Or is that more trouble than it's worth?
pkg/sql/logictest/testdata/logic_test/pg_catalog, line 467 at r11 (raw file):
61 t6 2332901747 0 0 1546506610 2631952481 0 0 4179599057 primary 2332901747 0 0 1546506610 2631952481 0 0 4179599058 t6_crdb_internal_idx_expr_idx 2332901747 0 0 1546506610 2631952481 0 0
Are you planning to remove crdb_internal from the index name? Seems like it adds a lot of unnecessary clunkiness.
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @rytaft)
pkg/sql/create_index.go, line 396 at r12 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
is there a specific docs page we can link to?
Good idea. Added the URL to these errors messages in a new commit.
pkg/sql/show_create.go, line 87 at r10 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Is there a way to make these columns not get returned by
desc.PublicColumns()? Or is that more trouble than it's worth?
Great question. You might remember that I tried to do this when I originally attempted to create inaccessible columns about a month ago. It required some unsavory changes to the tabledesc.newColumnCache.
It seems that "public" designates where a column sits on the mutation axis: public, writable, deletable, not the visibility axis: visible, hidden, inaccessible. The documentation for the Column interface's Public method states that a public column is any "active, readable column in the table descriptor". I think the name "public" is causing confusion for us. Maybe "active" is a better description. I'm happy to make that change if we can get consensus on a more clear name.
For now, I've added an AccessibleColumns function that returns public, inaccessible=false columns. This cleans up a bit of code.
pkg/sql/logictest/testdata/logic_test/pg_catalog, line 467 at r11 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Are you planning to remove
crdb_internalfrom the index name? Seems like it adds a lot of unnecessary clunkiness.
Yes. I'm planning on following Postgres's naming convention for anonymous expression indexes in a future PR:
marcus=# CREATE TABLE t (a INT, b INT);
CREATE TABLE
marcus=# CREATE INDEX ON t ((a + b));
CREATE INDEX
marcus=# CREATE INDEX ON t ((a + b), (b + 100));
CREATE INDEX
marcus=# \d t
Table "public.t"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
a | integer | | |
b | integer | | |
Indexes:
"t_expr_expr1_idx" btree ((a + b), (b + 100))
"t_expr_idx" btree ((a + b))
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 5 of 5 files at r13, 6 of 6 files at r14.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner)
pkg/sql/show_create.go, line 87 at r10 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Great question. You might remember that I tried to do this when I originally attempted to create inaccessible columns about a month ago. It required some unsavory changes to the
tabledesc.newColumnCache.It seems that "public" designates where a column sits on the mutation axis: public, writable, deletable, not the visibility axis: visible, hidden, inaccessible. The documentation for the
Columninterface'sPublicmethod states that a public column is any "active, readable column in the table descriptor". I think the name "public" is causing confusion for us. Maybe "active" is a better description. I'm happy to make that change if we can get consensus on a more clear name.For now, I've added an
AccessibleColumnsfunction that returns public, inaccessible=false columns. This cleans up a bit of code.
Nice!
pkg/sql/catalog/tabledesc/table_desc.go, line 362 at r14 (raw file):
// VisibleColumns returns a slice of Column interfaces containing the // table's visible columns, in the canonical order. Visible columns are public // columns with Hidden=false.
Does this mean inaccessible columns would be included here? Is that desirable?
pkg/sql/logictest/testdata/logic_test/pg_catalog, line 467 at r11 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Yes. I'm planning on following Postgres's naming convention for anonymous expression indexes in a future PR:
marcus=# CREATE TABLE t (a INT, b INT); CREATE TABLE marcus=# CREATE INDEX ON t ((a + b)); CREATE INDEX marcus=# CREATE INDEX ON t ((a + b), (b + 100)); CREATE INDEX marcus=# \d t Table "public.t" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- a | integer | | | b | integer | | | Indexes: "t_expr_expr1_idx" btree ((a + b), (b + 100)) "t_expr_idx" btree ((a + b))
Sounds good
|
pkg/sql/create_index.go, line 396 at r14 (raw file):
nit: use docs.URL snoop otan out |
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @rytaft)
pkg/sql/create_index.go, line 396 at r14 (raw file):
Previously, otan (Oliver Tan) wrote…
nit: use docs.URL
snoop otan out
Brilliant. Thanks!
pkg/sql/catalog/tabledesc/table_desc.go, line 362 at r14 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Does this mean inaccessible columns would be included here? Is that desirable?
Good point. I don't think it is desirable. The only place this function is used is in the context of IMPORT statements, as far as I can tell. In those cases, it is used to determine which columns to write to. Inaccessible columns are all computed columns (at least for now), so it is invalid to write to them directly. I've updated the function so that it only returns hidden=false and inaccessible=false columns.
Following these code paths allowed to me to discover a bug whenIMPORT-ing into a table with virtual computed columns: #66694. I'm glad you drew our attention to this!
This commit replaces the wordy references to "expression-based indexes" with "expression indexes", matching the related RFC. Release note: None
This commit adds the `Inaccessible` boolean field to column descriptors. An inaccessible column does not appear in star expansion and cannot be referenced in queries. It cannot be viewed when inspecting a table via SHOW CREATE TABLE and is not shown as an attribute of its table in pg_catalog.pg_attribute. A column cannot be both hidden and inaccessible. Inaccessible columns will be used for implementing expression indexes. Release note: None
Expression indexes (a work-in-progress) now use inaccessible virtual computed columns to represent expressions. This commit also makes some expression index related errors more clear. Release note: None
This commit improves error messages caused by mistyped expressions in expression indexes. Release note: None
9873fc4 to
edcff6b
Compare
|
When I get the build to pass, I'll go ahead and merge this for now. I will address any late comments in future PRs—there's still a lot of little details I'll continue to work on to get expression indexes working as expected. |
This commit adds the inverted index documentation URL to errors related to inverted indexes. Previously, the error message directed the user to the documentation without a URL, putting the burden on them to track down the correct webpage. Release note: None
The `TableDescriptor` interface now includes an `AccessibleColumns` method that returns all public column with `Inaccessible=false`. This method simplifies code that needs to access only accessible columns. Release note: None
edcff6b to
dd3019c
Compare
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 33 of 33 files at r15, 11 of 11 files at r16, 11 of 11 files at r17, 6 of 6 files at r18, 5 of 5 files at r19, 5 of 6 files at r20, 7 of 7 files at r21, 6 of 6 files at r22.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
pkg/sql/catalog/tabledesc/table_desc.go, line 362 at r14 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Good point. I don't think it is desirable. The only place this function is used is in the context of
IMPORTstatements, as far as I can tell. In those cases, it is used to determine which columns to write to. Inaccessible columns are all computed columns (at least for now), so it is invalid to write to them directly. I've updated the function so that it only returns hidden=false and inaccessible=false columns.Following these code paths allowed to me to discover a bug when
IMPORT-ing into a table with virtual computed columns: #66694. I'm glad you drew our attention to this!
Nice!
|
TFTRs! bors r+ |
|
bors r+ |
|
Already running a review |
|
Build succeeded: |
sql: rename expression-based indexes to expression indexes
This commit replaces the wordy references to "expression-based indexes"
with "expression indexes", matching the related RFC.
Release note: None
sql: add inaccessible columns
This commit adds the
Inaccessibleboolean field to column descriptors.An inaccessible column does not appear in star expansion and cannot be
referenced in queries. It cannot be viewed when inspecting a table via
SHOW CREATE TABLE and is not shown as an attribute of its table in
pg_catalog.pg_attribute. A column cannot be both hidden and
inaccessible.
Inaccessible columns will be used for implementing expression indexes.
Release note: None
sql: use inaccessible columns for expression indexes
Expression indexes (a work-in-progress) now use inaccessible virtual
computed columns to represent expressions. This commit also makes some
expression index related errors more clear.
Release note: None
sql: improve type-related expression index errors
This commit improves error messages caused by mistyped expressions in
expression indexes.
Release note: None
sql: add docs URL to inverted index errors
This commit adds the inverted index documentation URL to errors related
to inverted indexes. Previously, the error message directed the user to
the documentation without a URL, putting the burden on them to track
down the correct webpage.
Release note: None
sql: add AccessibleColumns method to TableDescriptor interface
The
TableDescriptorinterface now includes anAccessibleColumnsmethod that returns all public column with
Inaccessible=false. Thismethod simplifies code that needs to access only accessible columns.
Release note: None