sql: primary key to allow virtual columns#73928
sql: primary key to allow virtual columns#73928craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
chengxiong-ruan
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/sql/logictest/testdata/logic_test/alter_primary_key, line 498 at r1 (raw file):
ALTER TABLE t ALTER PRIMARY KEY USING COLUMNS (y) statement error pgcode XX000 ...
This looks awkward to me on first sight. We used to return error early before we write schema change where a AssertionError is returned. Now since we remove the validation, user would see a trace for the error plus some descriptor information: http://github.com/cockroachdb/cockroach/blob/master/pkg/sql/table.go#L287-L287 Not sure why we wanted to assert in this situation, but I think it's legit to error out with trace since invalid descriptor is a big deal...
pkg/sql/logictest/testdata/logic_test/alter_primary_key, line 1334 at r1 (raw file):
b INT8 NOT NULL, k INT8 NOT NULL AS (a + b) VIRTUAL, CONSTRAINT t_pkey PRIMARY KEY (a ASC),
One possible small improvement for alter primary key. When we alter primary key, we create a unique index to match the old primary key. However, it doest that no matter if there's already a same unique key exists or not. We make add some deduplication here. Though I don't think this is a must have. Let me know if it make sense to create an issue for it.
Code quote:
CONSTRAINT t_pkey PRIMARY KEY (a ASC),
UNIQUE INDEX t_b_k_key (b ASC, k ASC),
UNIQUE INDEX t_k_key (k ASC),
UNIQUE INDEX t_a_key (a ASC),
INDEX t_idx_b_k (b ASC, k ASC),
ajwerner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @postamar)
pkg/sql/logictest/testdata/logic_test/alter_primary_key, line 498 at r1 (raw file):
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
This looks awkward to me on first sight. We used to return error early before we write schema change where a
AssertionErroris returned. Now since we remove the validation, user would see a trace for the error plus some descriptor information: http://github.com/cockroachdb/cockroach/blob/master/pkg/sql/table.go#L287-L287 Not sure why we wanted to assert in this situation, but I think it's legit to error out with trace since invalid descriptor is a big deal...
I don't follow how or why we broke this. We should definitely retain this behavior. It's rather important.
chengxiong-ruan
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/sql/logictest/testdata/logic_test/alter_primary_key, line 498 at r1 (raw file):
Previously, ajwerner wrote…
I don't follow how or why we broke this. We should definitely retain this behavior. It's rather important.
it used to return one single error message, because we just return the validation error message from AllocateIDs.
now we remove the validation in AllocateIDs, we validate when writing schema change, and it error with a more severe type AssertionError and added table descriptor data in the error message as well. so it would look like the very loud error message below. While I can match the whole block of strings with wildcard, logic test told me that serious error with code "XX000" occurred; if expected, must use 'error pgcode XX000 ...'. Let me know if I did any better explaining this...
pq: internal error: table descriptor is not valid
tabledesc.Mutable: {ID: 73, Version: 2, IsUncommitted: true, ModificationTime: "0,0", ParentID: 54, ParentSchemaID: 55, State: PUBLIC, NextColumnID: 3, Columns: [{ID: 1, TypeID: 20, Null: false}, {ID: 2, TypeID: 20, Null: false}], NextFamilyID: 1, Families: [{ID: 0, Columns: [1, 2]}], MutationJobs: [{MutationID: 1, JobID: 719608795391819777}], Mutations: [{MutationID: 1, Direction: ADD, State: DELETE_ONLY, Index: {ID: 2, Unique: true, KeyColumns: [{ID: 2, Dir: ASC}], StoreColumns: [1], State: ADD, MutationID: 1}}, {MutationID: 1, Direction: ADD, State: DELETE_ONLY, Index: {ID: 3, Unique: true, KeyColumns: [{ID: 1, Dir: ASC}], KeySuffixColumns: [2], State: ADD, MutationID: 1}}, {MutationID: 1, Direction: ADD, State: DELETE_ONLY, PrimaryKeySwap: {OldPrimaryIndexID: 1, OldIndexes: [], NewPrimaryIndexID: 2, NewIndexes: []}}, {MutationID: 1, Direction: ADD, State: DELETE_ONLY, Index: {ID: 4, Unique: false, KeyColumns: [{ID: 2, Dir: ASC}], KeySuffixColumns: [1], State: ADD, MutationID: 1}}], PrimaryIndex: 1, NextIndexID: 5, Indexes: [{ID: 1, Unique: true, KeyColumns: [{ID: 1, Dir: ASC}], StoreColumns: [2]}]}
: relation "t" (73): unimplemented: cannot perform other schema changes in the same transaction as a primary key change
HINT: You have encountered an unexpected error.
Please check the public issue tracker to check whether this problem is
already tracked. If you cannot find it there, please report the error
with details by creating a new issue.
If you would rather not post publicly, please contact us directly
using the support form.
We appreciate your feedback.
DETAIL: stack trace:
github.com/cockroachdb/cockroach/pkg/sql/table.go:287: writeTableDescToBatch()
github.com/cockroachdb/cockroach/pkg/sql/table.go:265: writeTableDesc()
github.com/cockroachdb/cockroach/pkg/sql/table.go:235: writeSchemaChange()
github.com/cockroachdb/cockroach/pkg/sql/create_index.go:686: startExec()
github.com/cockroachdb/cockroach/pkg/sql/plan.go:515: func2()
github.com/cockroachdb/cockroach/pkg/sql/walk.go:112: func1()
github.com/cockroachdb/cockroach/pkg/sql/walk.go:295: visitInternal()
github.com/cockroachdb/cockroach/pkg/sql/walk.go:79: visit()
github.com/cockroachdb/cockroach/pkg/sql/walk.go:43: walkPlan()
github.com/cockroachdb/cockroach/pkg/sql/plan.go:518: startExec()
github.com/cockroachdb/cockroach/pkg/sql/plan_node_to_row_source.go:146: Start()
github.com/cockroachdb/cockroach/pkg/sql/colexec/columnarizer.go:157: Init()
github.com/cockroachdb/cockroach/pkg/sql/colexec/invariants_checker.go:68: Init()
github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:87: Init()
github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:235: func1()
github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:91: CatchVectorizedRuntimeError()
github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:234: init()
github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:275: Run()
github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:259: Run()
github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:553: Run()
github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1368: PlanAndRun()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1385: execWithDistSQLEngine()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1062: dispatchToExecutionEngine()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:700: execStmtInOpenState()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:129: execStmt()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1748: func1()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1750: execCmd()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1672: run()
github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:679: ServeConn()
github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:666: func1()
chengxiong-ruan
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/sql/logictest/testdata/logic_test/alter_primary_key, line 498 at r1 (raw file):
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
it used to return one single error message, because we just return the validation error message from
AllocateIDs.now we remove the validation in
AllocateIDs, we validate when writing schema change, and it error with a more severe typeAssertionErrorand added table descriptor data in the error message as well. so it would look like the very loud error message below. While I can match the whole block of strings with wildcard, logic test told me thatserious error with code "XX000" occurred; if expected, must use 'error pgcode XX000 ...'. Let me know if I did any better explaining this...pq: internal error: table descriptor is not valid tabledesc.Mutable: {ID: 73, Version: 2, IsUncommitted: true, ModificationTime: "0,0", ParentID: 54, ParentSchemaID: 55, State: PUBLIC, NextColumnID: 3, Columns: [{ID: 1, TypeID: 20, Null: false}, {ID: 2, TypeID: 20, Null: false}], NextFamilyID: 1, Families: [{ID: 0, Columns: [1, 2]}], MutationJobs: [{MutationID: 1, JobID: 719608795391819777}], Mutations: [{MutationID: 1, Direction: ADD, State: DELETE_ONLY, Index: {ID: 2, Unique: true, KeyColumns: [{ID: 2, Dir: ASC}], StoreColumns: [1], State: ADD, MutationID: 1}}, {MutationID: 1, Direction: ADD, State: DELETE_ONLY, Index: {ID: 3, Unique: true, KeyColumns: [{ID: 1, Dir: ASC}], KeySuffixColumns: [2], State: ADD, MutationID: 1}}, {MutationID: 1, Direction: ADD, State: DELETE_ONLY, PrimaryKeySwap: {OldPrimaryIndexID: 1, OldIndexes: [], NewPrimaryIndexID: 2, NewIndexes: []}}, {MutationID: 1, Direction: ADD, State: DELETE_ONLY, Index: {ID: 4, Unique: false, KeyColumns: [{ID: 2, Dir: ASC}], KeySuffixColumns: [1], State: ADD, MutationID: 1}}], PrimaryIndex: 1, NextIndexID: 5, Indexes: [{ID: 1, Unique: true, KeyColumns: [{ID: 1, Dir: ASC}], StoreColumns: [2]}]} : relation "t" (73): unimplemented: cannot perform other schema changes in the same transaction as a primary key change HINT: You have encountered an unexpected error. Please check the public issue tracker to check whether this problem is already tracked. If you cannot find it there, please report the error with details by creating a new issue. If you would rather not post publicly, please contact us directly using the support form. We appreciate your feedback. DETAIL: stack trace: github.com/cockroachdb/cockroach/pkg/sql/table.go:287: writeTableDescToBatch() github.com/cockroachdb/cockroach/pkg/sql/table.go:265: writeTableDesc() github.com/cockroachdb/cockroach/pkg/sql/table.go:235: writeSchemaChange() github.com/cockroachdb/cockroach/pkg/sql/create_index.go:686: startExec() github.com/cockroachdb/cockroach/pkg/sql/plan.go:515: func2() github.com/cockroachdb/cockroach/pkg/sql/walk.go:112: func1() github.com/cockroachdb/cockroach/pkg/sql/walk.go:295: visitInternal() github.com/cockroachdb/cockroach/pkg/sql/walk.go:79: visit() github.com/cockroachdb/cockroach/pkg/sql/walk.go:43: walkPlan() github.com/cockroachdb/cockroach/pkg/sql/plan.go:518: startExec() github.com/cockroachdb/cockroach/pkg/sql/plan_node_to_row_source.go:146: Start() github.com/cockroachdb/cockroach/pkg/sql/colexec/columnarizer.go:157: Init() github.com/cockroachdb/cockroach/pkg/sql/colexec/invariants_checker.go:68: Init() github.com/cockroachdb/cockroach/pkg/sql/colflow/stats.go:87: Init() github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:235: func1() github.com/cockroachdb/cockroach/pkg/sql/colexecerror/error.go:91: CatchVectorizedRuntimeError() github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:234: init() github.com/cockroachdb/cockroach/pkg/sql/colflow/flow_coordinator.go:275: Run() github.com/cockroachdb/cockroach/pkg/sql/colflow/vectorized_flow.go:259: Run() github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:553: Run() github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1368: PlanAndRun() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1385: execWithDistSQLEngine() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:1062: dispatchToExecutionEngine() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:700: execStmtInOpenState() github.com/cockroachdb/cockroach/pkg/sql/conn_executor_exec.go:129: execStmt() github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1748: func1() github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1750: execCmd() github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:1672: run() github.com/cockroachdb/cockroach/pkg/sql/conn_executor.go:679: ServeConn() github.com/cockroachdb/cockroach/pkg/sql/pgwire/conn.go:666: func1()
My bad, just realized that I can match with something like error pgcode XX000 (the error message regex here)
ajwerner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @postamar)
pkg/sql/logictest/testdata/logic_test/alter_primary_key, line 498 at r1 (raw file):
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
My bad, just realized that I can match with something like
error pgcode XX000 (the error message regex here)
The validation code was the earliest attempt to not just validate structure but also input. That turned out to be a rather bad move. As a principle, any time an internal error bubbles up to the client, it should be treated as a bug. I think the right answer here is to detect the invalid condition earlier and explicitly before attempting to modify the descriptor. Please add code to do that so that the error does not change. The end result will be cleaner and more explicit.
cfd35cd to
f601eae
Compare
chengxiong-ruan
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/sql/logictest/testdata/logic_test/alter_primary_key, line 498 at r1 (raw file):
Previously, ajwerner wrote…
The validation code was the earliest attempt to not just validate structure but also input. That turned out to be a rather bad move. As a principle, any time an internal error bubbles up to the client, it should be treated as a bug. I think the right answer here is to detect the invalid condition earlier and explicitly before attempting to modify the descriptor. Please add code to do that so that the error does not change. The end result will be cleaner and more explicit.
As discussed offline. made a non-validating version of AllocateIDs
abd571f to
c40166d
Compare
ajwerner
left a comment
There was a problem hiding this comment.
nice
Reviewed 3 of 5 files at r2, 2 of 2 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, and @postamar)
pkg/sql/alter_primary_key.go, line 369 at r3 (raw file):
colIDs := idx.CollectKeyColumnIDs() if !idx.Primary() { colIDs.UnionWith(idx.CollectSecondaryStoredColumnIDs())
I'll be honest, I don't understand this line
pkg/sql/alter_primary_key.go, line 390 at r3 (raw file):
} if !idx.Primary() {
consider using commentary to describe this logic and its motivation.
pkg/sql/alter_primary_key.go, line 392 at r3 (raw file):
if !idx.Primary() { newPrimaryKeyColIDs := catalog.MakeTableColSet(newPrimaryIndexDesc.KeyColumnIDs...) for _, colID := range idx.CollectKeySuffixColumnIDs().Ordered() {
I think the idiom here is:
for i := 0; i < idx.NumKeySuffixColumns(); i++ {
colID := idx.GetKeySuffixColumnID(i)pkg/sql/catalog/tabledesc/validate.go, line 865 at r3 (raw file):
// Ensure that indexes do not STORE virtual columns as suffix columns unless // they are primary key columns or future primary key columns (when `ALTER // PRIMARY KEY` is executed and a primary key mutation exists)
nit: period.
pkg/sql/catalog/tabledesc/validate.go, line 872 at r3 (raw file):
for _, colID := range idx.IndexDesc().KeySuffixColumnIDs { if col := columnsByID[colID]; col != nil && col.IsVirtual() { return errors.Newf("index %q cannot store virtual column %d", idx.GetName(), col)
Why is it okay to skip columns which do not exist?
Code quote:
if _, ok := columnsByID[colID]; !ok {
continue
}pkg/sql/catalog/tabledesc/validate.go, line 885 at r3 (raw file):
isCurPKCol := curPKColIDs.Contains(colID) isNewPKCol := idx.Adding() && newPKColIDs.Contains(colID) if !curPKColIDs.Contains(colID) && !isCurPKCol && !isNewPKCol && col.IsVirtual() {
Something smells here. Consider the case where the index is adding and is not in the new primary key? what about:
if !col.IsVirtual() {
continue
}
if idx.Adding() && newPKColIDs.Contains(colID) ||
!idx.Adding() && curPKColIDs.Contains(colID) {
errors.Newf("index %q cannot have virtual column %q in its key suffix columns which is not a key column of the corresponding primary key", idx.GetName(), col.GetName())
}
Code quote:
isCurPKCol := curPKColIDs.Contains(colID)
isNewPKCol := idx.Adding() && newPKColIDs.Contains(colID)
if !curPKColIDs.Contains(colID) && !isCurPKCol && !isNewPKCol && col.IsVirtual() {pkg/sql/catalog/tabledesc/validate_test.go, line 1048 at r3 (raw file):
}, }}, {`index "sec" cannot store virtual column "c3"`,
Can you add some cases for your primary key swap situations and virtual columns?
pkg/sql/logictest/testdata/logic_test/alter_primary_key, line 1334 at r1 (raw file):
Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
One possible small improvement for alter primary key. When we alter primary key, we create a unique index to match the old primary key. However, it doest that no matter if there's already a same unique key exists or not. We make add some deduplication here. Though I don't think this is a must have. Let me know if it make sense to create an issue for it.
Yeah, it's silly. We can improve that separately. Start with an issue.
c40166d to
39c807a
Compare
chengxiong-ruan
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @chengxiong-ruan, and @postamar)
pkg/sql/alter_primary_key.go, line 369 at r3 (raw file):
Previously, ajwerner wrote…
I'll be honest, I don't understand this line
Look like this line of code helps in situations like the following
create table t(
a int primary key,
b int not null,
c int,
index t_idx_c (c) storing (b)
);
-- index t_idx_c stores column b
-- then we alter primary key to use column b
alter table t alter primary key using columns (b);
-- index t_idx_c is not rewritten because it stores bturning on kv tracing helped figuring this out... thank you very much.
pkg/sql/alter_primary_key.go, line 390 at r3 (raw file):
Previously, ajwerner wrote…
consider using commentary to describe this logic and its motivation.
added comments
pkg/sql/alter_primary_key.go, line 392 at r3 (raw file):
Previously, ajwerner wrote…
I think the idiom here is:
for i := 0; i < idx.NumKeySuffixColumns(); i++ { colID := idx.GetKeySuffixColumnID(i)
good point, done
pkg/sql/catalog/tabledesc/validate.go, line 865 at r3 (raw file):
Previously, ajwerner wrote…
nit: period.
added period.
pkg/sql/catalog/tabledesc/validate.go, line 872 at r3 (raw file):
Previously, ajwerner wrote…
Why is it okay to skip columns which do not exist?
it is not ok, added err
Code quote:
!curPKColIDs.Contains(colID) && pkg/sql/catalog/tabledesc/validate.go, line 885 at r3 (raw file):
Previously, ajwerner wrote…
Something smells here. Consider the case where the index is adding and is not in the new primary key? what about:
if !col.IsVirtual() { continue } if idx.Adding() && newPKColIDs.Contains(colID) || !idx.Adding() && curPKColIDs.Contains(colID) { errors.Newf("index %q cannot have virtual column %q in its key suffix columns which is not a key column of the corresponding primary key", idx.GetName(), col.GetName()) }
good point. I split the logic into 2 situations len(newPKColIDs)==0 andlen(newPKColIDs) > 0 though we can merge them, I think it's easier to read and cause less brain power.
pkg/sql/catalog/tabledesc/validate_test.go, line 1048 at r3 (raw file):
Previously, ajwerner wrote…
Can you add some cases for your primary key swap situations and virtual columns?
added more test cases and make the test to accept success cases.
|
pkg/sql/alter_primary_key.go, line 369 at r3 (raw file): Previously, chengxiong-ruan (Chengxiong Ruan) wrote…
with kv tracing, you may noticed that column that |
|
pkg/sql/logictest/testdata/logic_test/alter_primary_key, line 1334 at r1 (raw file): Previously, ajwerner wrote…
created issue #74116 |
ajwerner
left a comment
There was a problem hiding this comment.
Reviewed 2 of 3 files at r4.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @postamar)
pkg/sql/alter_primary_key.go, line 393 at r4 (raw file):
//column, we should rewrite the index as well because we don't allow //non-primary key virtual column in secondary index's suffix columns, and //this would be violated if we don't rewrite the index.
nit: space between // and the text
Code quote:
//If there is any virtual column of the secondary index not a primary key
//column, we should rewrite the index as well because we don't allow
//non-primary key virtual column in secondary index's suffix columns, and
//this would be violated if we don't rewrite the index.To use virtual column as shard column for hash index, we need to first allow primary key to accept virtual columns. This pr loosen the constraint on primary key, allow virtual columns in suffix cols of secondary index as long as they are in primary key. Also added extra logic to make sure secondary index is rewritten if it contains any virtual column not exists in new primary when altering primary key. Release not (sql change): Virtual columns are now allowed in primary keys.
39c807a to
b61644c
Compare
|
TFTR! bors r+ |
|
Build succeeded: |
To use virtual column as shard column for hash index, we need to
first allow primary key to accept virtual columns. This pr loosen
the constraint on primary key, allow virtual columns in suffix cols
of secondary index as long as they are in primary key. Also added
extra logic to make sure secondary index is rewritten if it contains
any virtual column not exists in new primary when altering primary
key.
Release not (sql change): Virtual columns are now allowed in primary
keys.