catalog,tabledesc: add & adopt Constraint subtypes#92289
catalog,tabledesc: add & adopt Constraint subtypes#92289craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
3e44552 to
6d6ba24
Compare
|
This is ready for review. It's a big one, which I apologize for. It's hard to break down this change into smaller parts. I'll add annotations to help the reviewing process. I advise reviewers to take a look at the interface definitions in |
| for k := range info { | ||
| ckBuilder.MarkNameInUse(k) | ||
| for _, c := range n.tableDesc.AllConstraints() { | ||
| ckBuilder.MarkNameInUse(c.GetName()) |
There was a problem hiding this comment.
This is one of these instances where we're more careful to not re-use a name. Things can get quite confusing otherwise when rolling back a schema change involving a name collision.
| "constraint %q in the middle of being added, try again later", t.Constraint) | ||
| case descpb.ConstraintValidity_Dropping: | ||
| return pgerror.Newf(pgcode.ObjectNotInPrerequisiteState, | ||
| "constraint %q in the middle of being dropped", t.Constraint) |
There was a problem hiding this comment.
This is one of these instances where the adding and dropping states are handled more uniformly, to no adverse effect from what I can tell.
| details, ok := info[string(t.Constraint)] | ||
| if !ok { | ||
| constraint, _ := n.tableDesc.FindConstraintWithName(string(t.Constraint)) | ||
| if constraint == nil { |
There was a problem hiding this comment.
So many calls to GetConstraintInfo were really about finding a constraint by name. Though, notably, GetConstraintInfo would not return the full set of constraints (no dropping unique indexes for instance). In this case here I don't think that mattered but I'm willing to bet this caused some subtle undetected bugs at some other GetConstraintInfo call sites.
| for _, uwoi := range tableDesc.EnforcedUniqueConstraintsWithoutIndex() { | ||
| if uwoi.Dropped() || !uwoi.CollectKeyColumnIDs().Contains(colToDrop.GetID()) { | ||
| continue | ||
| } |
There was a problem hiding this comment.
I think (but I'm not sure) that omitting dropped unique-without-indexes was a bug.
| tree.DropRestrict, | ||
| remainingIndexes) | ||
| withSearchForReplacement, | ||
| ) |
There was a problem hiding this comment.
See function definition for details but we no longer pass slices of descpb.UniqueConstraint interfaces, because we got rid of that interface.
|
|
||
| // GetName returns the constraint name. | ||
| GetName() string | ||
| } |
There was a problem hiding this comment.
This has been superseded by catalog.UniqueConstraint, in effect.
| // Only validate column families, constraints, and indexes if this is | ||
| // actually a table, not if it's just a view. | ||
| if desc.IsPhysicalTable() { | ||
| desc.validateConstraintNamesAndIDs(vea) |
There was a problem hiding this comment.
Moving this further ahead was necessary because we got rid of CheckUniqueConstraints, but the price to pay was lots of noise in validate_test.go.
| if index.IsHelpfulOriginIndex(fk.OriginColumnIDs) { | ||
| return notices | ||
| } | ||
| if index.IsPartial() || index.NumKeyColumns() == 0 { |
There was a problem hiding this comment.
index.NumKeyColumns() == 0 can legitimately happen for the primary index of a virtual table. Of course, such tables don't have primary indexes at all, so I'm not sure how much I like the fact that we pretend that we do, but this is not the time or place to deal with this.
| alter table const_tbl drop constraint "steve"; | ||
| alter table const_tbl add constraint "steve" CHECK (a > 100); | ||
| COMMIT; | ||
|
|
There was a problem hiding this comment.
I just don't think that this should succeed. Use another name.
pkg/sql/catalog/table_elements.go
Outdated
| // AsCheck returns the corresponding CheckConstraint if the | ||
| // mutation is on a check constraint, nil otherwise. |
There was a problem hiding this comment.
nit: Is the comment stale? Why do we say "...if the mutation is on a check constraint..."?
6d6ba24 to
8933bd8
Compare
There was a problem hiding this comment.
I suggest we always have a line like var _ catalog.ConstraintProvider = constraintBase{} to enforce implementation of an interface in compile time. Same is true for all other structs that we claim to have implemented some interface.
There was a problem hiding this comment.
Okay, I see you are doing this for the specific constraint type (like checkConstraint). Great!
There was a problem hiding this comment.
Thanks but yeah it's a good point. IIRC I try to have the assertions on the leaves of the type hierarchy.
There was a problem hiding this comment.
should we panic with assertionFailedf here instead of returning 0? Or it's the caller's responsibility to check whether return is 0 and handles it appropriately if so?
There was a problem hiding this comment.
Your comment made me look what we actually do in catalog.Index's implementation and I'd always assumed we did bound checks but it turns out that we don't.
I'll do what catalog.Index does and remove these checks. I remember that back when that was implemented, this matter had been discussed. Evidently the consensus that came out of that was that we preferred triggering a runtime panic when going out of bounds. I'm OK with that.
There was a problem hiding this comment.
Unfortunately, the legacy code did a bad job about the state machine of constraints while it's still on the mutation slice. I think it's better to check the constraint mutation's validity instead of the MutationDescriptor_State, so I suggest we instead just do return c.desc.validity != dropping. But this again iterates one of my comments above that maybe we can get rid of this method and just expose GetConstraintValidity.
There was a problem hiding this comment.
To illustrate what I mean above a bit more: consider adding a check constraint. When we enqueue the check constraint onto the mutation slice, its MutationDescriptor_State will be (unfortunately) DELETE_ONLY but its validity is VALIDATING, which should be considered as Enforced but this method now will incorrectly returns false.
There was a problem hiding this comment.
A constraint's validity and whether it's enforced are two different things:
- the validity concerns the data that was present before the constraint is added;
- a constraint being enforced or not concerns the data that appears after the constraint is added.
There was a problem hiding this comment.
In go, an interface is basically a pair of pointers, so it makes sense for the underlying type of what the interface is wrapping to also be a pointer type.
I'm not a huge fan of go's pointer magic in general, and hardly an expert.
There was a problem hiding this comment.
nit: the naming seems to suggest all desc.Checks are enforced checks and all mutations.checks are not enforced checks, which isn't the case, right?
A check on the mutation with VALIDATING validity is considered enforced, right?
There was a problem hiding this comment.
why do we check uwi.NumKeyColumns() additionally? In what scenario will we have a non-nil uwi but 0 key columns?
There was a problem hiding this comment.
Virtual tables. It's dumb, in those cases we should have GetPrimaryIndex() return nil, yet we don't. Yet more tech debt.
There was a problem hiding this comment.
I still feel like a comment block like this that summarizes all the constraints we put on this contrived table is helpful.
pkg/sql/catalog/descriptor.go
Outdated
There was a problem hiding this comment.
Why do you think those methods belong here? It seems to me more appropriately to define them just as function that takes in a (ck|fk|uwoi) and return some property of that input constraint. Do we need anything from the table to serve those questions?
There was a problem hiding this comment.
Good question. The table is required to map the column ID to a catalog.Column, and I really prefer using the latter instead of IDs whenever possible.
See the similar []Column-returning methods taking indexes. These can be quite powerful and removing this layer of indirection of the column ID makes for some more expressive code at the call sites. See informationSchemaConstraintColumnUsageTable` for an example. Arguably some of them are never called, but if they're there it's because they used to, and perhaps will be again.
Xiang-Gu
left a comment
There was a problem hiding this comment.
I looked at the changes made to the new constraint interfaces, their implementations, constructing constraintCache, mutationCache, and TableDescriptor interface. All are welcoming and pleasant changes.
The only concern I have the how we implement isEnforced. I'm convinced that it's convenient/helpful to have this method though.
These new interfaces help encapsulate the descriptor protobufs for table
constraints:
- CheckConstraint,
- ForeignKeyConstraint,
- UniqueWithoutIndexConstraint,
- UniqueWithIndexConstraint.
These are the leaves of the Constraint type tree, and Constraint is the
interface at the root of it. The changes in this commit mimic what was
done for Column & Index, which respectively wrap descpb.ColumnDescriptor
and descpb.IndexDescriptor.
These new constraint interfaces are adopted liberally throughout the
code base, but not completely: many references to descpb-protobufs still
need to be replaced.
This commit also removes the somewhat confusing concept of "active"
versus "inactive" constraint. Not only was it confusing to me, evidently
it was also confusing to other contributors considering how several mild
bugs and inconsistencies made their way into the code. This commit
replaces this concept with that of an "enforced" constraint:
- A constraint is enforced if it applies to data written to the table,
regardless of whether it has been validated on the data already in
the table prior to the constraint coming into existence.
The implementation is somewhat awkward for constraints in that the same
constraint descriptor can be featured twice inside
a descpb.TableDescriptor, in its active constraint slice (such as
Checks, etc.) and also in the Mutations slice. In these cases the
interface wraps the non-mutation constraint descriptor protobuf, with no
observed changes to the database's behavior.
This commit required alterations to the table descriptor's
post-deserialization changes. The change which assigns constraint IDs
to table descriptors which don't have themhad some subtle bugs which went
unnoticed until this commit forced a heavier reliance on these fields.
This commit also adds a new change which ensures that a check
constraint's ColumnIDs slice is populated based on the columns
referenced in its expression, replacing ancient code which predates the
concept of post-deserialization changes and which would compute this
lazily.
As a more general observation: it appears that the handling of adding
and dropping constraints following other schema changes was mildly
inconsistent and this commit tries to improve this. For instance, it was
possible to drop a column which had been set as NOT NULL in the same
transaction, but not do the same for any other kind of constraint. Also
it was possible to reuse the name of a dropping constraint, despite this
constraint still being enforced. The new behavior is more strict, yet
this should be barely noticeable by the user in most practical cases:
it's rare that one wants to add a constraint referencing a column and
also drop that column in the same transaction, for instance.
Fixes cockroachdb#91918.
Release note: None
3b1ecd9 to
98ec204
Compare
|
Thanks for taking a look. I've addressed or responded to your comments. |
Xiang-Gu
left a comment
There was a problem hiding this comment.
LGTM and thanks for doing this!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @postamar)
pkg/sql/catalog/table_elements.go line 425 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Dropping constraints are enforced. At least, that's the case of unique indexes. You can see this for yourself by setting a pausepoint on the schema changer execution, dropping a unique index, and INSERTing two identical values.
Thanks, that's a TIL!
pkg/sql/catalog/table_elements.go line 474 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
Eh, it's already in
catalog.Index.
Thanks, another TIL: we can define uniqueness on a column with a predicate (which makes it a partial unique constraint) with or without a backing index
pkg/sql/catalog/tabledesc/constraint.go line 130 at r2 (raw file):
Previously, postamar (Marius Posta) wrote…
A constraint's validity and whether it's enforced are two different things:
- the validity concerns the data that was present before the constraint is added;
- a constraint being enforced or not concerns the data that appears after the constraint is added.
This is enlighting! I always thought when it comes to constraint, their mutation state is irrelevant anymore. Thank you! This also means I need to make a one-line change to our adding path when adding a constraint where after I enqueue a constraint to the mutation slice, I will need to fast-forward its state to WRITE_ONLY.
|
Thanks for the review! bors r+ |
|
Build failed: |
|
The previous merge failed due to a flaky test; merging again bors r+ |
|
Build failed (retrying...): |
|
Build succeeded: |
This test checks the functionality for the following sequence of events: 1. A txn adds a constraint to a column on a table. 2. A separate txn drops the column. However, this interaction between the two txns has been made explicit by the PR cockroachdb#92289. Since this PR, step (2) will fail if the constraint in step (1) is in the process of being added. As a result, the elaborate set up and sequence of events being tested in TestDropColumnAfterMutations is no longer necessary. Release note: none Epic: none Fixes: cockroachdb#76843
107474: cli/zip: emit SQL table data using TSV by default r=abarganier a=knz Fixes #107473. Epic: CRDB-28893 This is a partial revert of 35738d4. It changes the default value of the `--format` flag back from JSON to TSV. Release note (backward-incompatible change): THIS RELEASE NOTE CANCELS THE CORRESPONDING PREVIOUS BACKWARD-INCOMPATIBLE CHANGE. New behavior, compatible with previous versions of CockroachDB: the command `cockroach debug zip` stores data retrieved from SQL tables in the remote cluster using the TSV format by default. Release note (cli change): The default value of the `--format` parameter to `cockroach debug zip` is `tsv`, like other CLI commands that can extract SQL data. 107489: sql: Delete invalid TestDropColumnAfterMutations test r=rafiss a=rimadeodhar This test checks the functionality for the following sequence of events: 1. A txn adds a constraint to a column on a table. 2. A separate txn drops the column. However, this interaction between the two txns has been made explicit by the PR #92289. Since this PR, step (2) will fail if the constraint in step (1) is in the process of being added. As a result, the elaborate set up and sequence of events being tested in TestDropColumnAfterMutations is no longer necessary. Release note: none Epic: none Fixes: #76843 107664: authors: add Angela Dietz to authors r=angeladietz a=angeladietz Release note: None Epic: None 107666: server: fix a race in tenant creation r=knz a=lidorcarmel Previously, scanTenantsForRunnableServices() was not holding the mutex when SELECTing for the existing tenant names, which means that the following may happen: - scanTenantsForRunnableServices() sees that only the system tenant exists - createServerEntryLocked() then adds another tenant while holding the mutex - scanTenantsForRunnableServices() takes the lock and stops the tenant that was just created because only the system tenant should be alive (which is wrong) This patch changes scanTenantsForRunnableServices() to take the mutex before SELECTing for the existing tenants in order to avoid the race. Epic: none Fixes: #107434 Fixes: #107343 Fixes: #107154 Release note: None 107673: opt: remove Metadata.AllUserDefinedFunctions r=mgartner a=mgartner The metadata method `AllUserDefinedFunctions` has been replaced with a new function `HasUserDefinedFunctions` which provides a simpler API without exposing the underlying UDF dependency map. The map is still available outside of the opt package via the `TestingUDFDeps` method which is designed for testing use only. Epic: None Release note: None 107714: roachtest: add warning to redacted github issue r=mgartner a=mgartner Epic: None Release note: None 107716: ui: extend search logic on insights page r=koorosh a=koorosh This change extends the number of fields where search is applied (instead of single transaction/ statement execution ID field). It makes possible to search for any available ID in Txn or statement insight. Release note (ui change): search is performed on all ID fields of transaction and statement insights. Resolves: #107253 Demo: https://github.com/cockroachdb/cockroach/assets/3106437/7fb56720-3ab2-4be4-9500-457707f6f01d Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net> Co-authored-by: rimadeodhar <rima@cockroachlabs.com> Co-authored-by: Angela Dietz <dietz@cockroachlabs.com> Co-authored-by: Lidor Carmel <lidor@cockroachlabs.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com> Co-authored-by: Andrii Vorobiov <and.vorobiov@gmail.com>
This test checks the functionality for the following sequence of events: 1. A txn adds a constraint to a column on a table. 2. A separate txn drops the column. However, this interaction between the two txns has been made explicit by the PR #92289. Since this PR, step (2) will fail if the constraint in step (1) is in the process of being added. As a result, the elaborate set up and sequence of events being tested in TestDropColumnAfterMutations is no longer necessary. Release note: none Epic: none Fixes: #76843
These new interfaces help encapsulate the descriptor protobufs for table
constraints:
These are the leaves of the Constraint type tree, and Constraint is the
interface at the root of it. The changes in this commit mimic what was
done for Column & Index, which respectively wrap descpb.ColumnDescriptor
and descpb.IndexDescriptor.
These new constraint interfaces are adopted liberally throughout the
code base, but not completely: many references to descpb-protobufs still
need to be replaced.
This commit also removes the somewhat confusing concept of "active"
versus "inactive" constraint. Not only was it confusing to me, evidently
it was also confusing to other contributors considering how several mild
bugs and inconsistencies made their way into the code. This commit
replaces this concept with that of an "enforced" constraint:
regardless of whether it has been validated on the data already in
the table prior to the constraint coming into existence.
The implementation is somewhat awkward for constraints in that the same
constraint descriptor can be featured twice inside
a descpb.TableDescriptor, in its active constraint slice (such as
Checks, etc.) and also in the Mutations slice. In these cases the
interface wraps the non-mutation constraint descriptor protobuf, with no
observed changes to the database's behavior.
This commit required alterations to the table descriptor's
post-deserialization changes. The change which assigns constraint IDs
to table descriptors which don't have themhad some subtle bugs which went
unnoticed until this commit forced a heavier reliance on these fields.
This commit also adds a new change which ensures that a check
constraint's ColumnIDs slice is populated based on the columns
referenced in its expression, replacing ancient code which predates the
concept of post-deserialization changes and which would compute this
lazily.
As a more general observation: it appears that the handling of adding
and dropping constraints following other schema changes was mildly
inconsistent and this commit tries to improve this. For instance, it was
possible to drop a column which had been set as NOT NULL in the same
transaction, but not do the same for any other kind of constraint. Also
it was possible to reuse the name of a dropping constraint, despite this
constraint still being enforced. The new behavior is more strict, yet
this should be barely noticeable by the user in most practical cases:
it's rare that one wants to add a constraint referencing a column and
also drop that column in the same transaction, for instance.
Fixes #91918.
Release note: None