opt: add INSERT ON CONFLICT WHERE syntax for selecting arbiter indexes#53067
Conversation
f591a27 to
3a6cc19
Compare
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained
pkg/sql/opt/optbuilder/partial_index.go, line 26 at r1 (raw file):
// implication logic. See Factory.NormalizePartialIndexPredicate for more // details. func (b *Builder) buildPartialIndexPredicate(tableScope *scope, expr tree.Expr) memo.FiltersExpr {
I'm open to any better ideas on where to put this function. optbuilder/scalar.go didn't seem to be the best place, and neither did optbuilder/util.go, so I made a new file. 🤷
3601872 to
1149ccd
Compare
1149ccd to
d921d95
Compare
RaduBerinde
left a comment
There was a problem hiding this comment.
The release note should mention the DO UPDATE change. I also added a docs-todo label so the docs can be updated.
Maybe for a follow-up, but it would be nice if we kept the list of arbiter indexes in the private and passed it to the exec factory, so we can show it opt tests and in EXPLAIN.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
pkg/sql/opt/optbuilder/insert.go, line 698 at r2 (raw file):
// If conflict columns were explicitly specified, then only check for // conflicts on arbiter indexes. if !conflictOrds.Empty() && !arbiterIndexes.Contains(idx) {
This code would be simplified if had arbiterIndexes return the set of all unique indexes if conflictOrds is empty.
pkg/sql/opt/optbuilder/insert.go, line 1185 at r2 (raw file):
// or pseudo-partial index guarantee uniqueness of their columns across all // rows. func (mb *mutationBuilder) arbiterIndexes(
Very nice code, clean and well documented!
pkg/sql/opt/optbuilder/insert.go, line 1218 at r2 (raw file):
arbiters = util.FastIntSet{} arbiters.Add(idx) return arbiters
[nit] return util.MakeFastIntSet(idx)
pkg/sql/opt/optbuilder/insert.go, line 1221 at r2 (raw file):
} // Initialize tableScope once and only if needed.
Are we missing a tableScope == nil here?
pkg/sql/opt/optbuilder/testdata/upsert, line 2180 at r2 (raw file):
└── projections └── column3:7 = 'foo' [as=column17:17]
[nit] I think we should have more tests here, at least for the case where a non-partial unique index is chosen as the sole arbiter, and for a DO UPDATE variant.
This commit adds support for selecting unique partial indexes as "arbiters" for `INSERT ON CONFLICT` statements. An arbiter index is a unique index that is used to detect conflicts in which to perform the `ON CONFLICT` action on. Unique partial indexes can only be used as arbiters if the arbiter predicate in the `ON CONFLICT` statement implies the partial index's predicate. To avoid parsing the same partial index predicates multiple times, this commit also adds a `parsedIndexExprs` cache to `mutationBuilder`, similar to the cache it has for default and computed column expressions. Finally, this commit also makes the parsing of `ON CONFLICT DO UPDATE` more strict. It now requires a list of conflict columns. This matches Postgres behavior. From https://www.postgresql.org/docs/12/sql-insert.html: > For ON CONFLICT DO UPDATE, a conflict_target must be provided. In Postgres, the `conflict_target` can be a list of columns or a constraint name. Cockroach currently only supports a list of column names as a conflict target, so the list of column names preceeding `DO UPDATE` is now required by the parser. Release note (sql change): An `INSERT ... ON CONFLICT DO UPDATE` statement without a list of column names after `ON CONFLICT` now results in a SQL syntax error with the error code 42601. Previously it errored with the message "there is no unqiue or exlcusion constraint matching the ON CONFLICT specification" and the error code 42P10.
d921d95 to
9f2f8f6
Compare
mgartner
left a comment
There was a problem hiding this comment.
I've updated the release note.
Good idea re: showing the arbiters in opt tests and EXPLAIN. I'll do that in a follow-up.
TFTR!
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)
pkg/sql/opt/optbuilder/insert.go, line 698 at r2 (raw file):
Previously, RaduBerinde wrote…
This code would be simplified if had
arbiterIndexesreturn the set of all unique indexes ifconflictOrdsis empty.
Great idea. Done.
pkg/sql/opt/optbuilder/insert.go, line 1185 at r2 (raw file):
Previously, RaduBerinde wrote…
Very nice code, clean and well documented!
Thank you, sir!
pkg/sql/opt/optbuilder/insert.go, line 1218 at r2 (raw file):
Previously, RaduBerinde wrote…
[nit] return
util.MakeFastIntSet(idx)
Done.
pkg/sql/opt/optbuilder/insert.go, line 1221 at r2 (raw file):
Previously, RaduBerinde wrote…
Are we missing a
tableScope == nilhere?
Ya this was the wrong conditional entirely. Looks like it worked anyway. Fixed.
pkg/sql/opt/optbuilder/testdata/upsert, line 2180 at r2 (raw file):
Previously, RaduBerinde wrote…
[nit] I think we should have more tests here, at least for the case where a non-partial unique index is chosen as the sole arbiter, and for a DO UPDATE variant.
I've added the case for the non-partial index chosen as the sole arbiter.
I'll leave DO UPDATE variants for when I add support for them, if that's ok. They use a different code path, buildInputForUpsert instead of buildInputForDoNothing. buildInputForUpsert doesn't yet use the arbiterIndexes function.
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 5 of 9 files at r1, 7 of 9 files at r2, 2 of 2 files at r3.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @RaduBerinde)
pkg/sql/opt/optbuilder/testdata/upsert, line 2281 at r3 (raw file):
---- # If there is a non-partial unique index, then it is the only arbiter.
Is there a way to tell from the plan which index is the arbiter? (is this what Radu was talking about above?)
pkg/sql/opt/optbuilder/testdata/upsert, line 2306 at r3 (raw file):
│ │ │ │ ├── columns: column1:5!null column2:6!null column3:7!null │ │ │ │ └── (1, 1, 'bar') │ │ │ ├── scan unique_partial_indexes
should this be @u3?
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde and @rytaft)
pkg/sql/opt/optbuilder/testdata/upsert, line 2281 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Is there a way to tell from the plan which index is the arbiter? (is this what Radu was talking about above?)
Yes that's what he was talking about. It's coming in this future PR: #53172
pkg/sql/opt/optbuilder/testdata/upsert, line 2306 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
should this be
@u3?
optbuilder does not currently build index-specific scans or force an index for detecting UPSERT or INSERT ON CONFLICT collisions. So for the build directive, scans are always full table scans on the primary index. We rely on transformation rules to convert this into a constrained scan on the secondary index. In theory, it could used another index to detect collisions that may be less costly (though in practice I doubt that it would).
I thought about having a way to unsure that the unique indexes are used for these scans, given that UPSERT/INSERT ON CONFLICT performance would be abysmal if it wasn't. But if the optimizer is choosing a full table scan over a constrained scan then there's likely other problems as well...
|
bors r+ |
rytaft
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)
pkg/sql/opt/optbuilder/testdata/upsert, line 2306 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
optbuilderdoes not currently build index-specific scans or force an index for detecting UPSERT or INSERT ON CONFLICT collisions. So for thebuilddirective, scans are always full table scans on the primary index. We rely on transformation rules to convert this into a constrained scan on the secondary index. In theory, it could used another index to detect collisions that may be less costly (though in practice I doubt that it would).I thought about having a way to unsure that the unique indexes are used for these scans, given that UPSERT/INSERT ON CONFLICT performance would be abysmal if it wasn't. But if the optimizer is choosing a full table scan over a constrained scan then there's likely other problems as well...
ah right -- thanks for the explanation
|
Build failed (retrying...): |
53067: opt: add INSERT ON CONFLICT WHERE syntax for selecting arbiter indexes r=mgartner a=mgartner This commit adds support for selecting unique partial indexes as "arbiters" for `INSERT ON CONFLICT` statements. An arbiter index is a unique index that is used to detect conflicts in which to perform the `ON CONFLICT` action on. Unique partial indexes can only be used as arbiters if the arbiter predicate in the `ON CONFLICT` statement implies the partial index's predicate. To avoid parsing the same partial index predicates multiple times, this commit also adds a `parsedIndexExprs` cache to `mutationBuilder`, similar to the cache it has for default and computed column expressions. Finally, this commit also makes the parsing of `ON CONFLICT DO UPDATE` more strict. It now requires a list of conflict columns. This matches Postgres behavior. From https://www.postgresql.org/docs/12/sql-insert.html: > For ON CONFLICT DO UPDATE, a conflict_target must be provided. In Postgres, the `conflict_target` can be a list of columns or a constraint name. Cockroach currently only supports a list of column names as a conflict target, so the list of column names preceeding `DO UPDATE` is now required by the parser. Release note (sql change): An `INSERT ... ON CONFLICT DO UPDATE` statement without a list of column names after `ON CONFLICT` now results in a SQL syntax error with the error code 42601. Previously it errored with the message "there is no unqiue or exlcusion constraint matching the ON CONFLICT specification" and the error code 42P10. 53126: sql: drop schemas when dropping parent database r=rohany a=rohany Fixes #52362. This commit updates `DROP DATABASE CASCADE` to also clean up any child user defined schemas. Release note: None Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com> Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
|
Build failed (retrying...): |
|
Build succeeded: |
This commit adds support for selecting unique partial indexes as
"arbiters" for
INSERT ON CONFLICTstatements. An arbiter index is aunique index that is used to detect conflicts in which to perform the
ON CONFLICTaction on.Unique partial indexes can only be used as arbiters if the arbiter
predicate in the
ON CONFLICTstatement implies the partial index'spredicate.
To avoid parsing the same partial index predicates multiple times, this
commit also adds a
parsedIndexExprscache tomutationBuilder,similar to the cache it has for default and computed column expressions.
Finally, this commit also makes the parsing of
ON CONFLICT DO UPDATEmore strict. It now requires a list of conflict columns. This matches
Postgres behavior.
From https://www.postgresql.org/docs/12/sql-insert.html:
In Postgres, the
conflict_targetcan be a list of columns or aconstraint name. Cockroach currently only supports a list of column
names as a conflict target, so the list of column names preceeding
DO UPDATEis now required by the parser.Release note (sql change): An
INSERT ... ON CONFLICT DO UPDATEstatement without a list of column names after
ON CONFLICTnow resultsin a SQL syntax error with the error code 42601. Previously it errored
with the message "there is no unqiue or exlcusion constraint matching
the ON CONFLICT specification" and the error code 42P10.