opt: cast to identical types for set operations#60560
opt: cast to identical types for set operations#60560craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
72ea704 to
8795571
Compare
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewed 4 of 4 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)
pkg/sql/opt/optbuilder/union.go, line 48 at r1 (raw file):
outScope = inScope.push() //// propagateTypesLeft/propagateTypesRight indicate whether we need to wrap
This seems like a leftover?
pkg/sql/opt/optbuilder/union.go, line 152 at r1 (raw file):
// Equivalent but not identical types. Use the type from the left and add // a cast on the right-hand side. // TODO(radu): perhaps we should do a best effort attempt to choose the
We're actually doing something similar here
cockroach/pkg/sql/colexec/mergejoiner.go
Lines 253 to 271 in fe919cc
Possibly we could extract that code and reuse it here to make the determination.
Postgres does support UNIONing an INT with a DECIMAL:
yuzefovich=# create table t (i int, d decimal);
CREATE TABLE
yuzefovich=# insert into t values (1, 1);
INSERT 0 1
yuzefovich=# select u, pg_typeof(u) from (select i from t union all select d from t) as t(u);
u | pg_typeof
---+-----------
1 | numeric
1 | numeric
(2 rows)
Also, the postgres upcasts:
yuzefovich=# create table t (a int4, b int8);
CREATE TABLE
yuzefovich=# insert into t values (1, 1);
INSERT 0 1
yuzefovich=# select u, pg_typeof(u) from (select a from t union all select b from t) as t(u);
u | pg_typeof
---+-----------
1 | bigint
1 | bigint
(2 rows)
yuzefovich=# select u, pg_typeof(u) from (select b from t union all select a from t) as t(u);
u | pg_typeof
---+-----------
1 | bigint
1 | bigint
(2 rows)
I think it might be worth making us resemble Postgres here since we're touching the code.
rytaft
left a comment
There was a problem hiding this comment.
I don't have anything to add beyond the comments from @yuzefovich. I can take another look once you've addressed those. Otherwise looks good.
Reviewed 4 of 4 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde)
RaduBerinde
left a comment
There was a problem hiding this comment.
Updated. We now use the larger (according to Width()) type when possible, and support a few numeric upcasts.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @yuzefovich)
pkg/sql/opt/optbuilder/union.go, line 48 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
This seems like a leftover?
Done.
pkg/sql/opt/optbuilder/union.go, line 152 at r1 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
We're actually doing something similar here
cockroach/pkg/sql/colexec/mergejoiner.go
Lines 253 to 271 in fe919cc
Possibly we could extract that code and reuse it here to make the determination.
Postgres does support UNIONing an INT with a DECIMAL:
yuzefovich=# create table t (i int, d decimal); CREATE TABLE yuzefovich=# insert into t values (1, 1); INSERT 0 1 yuzefovich=# select u, pg_typeof(u) from (select i from t union all select d from t) as t(u); u | pg_typeof ---+----------- 1 | numeric 1 | numeric (2 rows)Also, the postgres upcasts:
yuzefovich=# create table t (a int4, b int8); CREATE TABLE yuzefovich=# insert into t values (1, 1); INSERT 0 1 yuzefovich=# select u, pg_typeof(u) from (select a from t union all select b from t) as t(u); u | pg_typeof ---+----------- 1 | bigint 1 | bigint (2 rows) yuzefovich=# select u, pg_typeof(u) from (select b from t union all select a from t) as t(u); u | pg_typeof ---+----------- 1 | bigint 1 | bigint (2 rows)I think it might be worth making us resemble Postgres here since we're touching the code.
Done.
8795571 to
fed07d4
Compare
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewed 4 of 4 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)
pkg/sql/opt/optbuilder/union_test.go, line 1 at r2 (raw file):
// Copyright 2020 The Cockroach Authors.
nit: s/2020/2021/.
pkg/sql/opt/optbuilder/union_test.go, line 54 at r2 (raw file):
}, { // At the same scale, we use the left type.
nit: seems like we have different scales here.
fed07d4 to
c130f6a
Compare
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 3 of 4 files at r2, 1 of 1 files at r3.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde)
pkg/sql/opt/optbuilder/with.go, line 196 at r3 (raw file):
panic(pgerror.Newf( pgcode.DatatypeMismatch, "UNION types %s and %s cannot be matched",
[nit] it's possible that the types would be allowed in a normal UNION, just not in a recursive CTE, right? Might be worth mentioning the WITH RECURSIVE part if so.
c130f6a to
6195a7f
Compare
RaduBerinde
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft)
pkg/sql/opt/optbuilder/with.go, line 196 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
[nit] it's possible that the types would be allowed in a normal UNION, just not in a recursive CTE, right? Might be worth mentioning the
WITH RECURSIVEpart if so.
Done, and added a test.
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r4.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 2 stale)
6195a7f to
5e0341c
Compare
This change makes the optbuilder more strict when building set operations. Previously, it could build expressions which have corresponding left/right types which are `Equivalent()`, but not `Identical()`. This leads to errors in vectorized execution, when we e.g. try to union a INT8 with an INT4. We now make the types on both sides `Identical()`, adding casts as necessary. We try to do a best-effort attempt to use the larger numeric type when possible (e.g. int4->int8, int->float, float->decimal). Fixes cockroachdb#59148. Release note (bug fix): fixed execution errors for some queries that use set operations (UNION / EXCEPT / INTERSECT) where a column has types of different widths on the two sides (e.g. INT4 vs INT8).
5e0341c to
4037bd8
Compare
|
bors r+ |
|
Build succeeded: |
|
Actually, would it be easy to add casts for inputs to a join operator (I'm interested in the merge joiner in particular, but I'm guessing this should also happen during the optbuild stage, before we know which kind of joiner we will be using) so that equality columns are of the same type (in the same cases with numeric types of different widths or families)? This would allow us to get rid of the casts that are planned for the merge joiner in the vectorized engine I mentioned above. |
|
It would be pretty different - merge joins are generated during exploration. I think probably the xform rule or the execbuilder would be the best place for that. Can you file an issue (and include a pointer to the exec code for mergejoins)? |
|
I think we should backport this. We've seen a couple of sentry reports that could have been caused by this issue (e.g. #61023). |
In cockroachdb#60560, we made the matching of types in UNIONs more strict. In the recursive CTE code, we don't allow adding casts to the initial expression, so the change caused us to regress in terms of supported queries. This change fixes this by allowing casts to the initial expression. Not sure why I didn't allow this from the get-go. Release note (bug fix): fixed "types cannot be matched for WITH RECURSIVE" error in cases where we can cast the type in the initial expression.
62808: opt: allow casts in initial CTE expression r=RaduBerinde a=RaduBerinde In #60560, we made the matching of types in UNIONs more strict. In the recursive CTE code, we don't allow adding casts to the initial expression, so the change caused us to regress in terms of supported queries. This change fixes this by allowing casts to the initial expression. Not sure why I didn't allow this from the get-go. Release note (bug fix): fixed "types cannot be matched for WITH RECURSIVE" error in cases where we can cast the type in the initial expression. Co-authored-by: Radu Berinde <radu@cockroachlabs.com>
In cockroachdb#60560, we made the matching of types in UNIONs more strict. In the recursive CTE code, we don't allow adding casts to the initial expression, so the change caused us to regress in terms of supported queries. This change fixes this by allowing casts to the initial expression. Not sure why I didn't allow this from the get-go. Release note (bug fix): fixed "types cannot be matched for WITH RECURSIVE" error in cases where we can cast the type in the initial expression.
In cockroachdb#60560, we made the matching of types in UNIONs more strict. In the recursive CTE code, we don't allow adding casts to the initial expression, so the change caused us to regress in terms of supported queries. This change fixes this by allowing casts to the initial expression. Not sure why I didn't allow this from the get-go. Release note (bug fix): fixed "types cannot be matched for WITH RECURSIVE" error in cases where we can cast the type in the initial expression.
The optbuilder adds casts in order to match non-identical left/right types of set operations like UNION (see cockroachdb#60560). This change prevents the optbuilder from adding casts that are invalid, which caused internal errors. Now, a user error is thrown if no such cast exists from the left or right input type to the output type. Release note (bug fix): A bug has been fixed that caused internal errors in queries with set operations, like UNION, when corresponding columns on either side of the set operation were not the same. This error only occurred with a limited set of types. This bug is present in versions 20.2.6+, 21.1.0+, and 21.2.0+.
75219: opt: do not add invalid casts to match types in set operations r=mgartner a=mgartner The optbuilder adds casts in order to match non-identical left/right types of set operations like UNION (see #60560). This change prevents the optbuilder from adding casts that are invalid, which caused internal errors. Now, a user error is thrown if no such cast exists from the left or right input type to the output type. Fixes #70831 Release note (bug fix): A bug has been fixed that caused internal errors in queries with set operations, like UNION, when corresponding columns on either side of the set operation were not the same. This error only occurred with a limited set of types. This bug is present in versions 20.2.6+, 21.1.0+, and 21.2.0+. Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
The optbuilder adds casts in order to match non-identical left/right types of set operations like UNION (see #60560). This change prevents the optbuilder from adding casts that are invalid, which caused internal errors. Now, a user error is thrown if no such cast exists from the left or right input type to the output type. Release note (bug fix): A bug has been fixed that caused internal errors in queries with set operations, like UNION, when corresponding columns on either side of the set operation were not the same. This error only occurred with a limited set of types. This bug is present in versions 20.2.6+, 21.1.0+, and 21.2.0+.
The optbuilder adds casts in order to match non-identical left/right types of set operations like UNION (see #60560). This change prevents the optbuilder from adding casts that are invalid, which caused internal errors. Now, a user error is thrown if no such cast exists from the left or right input type to the output type. Release note (bug fix): A bug has been fixed that caused internal errors in queries with set operations, like UNION, when corresponding columns on either side of the set operation were not the same. This error only occurred with a limited set of types. This bug is present in versions 20.2.6+, 21.1.0+, and 21.2.0+.
The optbuilder adds casts in order to match non-identical left/right types of set operations like UNION (see cockroachdb#60560). This change prevents the optbuilder from adding casts that are invalid, which caused internal errors. Now, a user error is thrown if no such cast exists from the left or right input type to the output type. Release note (bug fix): A bug has been fixed that caused internal errors in queries with set operations, like UNION, when corresponding columns on either side of the set operation were not the same. This error only occurred with a limited set of types. This bug is present in versions 20.2.6+, 21.1.0+, and 21.2.0+.
The optbuilder adds casts in order to match non-identical left/right types of set operations like UNION (see cockroachdb#60560). This change prevents the optbuilder from adding casts that are invalid, which caused internal errors. Now, a user error is thrown if no such cast exists from the left or right input type to the output type. Release note (bug fix): A bug has been fixed that caused internal errors in queries with set operations, like UNION, when corresponding columns on either side of the set operation were not the same. This error only occurred with a limited set of types. This bug is present in versions 20.2.6+, 21.1.0+, and 21.2.0+.
The optbuilder adds casts in order to match non-identical left/right types of set operations like UNION (see #60560). This change prevents the optbuilder from adding casts that are invalid, which caused internal errors. Now, a user error is thrown if no such cast exists from the left or right input type to the output type. Release note (bug fix): A bug has been fixed that caused internal errors in queries with set operations, like UNION, when corresponding columns on either side of the set operation were not the same. This error only occurred with a limited set of types. This bug is present in versions 20.2.6+, 21.1.0+, and 21.2.0+.
This change makes the optbuilder more strict when building set
operations. Previously, it could build expressions which have
corresponding left/right types which are
Equivalent(), but notIdentical(). This leads to errors in vectorized execution, when wee.g. try to union a INT8 with an INT4.
We now make the types on both sides
Identical(), adding casts asnecessary. We try to do a best-effort attempt to use the larger
numeric type when possible (e.g. int4->int8, int->float, float->decimal).
Fixes #59148.
Release note (bug fix): fixed execution errors for some queries that
use set operations (UNION / EXCEPT / INTERSECT) where a column has
types of different widths on the two sides (e.g. INT4 vs INT8).