Skip to content

optbuilder: do not CAST recursive CTE seed columns to output types#94581

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
msirek:ctePanic
Jan 4, 2023
Merged

optbuilder: do not CAST recursive CTE seed columns to output types#94581
craig[bot] merged 1 commit intocockroachdb:masterfrom
msirek:ctePanic

Conversation

@msirek
Copy link
Copy Markdown
Contributor

@msirek msirek commented Jan 2, 2023

Fixes #93371

Given the query:

SET VECTORIZE=off;
WITH RECURSIVE foo(i) AS
    (SELECT i FROM (VALUES(1),(2)) t(i)
    UNION ALL
    SELECT (i+1)::numeric(10,0) FROM foo WHERE i < 10)
 SELECT * FROM foo;

column i is CAST to the output type of the UNION ALL due to #62808 in order to support more recursive CTEs without explicit CASTS in seed (initial) expressions:

if leftCastsNeeded {
initialScope = b.addCasts(initialScope, outTypes)
}

Column i is assigned column id 3 in WithScanPrivate.InCols and the expression in recursiveScope.cols[0].scalar references it as follows:

recursiveScope.cols[0].scalar:
result = {opt.ScalarExpr | *memo.CastExpr}
 Input = {opt.ScalarExpr | *memo.PlusExpr}
  Left = {opt.ScalarExpr | *memo.VariableExpr}
   Col = {opt.ColumnID} 3
   Typ = {*types.T | 0x976b060}
    InternalType = {types.InternalType}
     Family = {types.Family} IntFamily (1)
...
    TypeMeta = {types.UserDefinedTypeMetadata}
   rank = {opt.ScalarRank} 5
  Right = {opt.ScalarExpr | *memo.ConstExpr}
   Value = {tree.Datum | *tree.DInt} 1

After the CAST is applied, column id 3 is initialized as:

initialScope.cols[0].scalar:
 result = {opt.ScalarExpr | *memo.CastExpr}
 Input = {opt.ScalarExpr | *memo.VariableExpr}
  Col = {opt.ColumnID} 1
  Typ = {*types.T | 0x976b060}
  rank = {opt.ScalarRank} 12
 Typ = {*types.T | 0xc0037f2780}
  InternalType = {types.InternalType}
   Family = {types.Family} DecimalFamily (3)
   Width = {int32} 0
   Precision = {int32} 10

During execution, an assertion fails where the input to the PlusExpr is expected to be an integer, but is a decimal:

tree.MustBeDInt (datum.go:722) github.com/cockroachdb/cockroach/pkg/sql/sem/tree
eval.(*evaluator).EvalPlusIntOp (binary_op.go:1325) github.com/cockroachdb/cockroach/pkg/sql/sem/eval
tree.(*PlusIntOp).Eval (eval_op_generated.go:761) github.com/cockroachdb/cockroach/pkg/sql/sem/tree
eval.(*evaluator).EvalBinaryExpr (expr.go:120) github.com/cockroachdb/cockroach/pkg/sql/sem/eval
tree.(*BinaryExpr).Eval (eval_expr_generated.go:86) github.com/cockroachdb/cockroach/pkg/sql/sem/tree
eval.(*evaluator).EvalCastExpr (expr.go:180) github.com/cockroachdb/cockroach/pkg/sql/sem/eval
tree.(*CastExpr).Eval (eval_expr_generated.go:96) github.com/cockroachdb/cockroach/pkg/sql/sem/tree
eval.Expr (expr.go:26) github.com/cockroachdb/cockroach/pkg/sql/sem/eval
execinfrapb.(*ExprHelper).Eval (expr.go:248) github.com/cockroachdb/cockroach/pkg/sql/execinfrapb
execinfra.(*ProcOutputHelper).ProcessRow (processorsbase.go:275) github.com/cockroachdb/cockroach/pkg/sql/execinfra
...

The left of the Plus is referring to the IndexedVar column at ordinal 0:

result = {*tree.BinaryExpr | 0xc0022e27c0}
 Operator = {treebin.BinaryOperator}
  Symbol = {treebin.BinaryOperatorSymbol} Plus (3)
  IsExplicitOperator = {bool} false
 Left = {tree.Expr | *tree.IndexedVar}
  Idx = {int} 0
  Used = {bool} true
  col = {tree.NodeFormatter | *colinfo.varFormatter}
  typeAnnotation = {tree.typeAnnotation}
 Right = {tree.Expr | *tree.DInt} 1
 typeAnnotation = {tree.typeAnnotation}
 Op = {*tree.BinOp | 0xc0001817a0}

which has been CAST as a decimal, as indicated above.

To fully support casting of seed expressions, that may require replacing
all occurences of those expressions in recursiveScope with the new
CASTed expressions, and possibly re-doing type checking to catch any
illegal casts, etc.

This issue is solved by reverting #62808 in order to maintain
compatibility with Postgres, which errors out this query.
Relaxing the requirement for explicit CASTs in recursive CTE seed
expressions could possibly be supported in the future, but more work is
required to substitute seed expressions present in the recursive branch
with the implicitly CASTed expressions and make sure type checking is
complete.

Release note(bug fix): This patch fixes recursive CTE expressions which
cause internal errors when explicit CASTs of initial expressions to
output types are missing.

@msirek msirek requested a review from yuzefovich January 2, 2023 03:41
@msirek msirek requested a review from a team as a code owner January 2, 2023 03:41
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! It's a bit surprising that Postgres won't automatically cast when an implicit cast is allowed, like a cast from INT -> NUMERIC:

marcus=# select * from pg_cast where castsource = 'int4'::regtype and casttarget = 'numeric'::regtype;
  oid  | castsource | casttarget | castfunc | castcontext | castmethod
-------+------------+------------+----------+-------------+------------
 10014 |         23 |       1700 |     1740 | i           | f
(1 row)

I checked PG's type conversion docs to see if there was any explanation of this, but couldn't find anything. 🤷

:lgtm: Once you fixup the two tests.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @msirek and @yuzefovich)


pkg/sql/opt/norm/testdata/rules/with line 1872 at r1 (raw file):

  (SELECT 1, 2)
  UNION ALL
  (SELECT i+1, sum(j) FROM cte WHERE i <= 10 GROUP BY i)

nit: sum(j)::INT to avoid the error


pkg/sql/opt/norm/testdata/rules/with line 2422 at r1 (raw file):

  (SELECT 1, 2)
  UNION ALL
  (SELECT i+1, sum(x) FROM cte INNER JOIN xy ON j = y WHERE i <= 10 GROUP BY i)

nit: sum(x)::INT to avoid the error

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @msirek and @yuzefovich)

Copy link
Copy Markdown
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Marcus that it'd be nicer to add explicit casts in some of the affected tests. :lgtm: though.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 3 of 0 LGTMs obtained (waiting on @msirek)


pkg/sql/opt/optbuilder/with.go line 341 at r1 (raw file):

				panic(pgerror.Newf(
					pgcode.DatatypeMismatch,
					"UNION types %s and %s cannot be matched for WITH RECURSIVE",

nit: I see that this is the error message we had prior to #62808, but we generally try to match the postgres error message as close as possible (at least that what I was told during my internship four years ago), and PG returns:

ERROR:  recursive query "foo" column 1 has type integer in non-recursive term but type numeric overall

for the original repro from the issue.

@mgartner
Copy link
Copy Markdown
Contributor

mgartner commented Jan 3, 2023

pkg/sql/opt/optbuilder/with.go line 341 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: I see that this is the error message we had prior to #62808, but we generally try to match the postgres error message as close as possible (at least that what I was told during my internship four years ago), and PG returns:

ERROR:  recursive query "foo" column 1 has type integer in non-recursive term but type numeric overall

for the original repro from the issue.

Good catch. Let's match PG error messages as close as reasonably possible 👍

Fixes cockroachdb#93371

Given the query:
```
SET VECTORIZE=off;
WITH RECURSIVE foo(i) AS
    (SELECT i FROM (VALUES(1),(2)) t(i)
    UNION ALL
    SELECT (i+1)::numeric(10,0) FROM foo WHERE i < 10)
 SELECT * FROM foo;
```
column `i` is CAST to the output type of the UNION ALL due to cockroachdb#62808
in order to support more recursive CTEs without explicit CASTS in
seed (initial) expressions:
https://github.com/cockroachdb/cockroach/blob/0eac48e70abbdd163d4617e50659321b99d678d4/pkg/sql/opt/optbuilder/with.go#L335-L337

Column `i` is assigned column id 3 in `WithScanPrivate.InCols` and
the expression in `recursiveScope.cols[0].scalar` references it as
follows:
```
recursiveScope.cols[0].scalar:
result = {opt.ScalarExpr | *memo.CastExpr}
 Input = {opt.ScalarExpr | *memo.PlusExpr}
  Left = {opt.ScalarExpr | *memo.VariableExpr}
   Col = {opt.ColumnID} 3
   Typ = {*types.T | 0x976b060}
    InternalType = {types.InternalType}
     Family = {types.Family} IntFamily (1)
...
    TypeMeta = {types.UserDefinedTypeMetadata}
   rank = {opt.ScalarRank} 5
  Right = {opt.ScalarExpr | *memo.ConstExpr}
   Value = {tree.Datum | *tree.DInt} 1
```
After the CAST is applied, column id 3 is initialized as:
```
initialScope.cols[0].scalar:
 result = {opt.ScalarExpr | *memo.CastExpr}
 Input = {opt.ScalarExpr | *memo.VariableExpr}
  Col = {opt.ColumnID} 1
  Typ = {*types.T | 0x976b060}
  rank = {opt.ScalarRank} 12
 Typ = {*types.T | 0xc0037f2780}
  InternalType = {types.InternalType}
   Family = {types.Family} DecimalFamily (3)
   Width = {int32} 0
   Precision = {int32} 10
```
During execution, an assertion fails where the input to the `PlusExpr` is
expected to be an integer, but is a decimal:
```
tree.MustBeDInt (datum.go:722) github.com/cockroachdb/cockroach/pkg/sql/sem/tree
eval.(*evaluator).EvalPlusIntOp (binary_op.go:1325) github.com/cockroachdb/cockroach/pkg/sql/sem/eval
tree.(*PlusIntOp).Eval (eval_op_generated.go:761) github.com/cockroachdb/cockroach/pkg/sql/sem/tree
eval.(*evaluator).EvalBinaryExpr (expr.go:120) github.com/cockroachdb/cockroach/pkg/sql/sem/eval
tree.(*BinaryExpr).Eval (eval_expr_generated.go:86) github.com/cockroachdb/cockroach/pkg/sql/sem/tree
eval.(*evaluator).EvalCastExpr (expr.go:180) github.com/cockroachdb/cockroach/pkg/sql/sem/eval
tree.(*CastExpr).Eval (eval_expr_generated.go:96) github.com/cockroachdb/cockroach/pkg/sql/sem/tree
eval.Expr (expr.go:26) github.com/cockroachdb/cockroach/pkg/sql/sem/eval
execinfrapb.(*ExprHelper).Eval (expr.go:248) github.com/cockroachdb/cockroach/pkg/sql/execinfrapb
execinfra.(*ProcOutputHelper).ProcessRow (processorsbase.go:275) github.com/cockroachdb/cockroach/pkg/sql/execinfra
...
```
The left of the Plus is referring to the `IndexedVar` column at ordinal 0:
```
result = {*tree.BinaryExpr | 0xc0022e27c0}
 Operator = {treebin.BinaryOperator}
  Symbol = {treebin.BinaryOperatorSymbol} Plus (3)
  IsExplicitOperator = {bool} false
 Left = {tree.Expr | *tree.IndexedVar}
  Idx = {int} 0
  Used = {bool} true
  col = {tree.NodeFormatter | *colinfo.varFormatter}
  typeAnnotation = {tree.typeAnnotation}
 Right = {tree.Expr | *tree.DInt} 1
 typeAnnotation = {tree.typeAnnotation}
 Op = {*tree.BinOp | 0xc0001817a0}
 ```
which has been CAST as a decimal, as indicated above.

To fully support casting of seed expressions, that may require replacing
all occurences of those expressions in `recursiveScope` with the new
CASTed expressions, and possibly re-doing type checking to catch any
illegal casts, etc.

This issue is solved by reverting cockroachdb#62808 in order to maintain
compatibility with Postgres, which errors out this query.
Relaxing the requirement for explicit CASTs in recursive CTE seed
expressions could possibly be supported in the future, but more work is
required to substitute seed expressions present in the recursive branch
with the implicitly CASTed expressions and make sure type checking is
complete.

Release note (bug fix): This patch fixes recursive CTE expressions which
cause internal errors when explicit CASTs of initial expressions to
output types are missing.
Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added explicit casts in pkg/sql/opt/norm/testdata/rules/with as suggested. Explicit casts are not added in pkg/sql/opt/optbuilder/with.go as those were originally testing for presence of an implicit cast, so now are testing for the absence of an implicit cast.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 3 stale) (waiting on @mgartner and @yuzefovich)


pkg/sql/opt/optbuilder/with.go line 341 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Good catch. Let's match PG error messages as close as reasonably possible 👍

Done


pkg/sql/opt/norm/testdata/rules/with line 1872 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: sum(j)::INT to avoid the error

Done


pkg/sql/opt/norm/testdata/rules/with line 2422 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

nit: sum(x)::INT to avoid the error

Done

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @msirek)

Copy link
Copy Markdown
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!
bors r+

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 2 stale) (waiting on @msirek)

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Jan 4, 2023

Build succeeded:

@craig craig bot merged commit 9e54d50 into cockroachdb:master Jan 4, 2023
@blathers-crl
Copy link
Copy Markdown

blathers-crl bot commented Jan 4, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from f4856ec to blathers/backport-release-22.1-94581: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


error creating merge commit from f4856ec to blathers/backport-release-22.2-94581: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sql: internal type mismatch error from vectorized engine caused by recursive CTE

5 participants