opt: updated normalization rules for folding is expressions with null predicate#56130
Conversation
mgartner
left a comment
There was a problem hiding this comment.
Great work! Thanks for working on this with me.
I left a few nits and here's some commit message nits:
the expression (foo,bar) IS DISTINCT FROM NULL would not become normalized
How about "would not be normalized to false"?
The previous could would only
Remove "could".
You might also mention the IS NOT DISTINCT FROM NULL case.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @mgartner)
pkg/sql/opt/norm/fold_constants_funcs.go, line 196 at r1 (raw file):
// IsConstValueOrGroupOfConstValues returns true if the input is a constant or a tuple of // constants or an array of constants.
[nit] Maybe: "is a constant, or an array or tuple will only constant elements."
pkg/sql/opt/norm/fold_constants_funcs.go, line 202 at r1 (raw file):
// IsConstValueOrTupleOrArray returns true if the input is a constant value // or any tuple or any array
[nit] add a period at the end and maybe turn into a list: "is a constant value, any tuple, or any array".
RaduBerinde
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava)
pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):
// IsConstValueOrTupleOrArray returns true if the input is a constant value // or any tuple or any array func (c *CustomFuncs) IsConstValueOrTupleOrArray(input opt.ScalarExpr) bool {
We should make this IsNeverNull so that it is more in line with what we're actually using for (and we can remove the (^Null) part of the rules). We can check for TrueOp, FalseOp, TupleOp, ArrayOp, ConstOp together, which makes it easier to reason.
|
Looks like the build is failing for 2 reasons:
|
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @RaduBerinde)
pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):
Previously, RaduBerinde wrote…
We should make this
IsNeverNullso that it is more in line with what we're actually using for (and we can remove the(^Null)part of the rules). We can check for TrueOp, FalseOp, TupleOp, ArrayOp, ConstOp together, which makes it easier to reason.
I like the idea of just checking for TrueOp, FalseOp, TupleOp, ArrayOp, ConstOp, and not doing any type assertions.
But do you think IsNeverNull would be confusing because (NULL, NULL) IS NULL is true. Maybe we should name the function IsAlwaysDistinctFromNull? Or flip the returned boolean and name it IsNeverDistinctFromNull. Though I don't love the idea of using "distinct" because its creates these convoluted double-negatives that are hard to reason about...
RaduBerinde
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @mgartner)
pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I like the idea of just checking for TrueOp, FalseOp, TupleOp, ArrayOp, ConstOp, and not doing any type assertions.
But do you think
IsNeverNullwould be confusing because(NULL, NULL) IS NULLis true. Maybe we should name the functionIsAlwaysDistinctFromNull? Or flip the returned boolean and name itIsNeverDistinctFromNull. Though I don't love the idea of using "distinct" because its creates these convoluted double-negatives that are hard to reason about...
IsNeverNull is consistent with the Is operator in the optimizer. We kind of already made the decision to deviate from the SQL IS for that operator. Wouldn't hurt to mention in the comment for the function though.
By the way, do we have an assertion or rule somewhere that prevents Null from being put in a ConstOp?
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @RaduBerinde)
pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):
Previously, RaduBerinde wrote…
IsNeverNullis consistent with theIsoperator in the optimizer. We kind of already made the decision to deviate from the SQLISfor that operator. Wouldn't hurt to mention in the comment for the function though.By the way, do we have an assertion or rule somewhere that prevents Null from being put in a
ConstOp?
SGTM.
I don't see anything that prevents it. ConstructConstVal prevents it, but anywhere that ConstructConst is called directly could create a Null wrapped in a ConstOp.
cockroach/pkg/sql/opt/norm/factory.go
Lines 346 to 348 in 8bfb137
fefc7fd to
5c557f3
Compare
jayshrivastava
left a comment
There was a problem hiding this comment.
I updated the commit message, but I believe (foo,bar) IS DISTINCT FROM NULL should normalize to true.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)
pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
SGTM.
I don't see anything that prevents it.
ConstructConstValprevents it, but anywhere thatConstructConstis called directly could create a Null wrapped in aConstOp.cockroach/pkg/sql/opt/norm/factory.go
Lines 346 to 348 in 8bfb137
I'm not clear on your point about (NULL, NULL) IS NULL being true. Anyways, I believe we agree on renaming the function to IsNeverNull and changing its implementation. I dropped the (^Null) from the rule definitions as well. I'm not very familiar with the normalization code, but I feel that it would make sense to change usages of ConstructConst to call ConstructConstVal instead. We could even make constructConst private. The caveat is that ConstructConst is autogenerated (not sure from where), along with other public functions like ConstructNull, so I might be breaking a pattern by making it private.
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 2 of 6 files at r1, 3 of 5 files at r2, 1 of 2 files at r3, 1 of 1 files at r4.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava, @mgartner, and @RaduBerinde)
pkg/sql/opt/norm/fold_constants_funcs.go, line 195 at r4 (raw file):
} // IsConstValueOrGroupOfConstValues returns true if the input is is a constant,
[nit] is is -> is
also remove the comma after constant
pkg/sql/opt/norm/fold_constants_funcs.go, line 196 at r4 (raw file):
// IsConstValueOrGroupOfConstValues returns true if the input is is a constant, // or an array or tuple will only constant elements.
[nit] will -> with
pkg/sql/opt/norm/fold_constants_funcs.go, line 201 at r4 (raw file):
} // IsNeverNull returns true if the input is a constant value,
[nit] a constant value -> a non-null constant value
pkg/sql/opt/norm/testdata/rules/comp, line 570 at r4 (raw file):
├── scan a └── projections └── true [as="?column?":8]
Would be good to have a couple of cases where the rule should not match. For example, you could use expect-not=FoldNonNullIsNull for SELECT (i,f) IS NULL FROM a and expect-not=FoldNonNullIsNotNull for SELECT (i,f) IS NOT NULL FROM a
… predicate Previously, for statements such as `SELECT (foo,bar) IS DISTINCT FROM NULL FROM a_table`, the expression `(foo,bar) IS DISTINCT FROM NULL` would not be normalized to `true`. Similarly, if `IS NOT DISTINCT FROM NULL` were used, then the expression would not be normalized to `false`. The previous statement would only normalize if the tuple/array in the statement contained only constants. Given the updates in this commit, normalization will be applied when any arrays or tuples are provided in this situation. Release note: None
5c557f3 to
651b88c
Compare
jayshrivastava
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @RaduBerinde, and @rytaft)
pkg/sql/opt/norm/testdata/rules/comp, line 570 at r4 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Would be good to have a couple of cases where the rule should not match. For example, you could use
expect-not=FoldNonNullIsNullforSELECT (i,f) IS NULL FROM aandexpect-not=FoldNonNullIsNotNullforSELECT (i,f) IS NOT NULL FROM a
I agree 👍
mgartner
left a comment
There was a problem hiding this comment.
Don't forget to update the PR description with the edited commit message since bors uses the description when creating merge commit messages.
Reviewed 2 of 5 files at r2, 2 of 2 files at r5.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @jayshrivastava, @RaduBerinde, and @rytaft)
pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):
I'm not clear on your point about (NULL, NULL) IS NULL being true.
This is something I avoided bringing up during our pair programming because it wasn't necessary to know for these changes and it's quite confusing. The TLDR is that (NULL, NULL) IS NULL evaluates true while (NULL, NULL) IS NOT DISTINCT FROM NULL evaluates to false. We don't need to worry about this difference in this PR because there's a special operator in "optimizer land" specifically for <tuple> IS NULL expressions, called TupleIsNull (there's a TupleIsNotNull as well), which these updated normalization rules don't match.
@RaduBerinde It's scary that nothing prevents a Const from wrapping a Null. But as far as I can tell, if there was a (Const (Null)), it would cause bugs without this change because the (^Null) pattern wouldn't prevent a match. I've created a ticket to track this: #56253
|
bors r=mgartner |
|
Build succeeded: |

opt: updated normalization rules for folding is expressions with null predicate
Previously, for statements such as
SELECT (foo,bar) IS DISTINCT FROM NULL FROM a_table,the expression
(foo,bar) IS DISTINCT FROM NULLwould not be normalized totrue.Similarly, if
IS NOT DISTINCT FROM NULLwere used, then the expression would not benormalized to
false. The previous statement would only normalize if the tuple/array inthe statement contained only constants. Given the updates in this commit, normalization
will be applied when any arrays or tuples are provided in this situation.
Release note: None