Skip to content

opt: updated normalization rules for folding is expressions with null predicate#56130

Merged
craig[bot] merged 1 commit intocockroachdb:masterfrom
jayshrivastava:update-fold-non-null-rules
Nov 3, 2020
Merged

opt: updated normalization rules for folding is expressions with null predicate#56130
craig[bot] merged 1 commit intocockroachdb:masterfrom
jayshrivastava:update-fold-non-null-rules

Conversation

@jayshrivastava
Copy link
Copy Markdown
Contributor

@jayshrivastava jayshrivastava commented Oct 29, 2020

opt: updated normalization rules for folding is expressions with null predicate

Previously, for statements such as SELECT (foo,bar) IS DISTINCT FROM NULL FROM a_table,
the expression (foo,bar) IS DISTINCT FROM NULL would not be normalized to true.
Similarly, if IS NOT DISTINCT FROM NULL were used, then the expression would not be
normalized to false. The previous statement would only normalize if the tuple/array in
the statement contained only constants. Given the updates in this commit, normalization
will be applied when any arrays or tuples are provided in this situation.

Release note: None

@jayshrivastava jayshrivastava requested a review from a team as a code owner October 29, 2020 22:09
@cockroach-teamcity
Copy link
Copy Markdown
Member

This change is Reviewable

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work! Thanks for working on this with me.

I left a few nits and here's some commit message nits:

the expression (foo,bar) IS DISTINCT FROM NULL would not become normalized

How about "would not be normalized to false"?

The previous could would only

Remove "could".

You might also mention the IS NOT DISTINCT FROM NULL case.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @mgartner)


pkg/sql/opt/norm/fold_constants_funcs.go, line 196 at r1 (raw file):

// IsConstValueOrGroupOfConstValues returns true if the input is a constant or a tuple of
// constants or an array of constants.

[nit] Maybe: "is a constant, or an array or tuple will only constant elements."


pkg/sql/opt/norm/fold_constants_funcs.go, line 202 at r1 (raw file):

// IsConstValueOrTupleOrArray returns true if the input is a constant value
// or any tuple or any array

[nit] add a period at the end and maybe turn into a list: "is a constant value, any tuple, or any array".

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava)


pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):

// IsConstValueOrTupleOrArray returns true if the input is a constant value
// or any tuple or any array
func (c *CustomFuncs) IsConstValueOrTupleOrArray(input opt.ScalarExpr) bool {

We should make this IsNeverNull so that it is more in line with what we're actually using for (and we can remove the (^Null) part of the rules). We can check for TrueOp, FalseOp, TupleOp, ArrayOp, ConstOp together, which makes it easier to reason.

@mgartner
Copy link
Copy Markdown
Contributor

Looks like the build is failing for 2 reasons:

  1. Some of the plans in execbuilder tests improved. You can regenerate those with make test PKG='./pkg/sql/opt/exec/execbuilder' TESTS='TestExecBuild/local/scalar' TESTFLAGS='--rewrite=true'

  2. We have a linter, optfmt, for .opt files that is complaining. You can run make bin/optfmt && ./bin/optfmt -w pkg/sql/opt/norm/rules/scalar.opt to reformat the offending file. You can also set up optfmt to run automatically on save in GoLand:

image

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @RaduBerinde)


pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):

Previously, RaduBerinde wrote…

We should make this IsNeverNull so that it is more in line with what we're actually using for (and we can remove the (^Null) part of the rules). We can check for TrueOp, FalseOp, TupleOp, ArrayOp, ConstOp together, which makes it easier to reason.

I like the idea of just checking for TrueOp, FalseOp, TupleOp, ArrayOp, ConstOp, and not doing any type assertions.

But do you think IsNeverNull would be confusing because (NULL, NULL) IS NULL is true. Maybe we should name the function IsAlwaysDistinctFromNull? Or flip the returned boolean and name it IsNeverDistinctFromNull. Though I don't love the idea of using "distinct" because its creates these convoluted double-negatives that are hard to reason about...

Copy link
Copy Markdown
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @mgartner)


pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I like the idea of just checking for TrueOp, FalseOp, TupleOp, ArrayOp, ConstOp, and not doing any type assertions.

But do you think IsNeverNull would be confusing because (NULL, NULL) IS NULL is true. Maybe we should name the function IsAlwaysDistinctFromNull? Or flip the returned boolean and name it IsNeverDistinctFromNull. Though I don't love the idea of using "distinct" because its creates these convoluted double-negatives that are hard to reason about...

IsNeverNull is consistent with the Is operator in the optimizer. We kind of already made the decision to deviate from the SQL IS for that operator. Wouldn't hurt to mention in the comment for the function though.

By the way, do we have an assertion or rule somewhere that prevents Null from being put in a ConstOp?

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava and @RaduBerinde)


pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):

Previously, RaduBerinde wrote…

IsNeverNull is consistent with the Is operator in the optimizer. We kind of already made the decision to deviate from the SQL IS for that operator. Wouldn't hurt to mention in the comment for the function though.

By the way, do we have an assertion or rule somewhere that prevents Null from being put in a ConstOp?

SGTM.

I don't see anything that prevents it. ConstructConstVal prevents it, but anywhere that ConstructConst is called directly could create a Null wrapped in a ConstOp.

if d == tree.DNull {
return f.ConstructNull(t)
}

@jayshrivastava jayshrivastava force-pushed the update-fold-non-null-rules branch 3 times, most recently from fefc7fd to 5c557f3 Compare November 1, 2020 19:48
Copy link
Copy Markdown
Contributor Author

@jayshrivastava jayshrivastava left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the commit message, but I believe (foo,bar) IS DISTINCT FROM NULL should normalize to true.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)


pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

SGTM.

I don't see anything that prevents it. ConstructConstVal prevents it, but anywhere that ConstructConst is called directly could create a Null wrapped in a ConstOp.

if d == tree.DNull {
return f.ConstructNull(t)
}

I'm not clear on your point about (NULL, NULL) IS NULL being true. Anyways, I believe we agree on renaming the function to IsNeverNull and changing its implementation. I dropped the (^Null) from the rule definitions as well. I'm not very familiar with the normalization code, but I feel that it would make sense to change usages of ConstructConst to call ConstructConstVal instead. We could even make constructConst private. The caveat is that ConstructConst is autogenerated (not sure from where), along with other public functions like ConstructNull, so I might be breaking a pattern by making it private.

Copy link
Copy Markdown
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 6 files at r1, 3 of 5 files at r2, 1 of 2 files at r3, 1 of 1 files at r4.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jayshrivastava, @mgartner, and @RaduBerinde)


pkg/sql/opt/norm/fold_constants_funcs.go, line 195 at r4 (raw file):

}

// IsConstValueOrGroupOfConstValues returns true if the input is is a constant,

[nit] is is -> is

also remove the comma after constant


pkg/sql/opt/norm/fold_constants_funcs.go, line 196 at r4 (raw file):

// IsConstValueOrGroupOfConstValues returns true if the input is is a constant,
// or an array or tuple will only constant elements.

[nit] will -> with


pkg/sql/opt/norm/fold_constants_funcs.go, line 201 at r4 (raw file):

}

// IsNeverNull returns true if the input is a constant value,

[nit] a constant value -> a non-null constant value


pkg/sql/opt/norm/testdata/rules/comp, line 570 at r4 (raw file):

 ├── scan a
 └── projections
      └── true [as="?column?":8]

Would be good to have a couple of cases where the rule should not match. For example, you could use expect-not=FoldNonNullIsNull for SELECT (i,f) IS NULL FROM a and expect-not=FoldNonNullIsNotNull for SELECT (i,f) IS NOT NULL FROM a

… predicate

Previously, for statements such as `SELECT (foo,bar) IS DISTINCT FROM NULL FROM
a_table`, the expression `(foo,bar) IS DISTINCT FROM NULL` would not be
normalized to `true`. Similarly, if `IS NOT DISTINCT FROM NULL` were used,
then the expression would not be normalized to `false`. The previous statement
would only normalize if the tuple/array in the statement contained only
constants. Given the updates in this commit, normalization will be applied when
any arrays or tuples are provided in this situation.

Release note: None
@jayshrivastava jayshrivastava force-pushed the update-fold-non-null-rules branch from 5c557f3 to 651b88c Compare November 2, 2020 16:16
Copy link
Copy Markdown
Contributor Author

@jayshrivastava jayshrivastava left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @RaduBerinde, and @rytaft)


pkg/sql/opt/norm/testdata/rules/comp, line 570 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Would be good to have a couple of cases where the rule should not match. For example, you could use expect-not=FoldNonNullIsNull for SELECT (i,f) IS NULL FROM a and expect-not=FoldNonNullIsNotNull for SELECT (i,f) IS NOT NULL FROM a

I agree 👍

Copy link
Copy Markdown
Contributor

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Thanks again!

Don't forget to update the PR description with the edited commit message since bors uses the description when creating merge commit messages.

Reviewed 2 of 5 files at r2, 2 of 2 files at r5.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jayshrivastava, @RaduBerinde, and @rytaft)


pkg/sql/opt/norm/fold_constants_funcs.go, line 203 at r1 (raw file):

I'm not clear on your point about (NULL, NULL) IS NULL being true.

This is something I avoided bringing up during our pair programming because it wasn't necessary to know for these changes and it's quite confusing. The TLDR is that (NULL, NULL) IS NULL evaluates true while (NULL, NULL) IS NOT DISTINCT FROM NULL evaluates to false. We don't need to worry about this difference in this PR because there's a special operator in "optimizer land" specifically for <tuple> IS NULL expressions, called TupleIsNull (there's a TupleIsNotNull as well), which these updated normalization rules don't match.


@RaduBerinde It's scary that nothing prevents a Const from wrapping a Null. But as far as I can tell, if there was a (Const (Null)), it would cause bugs without this change because the (^Null) pattern wouldn't prevent a match. I've created a ticket to track this: #56253

@jayshrivastava
Copy link
Copy Markdown
Contributor Author

bors r=mgartner

@craig
Copy link
Copy Markdown
Contributor

craig bot commented Nov 3, 2020

Build succeeded:

@craig craig bot merged commit 0653b92 into cockroachdb:master Nov 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants