opt: prevent overflow when normalizing comparison with constants#88199
opt: prevent overflow when normalizing comparison with constants#88199craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
msirek
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @michae2)
pkg/sql/logictest/testdata/logic_test/time line 566 at r1 (raw file):
SELECT t + '02:00:00'::INTERVAL < '01:00:00'::TIME FROM t88128 ---- false
We may need to check for overflow and underflow for both + and -, for example this test still shows incorrect results:
query B
SELECT t + '-18:00:00'::INTERVAL < '07:00:00'::TIME FROM t88128
----
true
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @michae2)
pkg/sql/opt/norm/comp_funcs.go line 44 at r1 (raw file):
// 1. An overload function for the given operator and input types exists and // has an appropriate volatility. // 2. The result type of the overload is equivalent to the type of left. This
Why is this condition necessary? I don't think the datum comparison functions require the same type, at least in general.
|
Previously, DrewKimball (Drew Kimball) wrote…
They must be comparable in order to check for overflow. For example, consider: CREATE TABLE t (i INTERVAL);
SELECT '1:00:00'::TIME + i >= '2:00:00'::TIME;With this tranformation, we try to compute Perhaps when a subtraction yields a different type than the operands, it is guaranteed there is no overflow? I didn't spend too much time thinking through if this is true generally, but if you have any insight, let me know. I thought for now I'd err on the side of overly restrictive (I even considered removing these rules entirely since I don't believe it's common to write expressions in this form and I'm weary of other bugs lurking here). I could leave a TODO mentioning that this restriction might be able to be lifted, if you'd like. |
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @michae2)
pkg/sql/opt/norm/comp_funcs.go line 44 at r1 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
They must be comparable in order to check for overflow. For example, consider:
CREATE TABLE t (i INTERVAL); SELECT '1:00:00'::TIME + i >= '2:00:00'::TIME;With this tranformation, we try to compute
'2:00:00'::TIME - '1:00:00'::TIME, which results in anINTERVAL, not aTIME. Because anINTERVALcannot be compared with aTIME, the overflow detection won't work.Perhaps when a subtraction yields a different type than the operands, it is guaranteed there is no overflow? I didn't spend too much time thinking through if this is true generally, but if you have any insight, let me know. I thought for now I'd err on the side of overly restrictive (I even considered removing these rules entirely since I don't believe it's common to write expressions in this form and I'm weary of other bugs lurking here). I could leave a TODO mentioning that this restriction might be able to be lifted, if you'd like.
Oh, I didn't realize INTERVAL and TIME are incomparable. I think it's fine to leave as is, in that case. Thanks for explaining.
18168e0 to
c7b68e9
Compare
mgartner
left a comment
There was a problem hiding this comment.
I added a new commit which fixes the same bug in the normalize package (a remnant of the old heuristic optimizer that is still used during backfills). PTAL!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @msirek)
pkg/sql/logictest/testdata/logic_test/time line 566 at r1 (raw file):
Previously, msirek (Mark Sirek) wrote…
We may need to check for overflow and underflow for both
+and-, for example this test still shows incorrect results:query B SELECT t + '-18:00:00'::INTERVAL < '07:00:00'::TIME FROM t88128 ---- true
Woof, good catch!
I think the only way to detect underflow and overflow in these cases is to know if the RHS of the new +/- expression is > 0 or < 0. I've added in logic to detect overflow and underflow for all three rules, but I had to add the additional restriction that the RHS of the new +/- expression is an integer, float, decimal, or interval - because its easy to create zero values for those types. It may be possible to expand this set of valid types in the future, but I think this is good enough for now.
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewed 2 of 7 files at r1, 6 of 6 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @msirek)
msirek
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @msirek)
pkg/sql/opt/norm/testdata/rules/comp line 132 at r2 (raw file):
└── (s:4::DATE + '02:00:00') = '2000-01-01 02:00:00' [outer=(4), stable] # The rule should not apply if the type of RHS the created Minus operator is a
nit: 'is a' -> 'is'
Previously, the `NormalizeCmpPlusConst`, `NormalizeCmpMinusConst`, and
`NormalizeCmpMinusConst` would normalize expressions in the form
`var +/- const1 ? const2` and `const1 - var ? const2`, where `const1`
and `const2` were const constants and `?` is any comparison operator, to
attempt to isolate the variable on one side of the comparison. Below are
some examples of the transformations these rules made:
a+1 > 10 => a > 10-1
a-1 > 10 => a > 10+1
1-a > 10 => 1-10 > a
However, these transformations were invalid when the newly created
addition or subtraction operators overflowed. In the case of an
expression involving TIME and INTERVAL types, this could cause incorrect
query results. For example:
t-'2 hr'::INTERVAL < '23:00:00'::TIME
=>
t < '23:00:00'::TIME+'2 hr'::INTERVAL
=>
t < '01:00:00'::TIME
The first expression and last expression have different semantic meaning
because the addition of the constant TIME and constant INTERVAL
overflowed.
This commit prevents the rules from transforming expressions if they
would cause an overflow/underflow. In order to detect overflow, there
are new restrictions that prevent these rules from firing in some cases.
Notably, the datum on the RHS of the newly constructed +/- operator must
be an integer, float, decimal, or interval. Also, the result type of the
newly constructed +/- operator must be equivalent to the LHS of the
operator. Both of these restrictions are required in order to detect
overflow/underflow.
Informs cockroachdb#88128
Release note (bug fix): A bug has been fixed that caused incorrect
evaluation of expressions in the form `col +/- const1 ? const2`, where
`const1` and `const2` are constant values and `?` is any comparison
operator. The bug was caused by operator overflow when the optimizer
attempted to simplify these expressions to have a single constant value.
It was recently discovered that normalization rules in the optimizer were invalid (see cockroachdb#88128). These rules were adapted from the heuristic optimizer several years ago, and they still remain in the `normalize` package even though the heuristic optimizer no longer exists. The `normalize` package is still used to normalize expressions during backfilling, so the invalid rules can cause incorrect computed column values and corrupt indexes. This commit removes these rules entirely. Informs cockroachdb#88128 Release note: None
c7b68e9 to
bb8b314
Compare
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 2 of 0 LGTMs obtained (waiting on @DrewKimball, @michae2, and @msirek)
pkg/sql/opt/norm/comp_funcs.go line 44 at r1 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
Oh, I didn't realize
INTERVALandTIMEare incomparable. I think it's fine to leave as is, in that case. Thanks for explaining.
Done.
pkg/sql/opt/norm/testdata/rules/comp line 132 at r2 (raw file):
Previously, msirek (Mark Sirek) wrote…
nit: 'is a' -> 'is'
Done
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @michae2, and @msirek)
|
TFTRs! bors r+ |
|
Build succeeded: |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 22a967e to blathers/backport-release-21.2-88199: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. error creating merge commit from 22a967e to blathers/backport-release-22.1-88199: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.1.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
This commit fixes minor typos introduced in #88199. Release note: None
This commit fixes minor typos introduced in #88199. Release note: None
A prior commit in cockroachdb#88199 attempted to fix a bug in the `NormalizeCmpPlusConst`, `NormalizeCmpMinusConst`, and `NormalizeCmpConstMinus` rules by checking for overflow/underflow in the addition/subtraction of constants in a comparison expression. This was insufficient to completely fix the bug because the transformation is invalid if the non-normalized expression would have overflowed. Consider an expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME `NormalizeCmpPlusConst` would successively normalize this: t::TIME > '01:00'::TIME - '-11 hrs'::INTERVAL => t::TIME > '12:00'::TIME This expression is not semantically equivalent to the original expression. It yields different results when `t` is a value that would underflow when eleven hours is subtracted from it. For example, consider `t = '03:00'::TIME`: Original expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '03:00'::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '16:00'::TIME > '01:00'::TIME => true Normalized expression: t::TIME > '12:00'::TIME => '03:00'::TIME > '12:00'::TIME => false These normalization rules are only valid with types where overflow or underflow during addition and subtraction results in an error. This commit restricts these normalization rules to only operate on integers, floats, and decimals, which will error if there is underflow or overflow. Fixes cockroachdb#90053 Release note (bug fix): A bug has been fixed that caused incorrect evaluation of comparison expressions involving time and interval types, like `col::TIME + '10 hrs'::INTERVAL' > '01:00'::TIME`.
A prior commit in cockroachdb#88199 attempted to fix a bug in the `NormalizeCmpPlusConst`, `NormalizeCmpMinusConst`, and `NormalizeCmpConstMinus` rules by checking for overflow/underflow in the addition/subtraction of constants in a comparison expression. This was insufficient to completely fix the bug because the transformation is invalid if the non-normalized expression would have overflowed. Consider an expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME `NormalizeCmpPlusConst` would successively normalize it to this: t::TIME > '01:00'::TIME - '-11 hrs'::INTERVAL => t::TIME > '12:00'::TIME This expression is not semantically equivalent to the original expression. It yields different results when `t` is a value that would underflow when eleven hours is subtracted from it. For example, consider `t = '03:00'::TIME`: Original expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '03:00'::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '16:00'::TIME > '01:00'::TIME => true Normalized expression: t::TIME > '12:00'::TIME => '03:00'::TIME > '12:00'::TIME => false These normalization rules are only valid with types where overflow or underflow during addition and subtraction results in an error. This commit restricts these normalization rules to only operate on integers, floats, and decimals, which will error if there is underflow or overflow. Fixes cockroachdb#90053 Release note (bug fix): A bug has been fixed that caused incorrect evaluation of comparison expressions involving time and interval types, like `col::TIME + '10 hrs'::INTERVAL' > '01:00'::TIME`.
89989: ccl/jwtauthccl: allow inferring of key algorithm r=kpatron-cockroachlabs a=kpatron-cockroachlabs Previously, if a user failed to provide the algorithm claim within the JWKS, CRDB would not accept any JWTs that used that key. This change makes CRDB accept algorithms from specified by a JWT so long as they are compatible with the key type (for example RSA and RSA256). This case only applies when the JWKS does not explicitly specify an algorithm. This came up because Azure's listed JWKS does not specify algorithms. Fixes CC-8211. Release note (bug fix): During JWT based auth, infer the algorithm type if it is not specified by the JWKS. This enbables support for a wider range of JWKSes. Release justification: low danger, usability fix. 90214: encoding: make DecodeFloatDescending return same NaN as ascending r=DrewKimball,msirek a=michae2 There are multiple representations of NaN in floating point. We were returning a different representation for descending indexes. There are some functions, such as `st_makepointm` that are sensitive to the difference, and this complicates testing. So let's always return the same bitwise representation for NaN. Fixes: #89961 Release note: None 90266: opt: fix normalization of comparisons with constants r=mgartner a=mgartner #### opt: fix normalization of comparisons with constants A prior commit in #88199 attempted to fix a bug in the `NormalizeCmpPlusConst`, `NormalizeCmpMinusConst`, and `NormalizeCmpConstMinus` rules by checking for overflow/underflow in the addition/subtraction of constants in a comparison expression. This was insufficient to completely fix the bug because the transformation is invalid if the non-normalized expression would have overflowed. Consider an expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME `NormalizeCmpPlusConst` would successively normalize it to this: t::TIME > '01:00'::TIME - '-11 hrs'::INTERVAL => t::TIME > '12:00'::TIME This expression is not semantically equivalent to the original expression. It yields different results when `t` is a value that would underflow when eleven hours is subtracted from it. For example, consider `t = '03:00'::TIME`: Original expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '03:00'::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '16:00'::TIME > '01:00'::TIME => true Normalized expression: t::TIME > '12:00'::TIME => '03:00'::TIME > '12:00'::TIME => false These normalization rules are only valid with types where overflow or underflow during addition and subtraction results in an error. This commit restricts these normalization rules to only operate on integers, floats, and decimals, which will error if there is underflow or overflow. Fixes #90053 Release note (bug fix): A bug has been fixed that caused incorrect evaluation of comparison expressions involving time and interval types, like `col::TIME + '10 hrs'::INTERVAL' > '01:00'::TIME`. #### opt: replace FoldBinaryCheckOverflow with FoldBinary Release note: None Co-authored-by: Kyle Patron <kyle@cockroachlabs.com> Co-authored-by: Michael Erickson <michae2@cockroachlabs.com> Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
A prior commit in cockroachdb#88199 attempted to fix a bug in the `NormalizeCmpPlusConst`, `NormalizeCmpMinusConst`, and `NormalizeCmpConstMinus` rules by checking for overflow/underflow in the addition/subtraction of constants in a comparison expression. This was insufficient to completely fix the bug because the transformation is invalid if the non-normalized expression would have overflowed. Consider an expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME `NormalizeCmpPlusConst` would successively normalize it to this: t::TIME > '01:00'::TIME - '-11 hrs'::INTERVAL => t::TIME > '12:00'::TIME This expression is not semantically equivalent to the original expression. It yields different results when `t` is a value that would underflow when eleven hours is subtracted from it. For example, consider `t = '03:00'::TIME`: Original expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '03:00'::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '16:00'::TIME > '01:00'::TIME => true Normalized expression: t::TIME > '12:00'::TIME => '03:00'::TIME > '12:00'::TIME => false These normalization rules are only valid with types where overflow or underflow during addition and subtraction results in an error. This commit restricts these normalization rules to only operate on integers, floats, and decimals, which will error if there is underflow or overflow. Fixes cockroachdb#90053 Release note (bug fix): A bug has been fixed that caused incorrect evaluation of comparison expressions involving time and interval types, like `col::TIME + '10 hrs'::INTERVAL' > '01:00'::TIME`.
A prior commit in cockroachdb#88199 attempted to fix a bug in the `NormalizeCmpPlusConst`, `NormalizeCmpMinusConst`, and `NormalizeCmpConstMinus` rules by checking for overflow/underflow in the addition/subtraction of constants in a comparison expression. This was insufficient to completely fix the bug because the transformation is invalid if the non-normalized expression would have overflowed. Consider an expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME `NormalizeCmpPlusConst` would successively normalize it to this: t::TIME > '01:00'::TIME - '-11 hrs'::INTERVAL => t::TIME > '12:00'::TIME This expression is not semantically equivalent to the original expression. It yields different results when `t` is a value that would underflow when eleven hours is subtracted from it. For example, consider `t = '03:00'::TIME`: Original expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '03:00'::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '16:00'::TIME > '01:00'::TIME => true Normalized expression: t::TIME > '12:00'::TIME => '03:00'::TIME > '12:00'::TIME => false These normalization rules are only valid with types where overflow or underflow during addition and subtraction results in an error. This commit restricts these normalization rules to only operate on integers, floats, and decimals, which will error if there is underflow or overflow. Fixes cockroachdb#90053 Release note (bug fix): A bug has been fixed that caused incorrect evaluation of comparison expressions involving time and interval types, like `col::TIME + '10 hrs'::INTERVAL' > '01:00'::TIME`.
A prior commit in cockroachdb#88199 attempted to fix a bug in the `NormalizeCmpPlusConst`, `NormalizeCmpMinusConst`, and `NormalizeCmpConstMinus` rules by checking for overflow/underflow in the addition/subtraction of constants in a comparison expression. This was insufficient to completely fix the bug because the transformation is invalid if the non-normalized expression would have overflowed. Consider an expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME `NormalizeCmpPlusConst` would successively normalize it to this: t::TIME > '01:00'::TIME - '-11 hrs'::INTERVAL => t::TIME > '12:00'::TIME This expression is not semantically equivalent to the original expression. It yields different results when `t` is a value that would underflow when eleven hours is subtracted from it. For example, consider `t = '03:00'::TIME`: Original expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '03:00'::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '16:00'::TIME > '01:00'::TIME => true Normalized expression: t::TIME > '12:00'::TIME => '03:00'::TIME > '12:00'::TIME => false These normalization rules are only valid with types where overflow or underflow during addition and subtraction results in an error. This commit restricts these normalization rules to only operate on integers, floats, and decimals, which will error if there is underflow or overflow. Fixes cockroachdb#90053 Release note (bug fix): A bug has been fixed that caused incorrect evaluation of comparison expressions involving time and interval types, like `col::TIME + '10 hrs'::INTERVAL' > '01:00'::TIME`.
A prior commit in #88199 attempted to fix a bug in the `NormalizeCmpPlusConst`, `NormalizeCmpMinusConst`, and `NormalizeCmpConstMinus` rules by checking for overflow/underflow in the addition/subtraction of constants in a comparison expression. This was insufficient to completely fix the bug because the transformation is invalid if the non-normalized expression would have overflowed. Consider an expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME `NormalizeCmpPlusConst` would successively normalize it to this: t::TIME > '01:00'::TIME - '-11 hrs'::INTERVAL => t::TIME > '12:00'::TIME This expression is not semantically equivalent to the original expression. It yields different results when `t` is a value that would underflow when eleven hours is subtracted from it. For example, consider `t = '03:00'::TIME`: Original expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '03:00'::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '16:00'::TIME > '01:00'::TIME => true Normalized expression: t::TIME > '12:00'::TIME => '03:00'::TIME > '12:00'::TIME => false These normalization rules are only valid with types where overflow or underflow during addition and subtraction results in an error. This commit restricts these normalization rules to only operate on integers, floats, and decimals, which will error if there is underflow or overflow. Fixes #90053 Release note (bug fix): A bug has been fixed that caused incorrect evaluation of comparison expressions involving time and interval types, like `col::TIME + '10 hrs'::INTERVAL' > '01:00'::TIME`.
A prior commit in cockroachdb#88199 attempted to fix a bug in the `NormalizeCmpPlusConst`, `NormalizeCmpMinusConst`, and `NormalizeCmpConstMinus` rules by checking for overflow/underflow in the addition/subtraction of constants in a comparison expression. This was insufficient to completely fix the bug because the transformation is invalid if the non-normalized expression would have overflowed. Consider an expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME `NormalizeCmpPlusConst` would successively normalize it to this: t::TIME > '01:00'::TIME - '-11 hrs'::INTERVAL => t::TIME > '12:00'::TIME This expression is not semantically equivalent to the original expression. It yields different results when `t` is a value that would underflow when eleven hours is subtracted from it. For example, consider `t = '03:00'::TIME`: Original expression: t::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '03:00'::TIME + '-11 hrs'::INTERVAL > '01:00'::TIME => '16:00'::TIME > '01:00'::TIME => true Normalized expression: t::TIME > '12:00'::TIME => '03:00'::TIME > '12:00'::TIME => false These normalization rules are only valid with types where overflow or underflow during addition and subtraction results in an error. This commit restricts these normalization rules to only operate on integers, floats, and decimals, which will error if there is underflow or overflow. Fixes cockroachdb#90053 Release note (bug fix): A bug has been fixed that caused incorrect evaluation of comparison expressions involving time and interval types, like `col::TIME + '10 hrs'::INTERVAL' > '01:00'::TIME`.
opt: prevent overflow when normalizing comparison with constants
Previously, the
NormalizeCmpPlusConst,NormalizeCmpMinusConst, andNormalizeCmpMinusConstwould normalize expressions in the formvar +/- const1 ? const2andconst1 - var ? const2, whereconst1and
const2were const constants and?is any comparison operator, toattempt to isolate the variable on one side of the comparison. Below are
some examples of the transformations these rules made:
However, these transformations were invalid when the newly created
addition or subtraction operators overflowed. In the case of an
expression involving TIME and INTERVAL types, this could cause incorrect
query results. For example:
The first expression and last expression have different semantic meaning
because the addition of the constant TIME and constant INTERVAL
overflowed.
This commit prevents the rules from transforming expressions if they
would cause an overflow/underflow. In order to detect overflow, there
are new restrictions that prevent these rules from firing in some cases.
Notably, the datum on the RHS of the newly constructed +/- operator must
be an integer, float, decimal, or interval. Also, the result type of the
newly constructed +/- operator must be equivalent to the LHS of the
operator. Both of these restrictions are required in order to detect
overflow/underflow.
Informs #88128
Release note (bug fix): A bug has been fixed that caused incorrect
evaluation of expressions in the form
col +/- const1 ? const2, whereconst1andconst2are constant values and?is any comparisonoperator. The bug was caused by operator overflow when the optimizer
attempted to simplify these expressions to have a single constant value.
normalize: remove invalid normalizations
It was recently discovered that normalization rules in the optimizer
were invalid (see #88128). These rules were adapted from the heuristic
optimizer several years ago, and they still remain in the
normalizepackage even though the heuristic optimizer no longer exists. The
normalizepackage is still used to normalize expressions duringbackfilling, so the invalid rules can cause incorrect computed column
values and corrupt indexes. This commit removes these rules entirely.
Informs #88128
Release note: None