norm: do not fold floor division with unit denominator#87808
norm: do not fold floor division with unit denominator#87808craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
DrewKimball
left a comment
There was a problem hiding this comment.
Nice fix!
Could we make an exception for the case when the left argument is an integer? Since there wouldn't be any precision issues in that case. Also, can you add a test for // with both inputs constant to the FoldBinary tests to make sure it's still folded?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner)
a4f5062 to
6e24247
Compare
msirek
left a comment
There was a problem hiding this comment.
Done
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner)
mgartner
left a comment
There was a problem hiding this comment.
Reviewed 1 of 3 files at r1, 4 of 4 files at r2, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @msirek)
pkg/sql/opt/norm/general_funcs.go line 69 at r2 (raw file):
// IsIntFamily returns true if the given scalar expression is of one of the // integer types. func (c *CustomFuncs) IsIntFamily(scalar opt.ScalarExpr) bool {
nit: I'd just name this IsInt to match the other functions here
pkg/sql/opt/norm/rules/numeric.opt line 51 at r2 (raw file):
(Cast $left (BinaryType (OpName) $left $right)) # FoldFloorDivOne folds $left / 1 for integer types.
nit: / => //
pkg/sql/logictest/testdata/logic_test/operator line 52 at r2 (raw file):
statement ok CREATE TABLE table2 (col2 FLOAT8 NULL)
nit: it's common to name tables for regression tests after issue numbers, like t87605, which ensures they'll never conflict with other tables in the file.
pkg/sql/logictest/testdata/logic_test/operator line 63 at r2 (raw file):
---- 1.234567e+06 1.2345678901234e+13
I think pkg/sql/logictest/testdata/logic_test/float would be a better place for this test.
|
I think we should probably backport this to 21.2, 22.1, and 22.2.1 after a baking period. I've added the labels. |
6e24247 to
51f987a
Compare
msirek
left a comment
There was a problem hiding this comment.
OK
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
pkg/sql/opt/norm/general_funcs.go line 69 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: I'd just name this
IsIntto match the other functions here
Done
pkg/sql/opt/norm/rules/numeric.opt line 51 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit:
/=>//
Done
pkg/sql/logictest/testdata/logic_test/operator line 52 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: it's common to name tables for regression tests after issue numbers, like
t87605, which ensures they'll never conflict with other tables in the file.
renamed
pkg/sql/logictest/testdata/logic_test/operator line 63 at r2 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I think
pkg/sql/logictest/testdata/logic_test/floatwould be a better place for this test.
moved
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewed 1 of 4 files at r2, 5 of 5 files at r3, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
mgartner
left a comment
There was a problem hiding this comment.
I think we should probably backport this to 21.2, 22.1, and 22.2.1 after a baking period. I've added the labels.
Does this fix all known occurences of #86790? If so, I think it's safe to backport immediately to silence the nightly failures.
Also, I can revert #87840 if you think it's safe to do so now.
Reviewed 5 of 5 files at r3, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
The floor division operator `//` is like division but drops the fractional portion of the result. Normalization rule FoldDivOne rewrites `(column // 1)` as `(column)` which is incorrect if the data in `column` has fractional digits, as those digits should be dropped. The solution is to only fold `(column // 1)` when `column` is one of the integer types. Release note (bug fix): This patch fixes incorrect results from the floor division operator, `//`, when the numerator is non-constant and the denominator is the constant 1.
51f987a to
9c01a38
Compare
msirek
left a comment
There was a problem hiding this comment.
This doesn't fix #86790. To do so we'd have to extend this fix to cover not just FloorDiv, but Div too.
If we're doing just a CAST, the exponent is set to zero:
cockroach/pkg/sql/sem/eval/cast.go
Lines 324 to 325 in 655ae28
If the division is done, we set the exponent to -1 here:
https://github.com/cockroachdb/apd/blob/e2030eb9d16dcd55cc79147ae1a94be1f3c2037e/context.go#L364
I think we could fix it by making sure the exponent is 0 when the remainder is 0 (by setting adjCoeffs equal to adjExp10).
This would impact many things outside just this case.
It is probably safer just to extend this current fix to also cover Div (only fold the Div if the right side is an integer).
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
msirek
left a comment
There was a problem hiding this comment.
Actually, skipping folding of Div if the right side is an integer would not be sufficient. We'd have to have a special CAST operation to decimal which always uses a scale of 1, no matter the final precision.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner)
msirek
left a comment
There was a problem hiding this comment.
I added a benchmark for dividing int64 by constant int64(1). It is indeed slower than casting, so I guess we might not want to get rid of FoldDivOne, though I don't know how important it is to optimize divide by 1. I guess through constant folding we could have a constant expression which evaluates to 1, so maybe it's useful.
BenchmarkCastOp/useSel=true/hasNulls=true/int_to_decimal-32 123115 9526 ns/op 860.00 MB/s
BenchmarkCastOp/useSel=true/hasNulls=false/int_to_decimal-32 106650 10174 ns/op 805.15 MB/s
BenchmarkCastOp/useSel=false/hasNulls=true/int_to_decimal-32 67879 16494 ns/op 496.68 MB/s
BenchmarkCastOp/useSel=false/hasNulls=false/int_to_decimal-32 70771 16625 ns/op 492.75 MB/s
BenchmarkProjDivInt64ConstInt64Op/useSel=true/hasNulls=true/type=int-32 10000 101996 ns/op 80.32 MB/s
BenchmarkProjDivInt64ConstInt64Op/useSel=true/hasNulls=false/type=int-32 11337 109312 ns/op 74.94 MB/s
BenchmarkProjDivInt64ConstInt64Op/useSel=false/hasNulls=true/type=int-32 5575 189898 ns/op 43.14 MB/s
BenchmarkProjDivInt64ConstInt64Op/useSel=false/hasNulls=false/type=int-32 5844 202576 ns/op 40.44 MB/s
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball and @mgartner)
571616f to
5d1b5f5
Compare
Yeah, it might seem silly that someone would write something like this, but besides the case you mentioned of constant-folding, it can also get generated by tools like ORMs that users don't really have much control over. |
Yeah, I wasn't thinking of hand-written queries having this. Definitely it's from either ORMs or other app-generated queries which are not smart enough to write better SQL (unless they're doing it for some other reason like formatting purposes). I was just hoping for an easy fix. If it's too tough for a quick fix, maybe I won't try to handle it in this PR. |
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner, @msirek, and @yuzefovich)
pkg/sql/colexec/colexecbase/project_div_test.go line 31 at r4 (raw file):
) func BenchmarkProjDivInt64ConstInt64Op(b *testing.B) {
I think colexecproj.BenchmarkProjOp already does this, apart from using Decimal as the input types. If the type is important, it's probably better to just modify it instead of adding a new benchmark.
5d1b5f5 to
9c01a38
Compare
msirek
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @DrewKimball, @mgartner, and @yuzefovich)
pkg/sql/colexec/colexecbase/project_div_test.go line 31 at r4 (raw file):
Previously, DrewKimball (Drew Kimball) wrote…
I think
colexecproj.BenchmarkProjOpalready does this, apart from usingDecimalas the input types. If the type is important, it's probably better to just modify it instead of adding a new benchmark.
Sure. I found this command seems to test the same thing:
./dev bench pkg/sql/colexec/colexecproj -f BenchmarkProjOp/projDivInt64Int64Const --cpus 1 --count 5 --verbose --ignore-cache
I've removed the second commit and will merge the original version soon.
The FoldDivOne issue is handled through #88485.
DrewKimball
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball, @mgartner, and @yuzefovich)
msirek
left a comment
There was a problem hiding this comment.
TFTRs!
bors r+
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mgartner)
|
Build succeeded: |
|
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 9c01a38 to blathers/backport-release-21.2-87808: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 21.2.x failed. See errors above. error creating merge commit from 9c01a38 to blathers/backport-release-22.1-87808: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.1.x failed. See errors above. error creating merge commit from 9c01a38 to blathers/backport-release-22.2-87808: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 22.2.x failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
|
@msirek can you please open these backports manually since Blathers failed? Thanks! |
msirek
left a comment
There was a problem hiding this comment.
Done
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale)
Fixes #87605
The floor division operator
//is like division but drops thefractional portion of the result. Normalization rule FoldDivOne
rewrites
(column // 1)as(column)which is incorrect if thedata in
columnhas fractional digits, as those digits should bedropped.
The solution is to only fold
(column // 1)whencolumnis one ofthe integer types.
Release note (bug fix): This patch fixes incorrect results from
the floor division operator,
//, when the numerator is non-constantand the denominator is the constant 1.