opt: use transitive equalities to infer filters#43194
opt: use transitive equalities to infer filters#43194craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
cc @jordanlewis |
|
This is super cool! Do we have some benchmarks on perf improvements? Does this introduce any additional overhead that we should be concerned about? |
RaduBerinde
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @justinj, @RaduBerinde, and @rytaft)
pkg/sql/opt/norm/join.go, line 133 at r1 (raw file):
// equality conditions in filters that connect the two groups. If so, // canMapJoinOpEquivalenceGroup returns true. func (c *CustomFuncs) canMapJoinOpEquivalenceGroup(
[nit] consider expanding the comment to mention equivFD
pkg/sql/opt/norm/testdata/rules/join, line 1123 at r1 (raw file):
└── variable: k [type=int] exec-ddl
Would it be possible to come up with a simpler testcase, similar to the other ones?
jordanlewis
left a comment
There was a problem hiding this comment.
Awesome, thanks for turning this around in record time!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @justinj and @rytaft)
Previously, the optimizer was not inferring all possible filters based on equality conditions. For example, consider a query such as: SELECT * FROM t JOIN u ON t.a = u.a JOIN v ON u.a = v.a JOIN w ON v.a = w.a WHERE t.a = 5; Ideally, the optimizer should infer additional filter conditions u.a = 5 AND v.a = 5 AND w.a = 5 based on the equality conditions. However, this was not occurring because the optimizer only considered equality conditions that were at the same level in the query tree as the filter to be mapped. Now the optimizer also considers equality conditions that are further down in the tree, so that additional filters can be inferred based on transitive equalities. Informs cockroachdb#43039 Release note (performance improvement): The optimizer can now infer additional filter conditions in some cases based on transitive equalities between columns.
64f0a90 to
0daebf1
Compare
rytaft
left a comment
There was a problem hiding this comment.
TFTR!
Do we have some benchmarks on perf improvements?
I'm running the new benchdiff tool now, but I'd be very surprised if this causes any significant changes. If anything it should improve planning performance slightly since I changed the code to avoid some redundant computation, but I doubt it will change many plans. In general we were already doing a pretty good job of inferring filters (note that no other test cases changed as a result of this fix).
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @justinj)
pkg/sql/opt/norm/join.go, line 133 at r1 (raw file):
Previously, RaduBerinde wrote…
[nit] consider expanding the comment to mention
equivFD
Done.
pkg/sql/opt/norm/testdata/rules/join, line 1123 at r1 (raw file):
Previously, RaduBerinde wrote…
Would it be possible to come up with a simpler testcase, similar to the other ones?
Done.
andy-kimball
left a comment
There was a problem hiding this comment.
Good to make it complete, since it can be frustrating to change a query (like adding a join) and then have performance inexplicably fall off a cliff.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @justinj)
|
Here are the results -- looks pretty good! (only showing tests with a pos or neg delta -- most of them didn't change): |
|
bors r+ |
Build failed (retrying...) |
Build failed (retrying...) |
RaduBerinde
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @justinj)
Build failed |
|
Those tests are pretty stellar. What do these tests do: Does KV need to signoff on any of this? |
|
No. "KV" refers to the type of workload (simple key-value table). Those tests show very large variations (e.g. ±18%), I wouldn't count on those differences meaning something. In fact, none of those tests are affected by this change (there are no joins). On a more general note, I don't think it's useful to arbitrarily discuss benchmarks on PRs. We (meaning the PR author as well as the teammates reviewing it) have been using our best judgement on when we should look at benchmarks (based on our knowledge of the code) and I think we should continue to do that. One can reasonably ask about benchmarks on almost any given PR, and yet I don't think it's reasonable to spend time on them on every PR. |
|
Yea, I'm not too worried about the tests that got slightly worse due to the large variation. The allocation benchmarks show much less variation than the time benchmarks, and those are uniformly good. (And as Radu mentioned, I'm pretty confident that none of my code changes add much overhead.) bors r+ |
43194: opt: use transitive equalities to infer filters r=rytaft a=rytaft Previously, the optimizer was not inferring all possible filters based on equality conditions. For example, consider a query such as: ``` SELECT * FROM t JOIN u ON t.a = u.a JOIN v ON u.a = v.a JOIN w ON v.a = w.a WHERE t.a = 5; ``` Ideally, the optimizer should infer additional filter conditions `u.a = 5 AND v.a = 5 AND w.a = 5` based on the equality conditions. However, this was not occurring because the optimizer only considered equality conditions that were at the same level in the query tree as the filter to be mapped. Now the optimizer also considers equality conditions that are further down in the tree, so that additional filters can be inferred based on transitive equalities. Informs #43039 Release note (performance improvement): The optimizer can now infer additional filter conditions in some cases based on transitive equalities between columns. 43218: backupccl: stop swallowing error in restoreResumer.OnFailOrCancel() r=ajwerner a=ajwerner In `restoreResumer.OnFailOrCancel()` we'd return nil when encountering a non-nil error in a call to `sqlbase.RemovePublicTableNamespaceEntry()`. Error handling in `jobs.Resumer.OnFailOrCancel()` is a tricky proposition but this behavior is certainly wrong. Release note: None Co-authored-by: Rebecca Taft <becca@cockroachlabs.com> Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
|
Awesome--excited about this change |
Build succeeded |
Previously, the optimizer was not inferring all possible filters
based on equality conditions. For example, consider a query such
as:
Ideally, the optimizer should infer additional filter conditions
u.a = 5 AND v.a = 5 AND w.a = 5based on the equality conditions.However, this was not occurring because the optimizer only considered
equality conditions that were at the same level in the query tree as the
filter to be mapped. Now the optimizer also considers equality conditions
that are further down in the tree, so that additional filters can
be inferred based on transitive equalities.
Informs #43039
Release note (performance improvement): The optimizer can now infer
additional filter conditions in some cases based on transitive equalities
between columns.