opt: fix wrong results due to sorting 1st join of paired join#89840
opt: fix wrong results due to sorting 1st join of paired join#89840msirek wants to merge 2 commits intocockroachdb:masterfrom
Conversation
Fixes cockroachdb#89603 This fixes an issue where an illegal paired join plan is created by the optimizer where the first join in the pair is sorted. This breaks the assumption required in paired joiner logic that the first join in the pair is never sorted or distributed for proper interpretation of the continuation column: ``` // ContinuationCol is the column ID of the continuation column when // IsFirstJoinInPairedJoiner is true. The continuation column is a // boolean column that indicates whether an output row is a // continuation of a group corresponding to a single left input row. ContinuationCol opt.ColumnID ``` Function `lookupJoinCanProvideOrdering` allows the required ordering to be passed to the child of the first join in a paired join if only the input columns are projected (no lookup columns). This is not sufficient because only when the first join can further pass the required ordering on to its input can a sort be avoided. The solution is to modify `lookupJoinCanProvideOrdering` to indicate the second join in a paired join can provide an ordering if both the first join and the first join's child can provide the ordering. Release note (bug fix): This patch fixes possible, but rare, incorrect results from paired lookup joins where the first join in the pair is sorted.
fa0972b to
6b57200
Compare
msirek
left a comment
There was a problem hiding this comment.
OK, that should be my last push. Just making sure the sanity check works.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @mgartner)
This commit adds an assertion that a sort on the first join of a paired join is never enforced. Fixes cockroachdb#89603 Release note: None
DrewKimball
left a comment
There was a problem hiding this comment.
I'm not convinced this will actually solve the problem (or at least that the test will verify it does). We should probably use the memo format for the test. I'm suspicious of #84689 - I think maybe we need to add a !lookupJoin.IsSecondJoinInPairedJoiner check to the branch I added.
Reviewed 2 of 2 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @msirek, and @sumeerbhola)
pkg/sql/opt/xform/optimizer.go line 589 at r2 (raw file):
// Was a required physical property applied which is unenforcible on the group // member and might do something bad like require an illegal sort? o.assertEnforcibleProps(member, required)
Seems like the overhead for this could be non-negligible, should it be a test-only check?
pkg/sql/opt/xform/testdata/rules/join line 12944 at r1 (raw file):
# The sort must be done as the last step instead of in between the first and # second joins of the paired join. opt set=testing_optimizer_random_seed=4057832385546395198 set=testing_optimizer_cost_perturbation=1.0
[nit] could probably use a less verbose format here.
msirek
left a comment
There was a problem hiding this comment.
I tried reverting #84689, and the problem is still there, so I don't think it's related. Ideally there would be a way to not even optimize an enforcer. Right now we're just sort of relying on the sort operation to not get chosen because the input can provide the ordering. But as long as we're adding the enforcer to the memo, it might get chosen if costs are low. I was trying some other fixes to avoid optimizing the enforcer at all for the first join in the pair, but that seemed like a risky change.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @sumeerbhola)
|
How does the Also, this has me thinking we should add a |
|
What are the options for preventing this? Maybe it's too hacky, but one thing we could try would be to cost a sort on top of the first join in a paired join with |
msirek
left a comment
There was a problem hiding this comment.
It doesn't prevent it entirely I think. It just makes selection of the optimized expression without sort much more likely due to costs. Like I said above, I was trying a fix to avoid optimizing the SortExpr entirely, but it's risky. Maybe we could do something like what's done in Builder.buildDistribute where a NoOpDistribution is removed entirely. If we have a No-Op sort, would it be safe to remove it, I wonder? Are there any query plans created that rely on a sort taking place, and removing the sort could cause problems?
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @sumeerbhola)
msirek
left a comment
There was a problem hiding this comment.
I was considering the hugeCost solution, but we'd have to make sure that kicks in even when fuzzing costs and that fuzzed costs of other expressions are never greater than hugeCost. Yeah, I'm not sure if I like that or not, since there's still an illegal group member in the memo. I will take a look at the #84689 code too and see if we should try to play it safe with paired joins. Even if we disable some optimizations, we could always re-enable them later after more testing.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @sumeerbhola)
|
I think I'm leaning toward unifying paired lookup joins into one expression during optimization - that seems like it should solve problems of this nature for certain |
msirek
left a comment
There was a problem hiding this comment.
OK, I might let this ruminate a bit; looking at something else this afternoon.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @sumeerbhola)
|
Closing in favor of #92632 |
Fixes #89603
This fixes an issue where an illegal paired join plan is created by the
optimizer where the first join in the pair is sorted. This breaks the
assumption required in paired joiner logic that the first join in the
pair is never sorted or distributed for proper interpretation of the
continuation column:
Function
lookupJoinCanProvideOrderingallows the required ordering tobe passed to the child of the first join in a paired join if only the
input columns are projected (no lookup columns). This is not sufficient
because only when the first join can further pass the required ordering
on to its input can a sort be avoided.
The solution is to modify
lookupJoinCanProvideOrderingto indicate thesecond join in a paired join can provide an ordering if both the first
join and the first join's child can provide the ordering.
Release note (bug fix): This patch fixes possible, but rare, incorrect
results from paired lookup joins where the first join in the pair is
sorted.
xform: assertion for unenforcible props
This commit adds an assertion that a sort on the first join of a paired
join is never enforced.
Fixes #89603
Release note: None