opt: refactor and test lookup join key column and expr generation#79911
opt: refactor and test lookup join key column and expr generation#79911craig[bot] merged 8 commits intocockroachdb:masterfrom
Conversation
c9f03c8 to
bb70471
Compare
msirek
left a comment
There was a problem hiding this comment.
Excellent refactoring and new unit tests!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
pkg/sql/opt/lookupjoin/constraint_builder.go, line 80 at r6 (raw file):
// return values: // // keyCols - A list of columns to be used as lookup join key columns.
nit: maybe indicate these are left table columns and that this is an ordered list, with columns corresponding the those in rightSideCols.
pkg/sql/opt/lookupjoin/constraint_builder_test.go, line 53 at r6 (raw file):
// Information about the left columns. // // - index=(<column> [asc|desc], ...)
Since this supports [asc|desc] ordering, maybe a test with a range condition on a descending column would be useful.
rytaft
left a comment
There was a problem hiding this comment.
Very nice!
Reviewed 1 of 1 files at r1, 5 of 5 files at r2, 1 of 1 files at r3, 8 of 8 files at r4, 3 of 3 files at r5, 5 of 5 files at r6, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
pkg/sql/opt/lookupjoin/constraint_builder.go, line 85 at r4 (raw file):
// both. If both keyCols and lookupExpr are nil, then the index cannot be used // for a lookup join. func (b *ConstraintBuilder) Build(
Any chance you can split this into two commits that will be squashed before merging? One that literally just moves code with no changes from the original lookup join code, and one that makes the remaining changes?
pkg/sql/opt/lookupjoin/constraint_builder.go, line 92 at r4 (raw file):
inputProjections memo.ProjectionsExpr, constFilters memo.FiltersExpr, rightSideCols opt.ColList,
Do you think it's worth packaging this stuff into a new struct, lookupjoin.Constraint, or something like that? You could also save that for a future PR...
pkg/sql/opt/xform/join_funcs.go, line 374 at r4 (raw file):
keyCols, lookupExpr, inputProjections, constFilters, rightSideCols := cb.Build(index, onFilters, optionalFilters) if len(keyCols) == 0 && len(lookupExpr) == 0 {
If you make a new struct for lookupjoin.Constraint, this check could become a method on that struct.
mgartner
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
pkg/sql/opt/lookupjoin/constraint_builder.go, line 85 at r4 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Any chance you can split this into two commits that will be squashed before merging? One that literally just moves code with no changes from the original lookup join code, and one that makes the remaining changes?
I really tried to make no changes to the logic here. The logic inside Build is the same as the logic removed in generateLookupJoinsImpl, except for referencing the ConstraintBuilder fields rather than CustomFuncs fields and some additional comments. There was an intertwined section of code that created a memo.LookupJoin that I did not move here. All the helper functions are the same, except for the receiver type. I'm not really sure how to move the logic in a way that makes few changes, but I know we've been bitten by moving code in the past. Did you have a specific idea in mind? I suppose I could try doing the work in these N commits:
- Move the core key column and lookupExpr logic into a function rather than a struct method into a separate file in the xform package. Move the helper function to the separate file too.
- Add the ConstraintBuilder struct, adding Init, Build, and turning the helpers into methods of that struct.
- Move the ConstraintBuilder to a separate package.
pkg/sql/opt/lookupjoin/constraint_builder.go, line 92 at r4 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Do you think it's worth packaging this stuff into a new struct,
lookupjoin.Constraint, or something like that? You could also save that for a future PR...
Yes, there's lots of refactoring I want to do here. The first step was to extract it before changing too much.
pkg/sql/opt/xform/join_funcs.go, line 374 at r4 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
If you make a new struct for
lookupjoin.Constraint, this check could become a method on that struct.
Good idea. I'll explore that in a future PR.
rytaft
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @mgartner)
pkg/sql/opt/lookupjoin/constraint_builder.go line 85 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I really tried to make no changes to the logic here. The logic inside
Buildis the same as the logic removed ingenerateLookupJoinsImpl, except for referencing the ConstraintBuilder fields rather than CustomFuncs fields and some additional comments. There was an intertwined section of code that created a memo.LookupJoin that I did not move here. All the helper functions are the same, except for the receiver type. I'm not really sure how to move the logic in a way that makes few changes, but I know we've been bitten by moving code in the past. Did you have a specific idea in mind? I suppose I could try doing the work in these N commits:
- Move the core key column and lookupExpr logic into a function rather than a struct method into a separate file in the xform package. Move the helper function to the separate file too.
- Add the ConstraintBuilder struct, adding Init, Build, and turning the helpers into methods of that struct.
- Move the ConstraintBuilder to a separate package.
I'm suggesting that one of the commits should be literally just copy-paste with no changes, so it will not even compile, and is only for the purpose of making reviewing easier -- it will need to be squashed with the following commit before merging.
I think this is only needed for cases like this where there are large chunks of code moving, and as a reviewer, you want to be able to see the small changes that were made.
Does that make sense?
…nEquality Previously, `CustomFuncs.findComputedColJoinEquality` used `CustomFuncs.OuterCols` to retrieve the outer columns of computed column expressions. `CustomFuncs.OuterCols` returns the cached outer columns in the expression if it is a `memo.ScalarPropsExpr`, and falls back to calculating the outer columns with `memo.BuildSharedProps` otherwise. Computed column expressions are never `memo.ScalarPropsExpr`s, so we use just use `memo.BuildSharedProps` directly. Release note: None
Release note: None
This commit makes a minor change to `generateLookupJoinsImpl`. Previously, equality filters were extracted from the original `ON` filters. Now they are extracted from filters that have been reduced by partial index implication. This has no effect on behavior because equality filters that reference columns in two tables cannot exist in partial index predicates, so they will never be eliminated during partial index implication. Release note: None
bb70471 to
f278dd3
Compare
This commit will be squashed before merging. It copies code verbatim from join_funcs.go to lookupjoin/constraint_builder.go to make it easy to review the changes made when moving code in the following commit.
f278dd3 to
3a54649
Compare
This commit adds a new `lookupjoin` package. Logic for determining the key columns and lookup expressions for lookup joins has been moved to `lookupJoin.ConstraintBuilder`. The code was moved with as few changes as possible, and the behavior does not change in any way. This move will make it easier to test this code in isolation in the future, and allow for further refactoring. Release note: None
This commit makes the lookupjoin.ConstraintBuilder API more general to make unit testing easier in a future commit. Release note: None
Release note: None
3a54649 to
ccaa88a
Compare
mgartner
left a comment
There was a problem hiding this comment.
Added a new lookupjoin.Constraint struct in the last commit.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @msirek and @rytaft)
pkg/sql/opt/lookupjoin/constraint_builder.go line 85 at r4 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I'm suggesting that one of the commits should be literally just copy-paste with no changes, so it will not even compile, and is only for the purpose of making reviewing easier -- it will need to be squashed with the following commit before merging.
I think this is only needed for cases like this where there are large chunks of code moving, and as a reviewer, you want to be able to see the small changes that were made.
Does that make sense?
I see. I wasn't thinking of making a commit that doesn't compile. The diff between the SQUASH BEFORE MERGING commit and the opt: moves some lookup join generation logic to lookup join package commit should be an easier read now.
pkg/sql/opt/lookupjoin/constraint_builder.go line 80 at r6 (raw file):
Previously, msirek (Mark Sirek) wrote…
nit: maybe indicate these are left table columns and that this is an ordered list, with columns corresponding the those in rightSideCols.
Done.
pkg/sql/opt/lookupjoin/constraint_builder_test.go line 53 at r6 (raw file):
Previously, msirek (Mark Sirek) wrote…
Since this supports
[asc|desc]ordering, maybe a test with a range condition on a descending column would be useful.
I don't think it really matters for this lookup constraint code but I added one test for thoroughness.
ccaa88a to
82c05bd
Compare
cucaroach
left a comment
There was a problem hiding this comment.
Reviewed 16 of 16 files at r7, 5 of 5 files at r8, 1 of 1 files at r9, 2 of 2 files at r10, 9 of 9 files at r11, 3 of 3 files at r12, 6 of 6 files at r13, 1 of 3 files at r14.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach, @msirek, and @rytaft)
|
Friendly ping @rytaft. If it's still too hard to review or too risky of a move, let me know and I'll take another shot at it. |
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 16 of 16 files at r7, 4 of 5 files at r8, 9 of 9 files at r11, 3 of 3 files at r12, 6 of 6 files at r13, 3 of 3 files at r14, 1 of 1 files at r15, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @msirek)
pkg/sql/opt/lookupjoin/constraint_builder.go line 85 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I see. I wasn't thinking of making a commit that doesn't compile. The diff between the
SQUASH BEFORE MERGINGcommit and theopt: moves some lookup join generation logic to lookup join packagecommit should be an easier read now.
Thanks!
pkg/sql/opt/lookupjoin/constraint_builder.go line 41 at r15 (raw file):
// KeyCols is an ordered list of columns from the left side of the join to // be used as lookup join key columns. This list corresponds to the columns // in RightSideCols. It will be nil if LookupExpr is nil.
nit: if LookupExpr is non-nil
The `lookupjoin.Constraint` struct has been added to encapsulate multiple data structures that represent a strategy for constraining a lookup join. Release note: None
82c05bd to
6f3230e
Compare
mgartner
left a comment
There was a problem hiding this comment.
TFTRs!
bors r+
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @cucaroach, @msirek, and @rytaft)
pkg/sql/opt/lookupjoin/constraint_builder.go line 41 at r15 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: if LookupExpr is non-nil
Donne.
|
Build succeeded: |
opt: simplify fetching outer column in CustomFuncs.findComputedColJoinEquality
Previously,
CustomFuncs.findComputedColJoinEqualityusedCustomFuncs.OuterColsto retrieve the outer columns of computed columnexpressions.
CustomFuncs.OuterColsreturns the cached outer columns inthe expression if it is a
memo.ScalarPropsExpr, and falls back tocalculating the outer columns with
memo.BuildSharedPropsotherwise.Computed column expressions are never
memo.ScalarPropsExprs, so we usejust use
memo.BuildSharedPropsdirectly.Release note: None
opt: make RemapCols a method on Factory instead of CustomFuncs
Release note: None
opt: use partial-index-reduced filters when building lookup expressions
This commit makes a minor change to
generateLookupJoinsImpl.Previously, equality filters were extracted from the original
ONfilters. Now they are extracted from filters that have been reduced by
partial index implication. This has no effect on behavior because
equality filters that reference columns in two tables cannot exist in
partial index predicates, so they will never be eliminated during
partial index implication.
Release note: None
opt: moves some lookup join generation logic to lookup join package
This commit adds a new
lookupjoinpackage. Logic for determining thekey columns and lookup expressions for lookup joins has been moved to
lookupJoin.ConstraintBuilder. The code was moved with as few changesas possible, and the behavior does not change in any way. This move will
make it easier to test this code in isolation in the future, and allow
for further refactoring.
Release note: None
opt: generalize lookupjoin.ConstraintBuilder API
This commit makes the lookupjoin.ConstraintBuilder API more general to
make unit testing easier in a future commit.
Release note: None
opt: add data-driven tests for lookupjoin.ConstraintBuilder
Release note: None
opt: add lookupjoin.Constraint struct
The
lookupjoin.Constraintstruct has been added to encapsulatemultiple data structures that represent a strategy for constraining a
lookup join.
Release note: None