opt: zigzag scan hints#67737
Conversation
511a771 to
d3bf040
Compare
b55b262 to
b43877b
Compare
rytaft
left a comment
There was a problem hiding this comment.
Nice! The main feedback is that we actually want this "hint" to be a requirement. I guess the name "hint" is a bit misleading, but all the other index and join hints we support work this way: If a particular plan is hinted, we must use that plan or return an error if it's not possible.
Note that this is a breaking change, if customer has an index called zigzag ...
One option could be to change the hint to FORCE_ZIGZAG, which seems less likely to be used as a name
Reviewed 7 of 19 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @RaduBerinde)
pkg/sql/opt/xform/coster.go, line 1063 at r1 (raw file):
cost += filterSetup if join.HintProvided {
This is a different kind of hint, and not what we'll want to use here. I think you'll actually want to update this section, to return hugeCost if we plan a scan when a zigzag join was hinted:
cockroach/pkg/sql/opt/xform/coster.go
Lines 606 to 610 in 148a9ec
Then you'll also want to return hugeCost from computeZigzagJoinCost if the hinted indexes are not used.
You'll probably also need to update the code in execbuilder to return an error if the best plan used a scan when a zigzag join was forced.
Basically, if a user uses the ZIGZAG join flag, we want to force that plan, and return an error if it's not possible to plan the zig zag join.
pkg/sql/sem/tree/select.go, line 291 at r1 (raw file):
IgnoreUniqueWithoutIndexKeys bool // HintZigzag enables costing hints to prefer a zigzag join. HintZigzag bool
I think we want this to be ForceZigzag
f5e3c51 to
263eed9
Compare
|
RFAL! I went a little sideways to try to prevent increasing memo sizes all over the place, let me know if its too weird. I think I got the semantics @rytaft was after working but not sure if there's any other ways a scan can be planned that I need to error from if zigzag was forced. |
RaduBerinde
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @cucaroach and @rytaft)
pkg/sql/opt/memo/expr.go, line 358 at r2 (raw file):
Direction tree.Direction // index is a uint32 to allow optimal packing of this structure, use Index()
You can consider making flags a pointer in ScanPrivate instead. You can have a singleton instance with zero flags that is used when there are no flags (the common case). The same flags structure can be shared between multiple Scans (eg in
pkg/sql/opt/memo/expr.go, line 370 at r2 (raw file):
// For cases 2 and 3 if we can't generate a suitable plan an error will // result. ZigzagIndices *util.FastIntSet
Consider having a ForceZigzag flag here as well instead of the pointer (if it's next to the other bools, it shouldn't increase the struct size).
pkg/sql/sem/tree/select.go, line 293 at r2 (raw file):
// There are three possibilities: // 1) nil means no hint, business as usual // 2) empty slice means force a Zigzag join but any index will do
Differentiating between nil and empty slice is pretty error prone. I'd consider adding a ForceZigZag bool flag.
cucaroach
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)
pkg/sql/opt/memo/expr.go, line 358 at r2 (raw file):
Previously, RaduBerinde wrote…
You can consider making
flagsa pointer inScanPrivateinstead. You can have a singleton instance with zero flags that is used when there are no flags (the common case). The same flags structure can be shared between multiple Scans (eg in
Great idea, that will affect the memo memory usage sizes in a good way no doubt.
pkg/sql/opt/memo/expr.go, line 370 at r2 (raw file):
Previously, RaduBerinde wrote…
Consider having a ForceZigzag flag here as well instead of the pointer (if it's next to the other bools, it shouldn't increase the struct size).
👍
pkg/sql/opt/xform/coster.go, line 1063 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
This is a different kind of hint, and not what we'll want to use here. I think you'll actually want to update this section, to return
hugeCostif we plan a scan when a zigzag join was hinted:.cockroach/pkg/sql/opt/xform/coster.go
Lines 606 to 610 in 148a9ec
Then you'll also want to return
hugeCostfromcomputeZigzagJoinCostif the hinted indexes are not used.You'll probably also need to update the code in
execbuilderto return an error if the best plan used a scan when a zigzag join was forced.Basically, if a user uses the ZIGZAG join flag, we want to force that plan, and return an error if it's not possible to plan the zig zag join.
Done.
pkg/sql/sem/tree/select.go, line 291 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I think we want this to be
ForceZigzag
Done.
pkg/sql/sem/tree/select.go, line 293 at r2 (raw file):
Previously, RaduBerinde wrote…
Differentiating between nil and empty slice is pretty error prone. I'd consider adding a
ForceZigZag boolflag.
👍 It felt dirty
rytaft
left a comment
There was a problem hiding this comment.
Nice! once you address the feedback from @RaduBerinde
Reviewed 2 of 19 files at r1, 21 of 21 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @cucaroach and @RaduBerinde)
pkg/sql/opt/xform/select_funcs.go, line 1087 at r2 (raw file):
indices := util.MakeFastIntSet(leftIndex.Ordinal(), rightIndex.Ordinal()) forceIndices := scanPrivate.Flags.ZigzagIndices if !forceIndices.Intersection(indices).Equals(*forceIndices) {
Isn't this equivalent to if !forceIndices.SubsetOf(indices)?
11ae424 to
9d0e85c
Compare
fafb163 to
42c44fe
Compare
cucaroach
left a comment
There was a problem hiding this comment.
RFAL, I kiboshed any attempts at memory savings, if we feel strongly about that we can address with a separate PR. I also added some sql parser tests and dealt with inverted zigzag joins.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @RaduBerinde, and @rytaft)
pkg/sql/opt/xform/testdata/rules/join, line 8643 at r4 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: these tests should go in xform/testdata/rules/select with the other zigzag join tests
Done
mgartner
left a comment
There was a problem hiding this comment.
Reviewed 28 of 28 files at r5.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach, @mgartner, @RaduBerinde, and @rytaft)
pkg/sql/logictest/testdata/logic_test/zigzag_join, line 203 at r5 (raw file):
statement error syntax error SELECT * FROM a@{FORCE_ZIGZAG=foo,bar} WHERE a = 3 AND b = 7;
nit: these syntax tests probably aren't necessary here if you have these cases in pkg/sql/parser/testdata tests.
pkg/sql/logictest/testdata/logic_test/zigzag_join, line 220 at r5 (raw file):
# Basic form w/o indices. statement ok SELECT * FROM a@{FORCE_ZIGZAG} WHERE a = 3 AND b = 7;
nit: add a test where a zig zag can't be planned like SELECT * FROM a@{FORCE_ZIGZAG} WHERE a = 3 AND c = 'foo'
pkg/sql/opt/memo/expr_format.go, line 430 at r5 (raw file):
b.WriteString(string(idx.Name())) if _, needComma := s.Next(i + 1); needComma { b.WriteByte(',')
nit: rather than looking ahead you could print the comma before the index name for each index except the first.
needComma := false
for ... {
if needComma {
b.WriteByte(',')
}
b.WriteString(...)
needComma = true
}
pkg/sql/opt/memo/testdata/format, line 352 at r5 (raw file):
# Verify naked hint. opt SELECT * FROM zigzag@{FORCE_ZIGZAG} WHERE a = 3 AND b = 7;
There's no force_zigzag printed because they are only printed for scan expressions. To test your changes in expr_format.go you need scans to show up in the plan. You could either remove the secondary indexes from the zigzag table or use the norm directive instead of the opt directive so that there is no exploration performed (or both!).
pkg/sql/opt/optbuilder/select.go, line 540 at r5 (raw file):
private.Flags.ForceZigzag = indexFlags.ForceZigzag if indexFlags.ZigzagIndices != nil { private.Flags.ForceZigzag = true
Would it be simpler if the parser set indexFlags.ForceZigzag to true in the FORCE_ZIGZAG=foo case? Then the fields in these flags map directly to the fields in the AST struct.
pkg/sql/opt/xform/select_funcs.go, line 1093 at r5 (raw file):
forceIndices := scanPrivate.Flags.ZigzagIndices if !forceIndices.SubsetOf(indices) { return
Can we move this into 2 checks at the top of the ForEach and ForEachStartingAfter callbacks to avoid unnecessary work?
pkg/sql/opt/xform/select_funcs.go, line 1447 at r5 (raw file):
if !scanPrivate.Flags.ZigzagIndices.Empty() && !scanPrivate.Flags.ZigzagIndices.Contains(index.Ordinal()) { return
Can we put this at the top of this ForEach callback to avoid unnecessary work in this case?
pkg/sql/opt/xform/testdata/coster/join, line 998 at r5 (raw file):
---- # Verify huge cost with bogus hint.
nit: make this a little more clear like: "Verify that a huge cost is added to a scan when there is a zigzag join hint"
pkg/sql/opt/xform/testdata/rules/join, line 8643 at r4 (raw file):
Previously, cucaroach (Tommy Reilly) wrote…
Done
Looks like these tests still need to be moved to pkg/sql/opt/xform/testdata/rules/select. We have zigzag join tests there because the rules for generating zigzag joins operate on Select expressions, not on joins.
pkg/sql/parser/sql.y, line 9577 at r5 (raw file):
} | FORCE_ZIGZAG '=' index_name
I think we should be consistent with FORCE_INDEX and handle a syntax like FORCE_ZIGZAG '=' '[' iconst64 ']' which allows forcing a specific index by its descriptor ID. There's some parser tests in pkg/sql/parser/testdata/table_by_id which you can add on to.
pkg/sql/parser/testdata/select_clauses, line 424 at r5 (raw file):
at or near "}": syntax error: FORCE_ZIGZAG specified multiple times DETAIL: source SQL: SELECT a FROM foo@{FORCE_ZIGZAG,FORCE_ZIGZAG}
What is expected with foo@{FORCE_ZIGZAG,FORCE_ZIGZAG=a}? An error? Can you add a test for that case and maybe foo@{FORCE_ZIGZAG=a,FORCE_ZIGZAG} as well.
pkg/sql/sem/tree/select.go, line 293 at r5 (raw file):
// index constraints. IgnoreUniqueWithoutIndexKeys bool ForceZigzag bool
nit: Add a comment explaining what this does and how it relates to ZigzagIndices.
pkg/sql/sem/tree/select.go, line 296 at r5 (raw file):
// ZigzagIndices forces a zigzag plan, with (optionally) particular indices. // There are three possibilities: // 1) nil means no hint, business as usual
nit: update this comment, a nil and empty slice have the same meaning now, correct?
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 28 of 28 files at r5.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach, @mgartner, and @RaduBerinde)
pkg/sql/opt/memo/expr.go, line 354 at r4 (raw file):
Previously, cucaroach (Tommy Reilly) wrote…
Yeah, I should error out if any of these are combined, nice catch.
Now that I added NoZigzagJoin, you should also add a test to make sure that it's not combined with ForceZigzag. Sorry to give you more work!
pkg/sql/opt/memo/expr_format.go, line 413 at r3 (raw file):
Previously, cucaroach (Tommy Reilly) wrote…
This code is only reached for ScanExpr so that will never happen right? I could print if below for ZigzagExprs, I'd just have to dupe the ScanFlags from ScanPrivate into ZigzagJoinPrivate, should I do that?
I think it will be printed here in the case where you tried to force a zigzag but it was not successful (looks like you have a test that confirms this).
That's not a bad idea to add the flags to ZigzagJoinPrivate... although if it's only for the purpose of printing this stuff then maybe it's overkill. Up to you...
pkg/sql/opt/xform/testdata/rules/select, line 6387 at r5 (raw file):
└── j @> '{"a": 1, "b": 2}' # Generate a zigzag join on with a hint.
nit: on with a hint -> with a hint
pkg/sql/sem/tree/select.go, line 374 at r5 (raw file):
} if ih.ForceZigzag && ih.NoIndexJoin { return errors.New("FORCE_ZIGZAG cannot be specified in conjunction with NO_INDEX_JOIN")
also add a similar check for NoZigzagJoin
42c44fe to
38a6dd3
Compare
38a6dd3 to
d608856
Compare
cucaroach
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @RaduBerinde, and @rytaft)
pkg/sql/logictest/testdata/logic_test/zigzag_join, line 220 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: add a test where a zig zag can't be planned like
SELECT * FROM a@{FORCE_ZIGZAG} WHERE a = 3 AND c = 'foo'
That's a valid query from what I can tell, did you mean if like c didn't have an index?
pkg/sql/opt/memo/expr.go, line 354 at r4 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Now that I added NoZigzagJoin, you should also add a test to make sure that it's not combined with ForceZigzag. Sorry to give you more work!
No problem!
pkg/sql/opt/memo/expr_format.go, line 413 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I think it will be printed here in the case where you tried to force a zigzag but it was not successful (looks like you have a test that confirms this).
That's not a bad idea to add the flags to ZigzagJoinPrivate... although if it's only for the purpose of printing this stuff then maybe it's overkill. Up to you...
Seems like overkill.
pkg/sql/opt/memo/expr_format.go, line 430 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: rather than looking ahead you could print the comma before the index name for each index except the first.
needComma := false for ... { if needComma { b.WriteByte(',') } b.WriteString(...) needComma = true }
👍
pkg/sql/opt/memo/testdata/format, line 352 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
There's no force_zigzag printed because they are only printed for scan expressions. To test your changes in
expr_format.goyou need scans to show up in the plan. You could either remove the secondary indexes from thezigzagtable or use thenormdirective instead of theoptdirective so that there is no exploration performed (or both!).
Ah, was wondering how that code would get triggered. Should I care that the ZigzagJoinExpr doesn't print out anything related to the hint?
pkg/sql/opt/optbuilder/select.go, line 540 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Would it be simpler if the parser set
indexFlags.ForceZigzagto true in theFORCE_ZIGZAG=foocase? Then the fields in these flags map directly to the fields in the AST struct.
Good thought but doesn't really help. I'm treating FORCE_ZIGZAG,FORCE_ZIGZAG as an error and also setting ForceZigzag to true would complicate that error handling.
pkg/sql/opt/xform/select_funcs.go, line 1093 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Can we move this into 2 checks at the top of the
ForEachandForEachStartingAftercallbacks to avoid unnecessary work?
👍
pkg/sql/opt/xform/select_funcs.go, line 1447 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Can we put this at the top of this
ForEachcallback to avoid unnecessary work in this case?
👍
pkg/sql/opt/xform/testdata/rules/join, line 8643 at r4 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Looks like these tests still need to be moved to
pkg/sql/opt/xform/testdata/rules/select. We have zigzag join tests there because the rules for generating zigzag joins operate on Select expressions, not on joins.
Boo, I moved these but forgot to push, sorry!.
pkg/sql/opt/xform/testdata/rules/select, line 6387 at r5 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: on with a hint -> with a hint
👍
pkg/sql/parser/sql.y, line 9577 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
I think we should be consistent with
FORCE_INDEXand handle a syntax likeFORCE_ZIGZAG '=' '[' iconst64 ']'which allows forcing a specific index by its descriptor ID. There's some parser tests inpkg/sql/parser/testdata/table_by_idwhich you can add on to.
👍
pkg/sql/parser/testdata/select_clauses, line 424 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
What is expected with
foo@{FORCE_ZIGZAG,FORCE_ZIGZAG=a}? An error? Can you add a test for that case and maybefoo@{FORCE_ZIGZAG=a,FORCE_ZIGZAG}as well.
Yeah I think it should be a "duplicate hint" kind of error.
pkg/sql/sem/tree/select.go, line 293 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: Add a comment explaining what this does and how it relates to
ZigzagIndices.
👍
pkg/sql/sem/tree/select.go, line 374 at r5 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
also add a similar check for NoZigzagJoin
👍
pkg/sql/opt/xform/testdata/coster/join, line 998 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: make this a little more clear like: "Verify that a huge cost is added to a scan when there is a zigzag join hint"
👍
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 16 of 25 files at r6, 6 of 6 files at r7.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @RaduBerinde, and @rytaft)
d608856 to
ae7a6ce
Compare
cucaroach
left a comment
There was a problem hiding this comment.
Thanks for the reviews! We've decided not to put this in 21.2, I'll merge it in post branch for 22.1 once Marcus is back and can check it out.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @RaduBerinde, and @rytaft)
ae7a6ce to
4063fe5
Compare
cucaroach
left a comment
There was a problem hiding this comment.
Rebased and updated commit message, I think this is ready to go, @mgartner did I address all your concerns?
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @rytaft)
pkg/sql/logictest/testdata/logic_test/zigzag_join, line 220 at r5 (raw file):
Previously, cucaroach (Tommy Reilly) wrote…
That's a valid query from what I can tell, did you mean if like c didn't have an index?
Nevermind, I was confused. Done.
mgartner
left a comment
There was a problem hiding this comment.
Reviewed 16 of 25 files at r6, 6 of 6 files at r7.
Reviewable status:complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach, @mgartner, and @rytaft)
pkg/sql/opt/memo/testdata/format, line 352 at r5 (raw file):
Previously, cucaroach (Tommy Reilly) wrote…
Ah, was wondering how that code would get triggered. Should I care that the ZigzagJoinExpr doesn't print out anything related to the hint?
It might be a bit confusing, but I don't see it as a huge problem. I think its fine to leave as-is for now.
pkg/sql/opt/xform/select_funcs.go, line 1065 at r9 (raw file):
forceIndexes := scanPrivate.Flags.ZigzagIndexes if !forceIndexes.SubsetOf(indexes) { return
nit: I think this could still go up higher to avoid unnecessary work, right below:
iter2.ForEachStartingAfter(leftIndex.Ordinal(), func(rightIndex cat.Index, innerFilters memo.FiltersExpr, rightCols opt.ColSet, _ bool, _ memo.ProjectionsExpr) {
pkg/sql/opt/xform/select_funcs.go, line 1369 at r9 (raw file):
// Check if we have zigzag hints. if !scanPrivate.Flags.ZigzagIndexes.Empty() && !scanPrivate.Flags.ZigzagIndexes.Contains(index.Ordinal()) {
nit: I think this could still go up higher to avoid unnecessary work, right below:
iter.ForEach(func(index cat.Index, filters memo.FiltersExpr, indexCols opt.ColSet, _ bool, _ memo.ProjectionsExpr) {
pkg/sql/opt/xform/testdata/rules/select, line 6072 at r9 (raw file):
# Zigzag join hinting tests exec-ddl CREATE TABLE zigzag (n INT PRIMARY KEY, a INT, b INT, c STRING, INDEX a_idx(a), INDEX b_idx(b), UNIQUE INDEX c_idx(b,c), INDEX ba_idx(b));
nit: put this on multiple lines
Also, is ba_idx suppose to index (b, a)?
4063fe5 to
fdbda44
Compare
mgartner
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r10.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @rytaft)
fdbda44 to
1525cce
Compare
cucaroach
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @rytaft)
pkg/sql/opt/memo/testdata/format, line 352 at r5 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
It might be a bit confusing, but I don't see it as a huge problem. I think its fine to leave as-is for now.
👍
pkg/sql/opt/xform/testdata/rules/select, line 6072 at r9 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: put this on multiple lines
Also, is
ba_idxsuppose to index(b, a)?
I think I had it as b,a but then it didn't use zigzag, so I changed it to back to b. But I think relying on duplicate indexes like that isn't good so I made it b,c.
e1ace59 to
22f3a55
Compare
Fixes cockroachdb#35814 Enable table@{FORCE_ZIGZAG} and table@{FORCE_ZIGZAG=index} hints to force zigzag plans to be used. Generate an error if a zigzag can not be planned. Note that this is a breaking change, if customer has an index called FORCE_ZIGZAG and is hinting with table@{FORCE_ZIGZAG} that will no longer do what it did before. Release note (sql change): Extend index scan hint to allow zigzag joins to be forced.
22f3a55 to
fc1ff52
Compare
|
Thanks for all the reviews! |
|
Build succeeded: |
Fixes #35814
Enable table@{FORCE_ZIGZAG} and table@{FORCE_ZIGZAG=index} hints to lower the cost
of zigzag plans to that they will be preferred over other plans.
Note that this is a breaking change, if customer has an index called
FORCE_ZIGZAG and is hinting with table@{FORCE_ZIGZAG} that will no longer do what
it did before. Not sure what the ramifications of that are.
Release note (sql change): Extend index scan hint to allow zigzag joins to be preferred.