sql: incorporate lookupExpr cpu cost into cost model#66786
sql: incorporate lookupExpr cpu cost into cost model#66786craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
5530b84 to
0307b01
Compare
rytaft
left a comment
There was a problem hiding this comment.
Nice, this looks like it will be a good change. Glad to see it didn't change any plans.
nit: I don't think "sql" is a valid release note category. I'd call this "performance improvement". Also, I'd be a bit more verbose in the release note to explain how the change affects the end user.
Reviewed 2 of 2 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @cucaroach)
pkg/sql/opt/xform/coster.go, line 798 at r1 (raw file):
// If we have extra lookupExprs beyond whats in the on conditions add those // as an extra cpu cost since the joinreader machinery isn't free. extraLookupJoinConditions := join.LookupExpr.Difference(join.On)
Is this Difference necessary? I think the LookupExpr should already be disjoint from On.
pkg/sql/opt/xform/coster.go, line 817 at r1 (raw file):
flags memo.JoinFlags, localityOptimized bool, ) (memo.Cost, float64) {
I don't love that you need to return the lookupCount here. I would probably encapsulate the logic for the cost of the extra lookup conditions in a helper function that takes the join and lookupCount as parameters and returns the cost (if join is not a LookupJoinExpr or doesn't have LookupExpr set, just return 0). That would also allow us to add a cost for multiple lookupCols in the future, which might be worth doing.
But I don't feel too strongly... if you want to leave it as-is, just be sure to add a comment to the function explaining the new return param.
0307b01 to
a064047
Compare
cucaroach
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @rytaft)
pkg/sql/opt/xform/coster.go, line 798 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Is this
Differencenecessary? I think the LookupExpr should already be disjoint from On.
I swear I saw a case where lookupExpr was duping part of the on but I couldn't find it, I must have been duping myself.
pkg/sql/opt/xform/coster.go, line 817 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I don't love that you need to return the
lookupCounthere. I would probably encapsulate the logic for the cost of the extra lookup conditions in a helper function that takes thejoinandlookupCountas parameters and returns the cost (ifjoinis not aLookupJoinExpror doesn't haveLookupExprset, just return 0). That would also allow us to add a cost for multiple lookupCols in the future, which might be worth doing.But I don't feel too strongly... if you want to leave it as-is, just be sure to add a comment to the function explaining the new return param.
Me neither but updating the cost in computeLookupJoinCost lets me avoid type cast shenanigans required if I did it from computeIndexLookupJoinCost and I thought the return complexity was the lesser of two evils. Happy to change it to whatever you think it more idiomatic.
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 2 of 2 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @cucaroach)
pkg/sql/opt/xform/coster.go, line 817 at r1 (raw file):
Previously, cucaroach (Tommy Reilly) wrote…
Me neither but updating the cost in computeLookupJoinCost lets me avoid type cast shenanigans required if I did it from computeIndexLookupJoinCost and I thought the return complexity was the lesser of two evils. Happy to change it to whatever you think it more idiomatic.
Yea, I still have a slight preference towards calling the function from inside computeIndexLookupJoinCost, even if it needs a cast inside. I think it's a bit more idiomatic in the opt code base, since we try hard to keep top level function interfaces as clean and consistent as possible.
But computeIndexLookupJoinCost is already a bit of a special snowflake, so as I said, I don't feel too strongly in this case. (But do add a comment explaining the return param if you leave it as-is.)
a064047 to
a3e40c8
Compare
cucaroach
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft)
pkg/sql/opt/xform/coster.go, line 817 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Yea, I still have a slight preference towards calling the function from inside
computeIndexLookupJoinCost, even if it needs a cast inside. I think it's a bit more idiomatic in theoptcode base, since we try hard to keep top level function interfaces as clean and consistent as possible.But
computeIndexLookupJoinCostis already a bit of a special snowflake, so as I said, I don't feel too strongly in this case. (But do add a comment explaining the return param if you leave it as-is.)
Okay I went with the type cast, definitely came out cleaner!
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach)
pkg/sql/opt/xform/coster.go, line 924 at r3 (raw file):
cost /= 2.5 } cost += lookupExprCost(join, lookupCount)
I think it would be better to instead make this function not depend on lookupCount (just have it return 0 or cpuCostFactor * memo.Cost(len(lookupExpr.LookupExpr))), and add the result to perLookupCost above.
a3e40c8 to
ea1e185
Compare
cucaroach
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rytaft)
pkg/sql/opt/xform/coster.go, line 924 at r3 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I think it would be better to instead make this function not depend on
lookupCount(just have it return 0 orcpuCostFactor * memo.Cost(len(lookupExpr.LookupExpr))), and add the result toperLookupCostabove.
👍
Release note (performance improvement): adjust optimizer cost to include cpu cost of lookupExprs when used by lookup joins. When a lookup join is chosen we can use two strategies, a simple 1 column lookup or a more involved multi column lookup that makes more efficient use of indexes but has higher CPU cost. This change makes the cost model reflect that extra cost.
ea1e185 to
148a9ec
Compare
|
bors r=rytaft |
|
Build succeeded: |
Informs #51576
This is a prep-the-patient change required for enabling the lookup join inequality conditions. It enables for instance partial indexes to be selected when there's a costing tie between a lookup join with a partial index vs a full index with the partial index qualifier as a lookup join condition.
Release note (sql): improve cost model to include cpu cost of lookupExprs used by lookup joins.