opt: derive constant computed columns for index selection#43450
opt: derive constant computed columns for index selection#43450craig[bot] merged 2 commits intocockroachdb:masterfrom
Conversation
Isolate method movements in this commit so it makes subsequent commit easier to understand. Release note: None
RaduBerinde
left a comment
There was a problem hiding this comment.
Very cool functionality and the code is very clean!
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @justinj, @RaduBerinde, and @rytaft)
pkg/sql/opt/constraint/span.go, line 76 at r2 (raw file):
// the start key is the same as the end key, and both boundaries are inclusive. func (sp *Span) HasSingleKey(evalCtx *tree.EvalContext) bool { if sp.start.IsEmpty() || sp.end.IsEmpty() {
[nit] if we move the Length check first, we only need to check one of these
pkg/sql/opt/xform/custom_funcs.go, line 567 at r2 (raw file):
} var replace func(e opt.Expr) opt.Expr
Calling Replace when we don't end up folding to a constant can create a bunch of garbage. We could do these checks first using the OuterCols of the scalar expression, and only then call Replace (when we know for sure that all outer cols are in constCols)
RaduBerinde
left a comment
There was a problem hiding this comment.
There's some interaction between this and #43405 (we need to handle these new expressions). I'm happy to update my PR once this goes in.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @andy-kimball, @justinj, and @rytaft)
andy-kimball
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @justinj, @RaduBerinde, and @rytaft)
pkg/sql/opt/xform/custom_funcs.go, line 567 at r2 (raw file):
Previously, RaduBerinde wrote…
Calling
Replacewhen we don't end up folding to a constant can create a bunch of garbage. We could do these checks first using the OuterCols of the scalar expression, and only then callReplace(when we know for sure that all outer cols are inconstCols)
Checking outer columns is a bit of a pain in this context, because we don't have props for the expression; we'd need to walk the expression to gather them.
I'll point out that there's only unnecessary garbage in the case where we perform at least one variable replacement, but then can't fully fold the entire expression. I'd think that was an edge case, like if there are multiple variables, or if there is a side-effecting function. If we can fold at least one variable in the expression, then it's likely the entire expression can be folded (since vast majority of check expressions ref a single variable).
bdc43a5 to
fc4670c
Compare
|
pkg/sql/opt/xform/custom_funcs.go, line 567 at r2 (raw file): Previously, andy-kimball (Andy Kimball) wrote…
It will try to construct existing expressions, right? Last time I looked at the factory code, it seemed like it would allocate in that case as well (but maybe I'm wrong?) |
andy-kimball
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @justinj, @RaduBerinde, and @rytaft)
pkg/sql/opt/xform/custom_funcs.go, line 567 at r2 (raw file):
Previously, RaduBerinde wrote…
It will try to construct existing expressions, right? Last time I looked at the factory code, it seemed like it would allocate in that case as well (but maybe I'm wrong?)
The CopyAndReplace methods will first copy, then replace. The Replace methods will only copy the parent if at least one child has been replaced.
|
pkg/sql/opt/xform/custom_funcs.go, line 567 at r2 (raw file): Previously, andy-kimball (Andy Kimball) wrote…
Ah, nice. |
The optimizer uses explicitly specified filter constraints to qualify
available indexes during the exploration phase. It also uses implicit
filter constraints derived from table check constraints.
This commit adds new implicit filter constraints based on constant
computed columns. Constant computed columns are based on other columns in
the table that are constrained to be constant by other filters. For
example:
CREATE TABLE hashed (
k STRING,
hash INT AS (fnv32(k) % 4) STORED,
INDEX hash_index (hash, k)
)
SELECT * FROM hashed WHERE k = 'andy'
Here, the value of the hash column can be computed at query build time,
and therefore "hash_index" selected as the lowest cost index. The resulting
plan would be:
scan hashed@secondary
├── columns: k:1(string!null) hash:2(int)
├── constraint: /2/1/3: [/1/'andy' - /1/'andy']
└── fd: ()-->(1)
This improved ability to select indexes is useful for implementing HASH
indexes, which scatter keys across N buckets (see Issue cockroachdb#39340).
Release note (sql change): The optimizer can now derive constant computed
columns during index selection. This enables more efficient HASH indexes.
fc4670c to
2cfa5e8
Compare
|
bors r+ |
43450: opt: derive constant computed columns for index selection r=andy-kimball a=andy-kimball
The optimizer uses explicitly specified filter constraints to qualify
available indexes during the exploration phase. It also uses implicit
filter constraints derived from table check constraints.
This commit adds new implicit filter constraints based on constant
computed columns. Constant computed columns are based on other columns in
the table that are constrained to be constant by other filters. For
example:
CREATE TABLE hashed (
k STRING,
hash INT AS (fnv32(k) % 4) STORED,
INDEX hash_index (hash, k)
)
SELECT * FROM hashed WHERE k = 'andy'
Here, the value of the hash column can be computed at query build time,
and therefore "hash_index" selected as the lowest cost index. The resulting
plan would be:
scan hashed@secondary
├── columns: k:1(string!null) hash:2(int)
├── constraint: /2/1/3: [/1/'andy' - /1/'andy']
└── fd: ()-->(1)
This improved ability to select indexes is useful for implementing HASH
indexes, which scatter keys across N buckets (see Issue #39340).
Release note (sql change): The optimizer can now derive constant computed
columns during index selection. This enables more efficient HASH indexes.
Co-authored-by: Andrew Kimball <andyk@cockroachlabs.com>
Build succeeded |
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 1 of 1 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andy-kimball, @justinj, and @rytaft)
pkg/sql/opt/xform/custom_funcs.go, line 558 at r3 (raw file):
// ID into a constant value, by evaluating it with respect to a set of other // columns that are constant. If the computed column is constant, enter it into // the constCols map and return false. Otherwise, return false.
[nit] I think you meant "enter it into the constCols map and return true"
pkg/sql/opt/xform/testdata/rules/computed, line 102 at r3 (raw file):
└── (k_int = 2) OR (k_int = 3) [type=bool, outer=(1), constraints=(/1: [/2 - /2] [/3 - /3])] # Don't constrain the index for a NULL value.
Why not?
The optimizer can generate constrained scans over indexes on computed
columns when columns referenced in the computed column expression are
held constant. Consider this example:
CREATE TABLE t (a INT, v INT AS (a + 1) STORED, INDEX v_idx (v))
SELECT * FROM t WHERE a = 1
A constrained scan can be generated over `v_idx` because `v` depends on
`a` and the query filter holds `a` constant.
This commit lifts a restriction that prevented this optimization when
columns referenced in the computed column expression were held constant
to the `NULL` value. As far as I can tell, this restriction is not
necessary. In fact, @rytaft had questioned its purpose originally, but
the question was never answered:
cockroachdb#43450 (review)
By lifting this restriction, the optimizer can explore constrained scans
over both indexed computed columns with `IS NULL` expressions and
expression indexes with `IS NULL` expressions.
Fixes cockroachdb#83390
Release note (performance improvement): The optimizer now explores more
efficient query plans when index computed columns and expressions have
`IS NULL` expressions.
83619: opt: constrain expression indexes with IS NULL expressions r=mgartner a=mgartner
The optimizer can generate constrained scans over indexes on computed
columns when columns referenced in the computed column expression are
held constant. Consider this example:
CREATE TABLE t (a INT, v INT AS (a + 1) STORED, INDEX v_idx (v))
SELECT * FROM t WHERE a = 1
A constrained scan can be generated over `v_idx` because `v` depends on
`a` and the query filter holds `a` constant.
This commit lifts a restriction that prevented this optimization when
columns referenced in the computed column expression were held constant
to the `NULL` value. As far as I can tell, this restriction is not
necessary. In fact, `@rytaft` had questioned its purpose originally, but
the question was never answered:
#43450 (review)
By lifting this restriction, the optimizer can explore constrained scans
over both indexed computed columns with `IS NULL` expressions and
expression indexes with `IS NULL` expressions.
Fixes #83390
Release note (performance improvement): The optimizer now explores more
efficient query plans when index computed columns and expressions have
`IS NULL` expressions.
84084: bazel: new versions of prebuilt `c-deps` r=srosenberg a=rickystewart
Rebuild these archives to pull in
`52a3a0aa8a707f9bb03802186da0c60b715ed9ce` (change to `jemalloc` to
build without `MADV_FREE`).
Release note: None
84088: ui: fix alignment on custom scale r=maryliag a=maryliag
The check for valid options with the
removal of some options on #83229 didn't took
the custom values into consideration.
This commit add the option back, allowing the alignment
on custom values.
Release note (bug fix): Custom time period selection is now aligning
between Metrics and SQL Activity page.
84155: sql/schemachanger/scbuild: minor cleanup r=ajwerner a=ajwerner
Improves the error handling a tad to make runtime errors and assertion failure.
Fixes a typo.
Release note: None
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Ricky Stewart <ricky@cockroachlabs.com>
Co-authored-by: Marylia Gutierrez <marylia@cockroachlabs.com>
Co-authored-by: Andrew Werner <awerner32@gmail.com>
The optimizer uses explicitly specified filter constraints to qualify
available indexes during the exploration phase. It also uses implicit
filter constraints derived from table check constraints.
This commit adds new implicit filter constraints based on constant
computed columns. Constant computed columns are based on other columns in
the table that are constrained to be constant by other filters. For
example:
CREATE TABLE hashed (
k STRING,
hash INT AS (fnv32(k) % 4) STORED,
INDEX hash_index (hash, k)
)
SELECT * FROM hashed WHERE k = 'andy'
Here, the value of the hash column can be computed at query build time,
and therefore "hash_index" selected as the lowest cost index. The resulting
plan would be:
scan hashed@secondary
├── columns: k:1(string!null) hash:2(int)
├── constraint: /2/1/3: [/1/'andy' - /1/'andy']
└── fd: ()-->(1)
This improved ability to select indexes is useful for implementing HASH
indexes, which scatter keys across N buckets (see Issue #39340).
Release note (sql change): The optimizer can now derive constant computed
columns during index selection. This enables more efficient HASH indexes.