sql: enable simple stats for all types#49801
sql: enable simple stats for all types#49801craig[bot] merged 1 commit intocockroachdb:masterfrom madelynnblue:inv-stats
Conversation
|
@yuzefovich can you review the change to Fingerprint? |
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mjibson, @rytaft, and @yuzefovich)
pkg/sql/sqlbase/encoded_datum.go, line 291 at r1 (raw file):
// case uses ed.Encode, which has a fast path if the encoded bytes are already // the right encoding. if MustBeValueEncoded(typ) {
I think @rohany made some recent changes here so that ArrayFamily would use ed.Encode method below although it "must be value encoded".
Nah, arrays have key encodings now. |
yuzefovich
left a comment
There was a problem hiding this comment.
Reviewed 1 of 4 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @mjibson and @rytaft)
pkg/sql/sqlbase/encoded_datum.go, line 291 at r1 (raw file):
Previously, yuzefovich wrote…
I think @rohany made some recent changes here so that
ArrayFamilywould useed.Encodemethod below although it "must be value encoded".
So I guess this change looks good, right Rohan?
|
yeah |
madelynnblue
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @rohany and @rytaft)
pkg/sql/sqlbase/encoded_datum.go, line 291 at r1 (raw file):
Previously, yuzefovich wrote…
So I guess this change looks good, right Rohan?
Arrays now only must value encode if their contents must:
cockroach/pkg/sql/sqlbase/structured.go
Line 1333 in 455a67b
|
Ok sql exec people. Here's a stumper. The Fingerprint change appears to have broken a few tests. is now returning 0 rows only in the no longer produces an error but is reported as success in the For example. cockroach/pkg/sql/sqlbase/structured.go Line 1342 in a18b32f and yet the above query for Questions:
|
|
I think there is some slight overloading of how |
|
Ok so maybe...we can use Fingerprint in more places now? I'll audit uses of EncodeTableKey. |
|
That sounds reasonable — encdatum.encode is probably useful to look at too. Most of the calls come from there. |
|
I think I see the issue: in |
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 2 of 4 files at r1, 2 of 2 files at r2.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mjibson and @rohany)
pkg/sql/opt/exec/execbuilder/testdata/stats, line 265 at r2 (raw file):
query T EXPLAIN (OPT, VERBOSE) SELECT DISTINCT j FROM tj WHERE j IS NULL
I wonder if this test is going to be flaky... if auto stats are collected before this runs, we'll get different results. Maybe add this to the top of this file (since I think other tests in this file could also have a problem):
statement ok
SET CLUSTER SETTING sql.stats.automatic_collection.enabled = false
(I think auto stats are currently disabled on logic tests, but if we enable them we still want this to work)
|
This diff fixes the issue with row containers: The explanation is that |
|
That seems correct, but we will then error out for types that we can't key encode. We can't know in advance whether we are going to spill to disk here either. Why does the disk row container necessarily need ordering properties of the columns? |
|
yuzefovich
left a comment
There was a problem hiding this comment.
We merged the fix to the disk row container, so once you rebase, the build should be green.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @mjibson and @rohany)
madelynnblue
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @rohany and @rytaft)
pkg/sql/opt/exec/execbuilder/testdata/stats, line 265 at r2 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
I wonder if this test is going to be flaky... if auto stats are collected before this runs, we'll get different results. Maybe add this to the top of this file (since I think other tests in this file could also have a problem):
statement ok SET CLUSTER SETTING sql.stats.automatic_collection.enabled = false(I think auto stats are currently disabled on logic tests, but if we enable them we still want this to work)
Done.
rytaft
left a comment
There was a problem hiding this comment.
Reviewed 4 of 4 files at r3.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @rohany)
|
bors r+ |
Merge conflict (retrying...) |
Build failed |
|
bors r+ |
Build failed (retrying...) |
Canceled |
|
bors r+ |
Build failed |
Improve datum fingerprinting (used by rowexec/sampler.go) to work correctly for all types. Previously, due to the hardcoded assumption that JSON was the only non-key encodable type, it would error if presented a geo/geom type. Use MustBeValueEncoded when fingerprinting and determining if a type can create a histogram. This should be future proof if we add new types or teach types how to key encode. Fixes #35844 Informs #48219 Release note: None
|
bors r+ |
Build succeeded |
Improve datum fingerprinting (used by rowexec/sampler.go) to work
correctly for all types. Previously, due to the hardcoded assumption that
JSON was the only non-key encodable type, it would error if presented
a geo/geom type.
Use MustBeValueEncoded when fingerprinting and determining if a type can
create a histogram. This should be future proof if we add new types or
teach types how to key encode.
Fixes #35844
Informs #48219
Release note: None