sql: add fullStatisticsID column to system.table_statistics#93751
sql: add fullStatisticsID column to system.table_statistics#93751craig[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
0c2cca0 to
8d6d4f3
Compare
michae2
left a comment
There was a problem hiding this comment.
Thank you for staying up late to do this! I really appreciate that you worked hard to make this feature extra safe. This will definitely help me sleep better at night when I'm on call. 🙂
Reviewed 12 of 16 files at r1, 1 of 1 files at r2, 6 of 6 files at r3, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @faizaanmadhani, @mgartner, and @rytaft)
pkg/sql/alter_table.go line 1352 at r3 (raw file):
histogram, "partialPredicate", "fullStatisticsID"
nit: Indentation is off here.
pkg/sql/show_stats.go line 53 at r3 (raw file):
{Name: "avg_size", Typ: types.Int}, {Name: "partial_predicate", Typ: types.String}, {Name: "full_statistics_id", Typ: types.Int},
bikeshedding: In SHOW STATISTICS output instead of "statisticID" we call it "histogram_id", so maybe this should be "full_histogram_id" or "parent_histogram_id" or something like that. And it might be nice to put it after histogram_id.
pkg/sql/show_stats.go line 148 at r3 (raw file):
avgSizeIdx partialPredicateIdx fullStatisticsIdx
micro-nit: fullStatisticsIDIdx
pkg/sql/catalog/systemschema/system.go line 268 at r3 (raw file):
"avgSize" INT8 NOT NULL DEFAULT 0, "partialPredicate" STRING, "fullStatisticsID" INT8,
suggestion: If we're going to use PartialPredicate = "" and FullStatisticID = 0 for full statistics in various places, then it might be simpler to make these columns NOT NULL and use those sentinel values on disk, too.
pkg/sql/stats/new_stat.go line 98 at r3 (raw file):
if !settings.Version.IsActive(ctx, clusterversion.V23_1AddPartialStatisticsColumns) { _, err := executor.Exec(
While we're here, could you add a check that returns an error if partialPredicate != "" (as described in this comment)?
pkg/sql/stats/new_stat.go line 141 at r3 (raw file):
histogram, "partialPredicate", "fullStatisticsID"
nit: Indentation is off.
pkg/upgrade/upgrades/upgrades.go line 210 at r3 (raw file):
descIDSequenceForSystemTenant, ), upgrade.NewTenantUpgrade("add a partial predicate column to system.table_statistics",
nit: Would be nice to also change this message.
pkg/upgrade/upgrades/alter_table_statistics_partial_predicate_and_id.go line 26 at r3 (raw file):
ADD COLUMN IF NOT EXISTS "partialPredicate" STRING, ADD COLUMN IF NOT EXISTS "fullStatisticsID" INT8 FAMILY "fam_0_tableID_statisticID_name_columnIDs_createdAt_rowCount_distinctCount_nullCount_histogram"`
nit: I don't think we need this FAMILY part. (This table only has one column family, which these columns will be added to automatically.)
Code quote:
fam_0_tableID_statisticID_name_columnIDs_createdAt_rowCount_distinctCount_nullCount_histogram
mgartner
left a comment
There was a problem hiding this comment.
Reviewed 7 of 16 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @faizaanmadhani and @rytaft)
pkg/sql/stats/table_statistic.proto line 65 at r3 (raw file):
// that it was created from. It is 0 for full statistics which will be // NULL when stored in system.table_statistics. uint64 full_statistics_id = 12 [(gogoproto.customname) = "FullStatisticsID", (gogoproto.nullable) = true];
So if this will be 0 for full statistics, should (gogoproto.nullable) be false?
michae2
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @faizaanmadhani and @rytaft)
pkg/sql/catalog/systemschema/system.go line 268 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
suggestion: If we're going to use
PartialPredicate = ""andFullStatisticID = 0for full statistics in various places, then it might be simpler to make these columnsNOT NULLand use those sentinel values on disk, too.
One more bikeshed / nit: the other column is singular ("statisticID" rather than "statisticsID") so could we make this new column singular, to match ("fullStatisticID")?
8d6d4f3 to
dc73ace
Compare
faizaanmadhani
left a comment
There was a problem hiding this comment.
Thanks for the reviews! I've addressed all the comments.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @rytaft)
pkg/sql/alter_table.go line 1352 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
nit: Indentation is off here.
Fixed! Sorry this keeps happening -- i don't actually see the incorrect indentation in Goland so it's almost guesswork to try to align it correctly.
pkg/sql/show_stats.go line 53 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
bikeshedding: In
SHOW STATISTICSoutput instead of "statisticID" we call it "histogram_id", so maybe this should be "full_histogram_id" or "parent_histogram_id" or something like that. And it might be nice to put it after histogram_id.
Done.
pkg/sql/show_stats.go line 148 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
micro-nit:
fullStatisticsIDIdx
Done.
pkg/sql/catalog/systemschema/system.go line 268 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
One more bikeshed / nit: the other column is singular ("statisticID" rather than "statisticsID") so could we make this new column singular, to match ("fullStatisticID")?
I still think it would be better to keep the columns as NULL. It's not very complicated to translate between a tree.DNull datum and the value that we need and we do it consistently for other nullable fields (notably histograms -- I think histogram_id is NULL for multi-column statistics but an id for single column statistics, even though no histogram_id would ever be 0). Additionally, from the users perspective, it makes more sense that the column is NULL rather than 0 because 0 may indicate it references to a full statistic even though no statisticID can be 0.
pkg/sql/stats/new_stat.go line 98 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
While we're here, could you add a check that returns an error if
partialPredicate != ""(as described in this comment)?
Done.
pkg/sql/stats/table_statistic.proto line 65 at r3 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
So if this will be
0for full statistics, should(gogoproto.nullable)befalse?
So, we can't make it true because doing so throws this error when running ./dev gen protobuf.
ERROR: field TableStatisticProto.FullStatisticsId is a native type and in proto3 syntax with nullable=false there exists conflicting implementations when encoding zero values--gogoroach_out: protoc-gen-gogoroach: Plugin failed with status code 1.
2022/12/16 15:38:53 error running protoc: exit status 1
However, I removed them and I think this is just true by default because it didn't change the resulting generated go file.
michae2
left a comment
There was a problem hiding this comment.
So close! This is looking great!
Reviewed 6 of 13 files at r4, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @faizaanmadhani, @mgartner, and @rytaft)
pkg/sql/alter_table.go line 1352 at r3 (raw file):
Previously, faizaanmadhani (Faizaan Madhani) wrote…
Fixed! Sorry this keeps happening -- i don't actually see the incorrect indentation in Goland so it's almost guesswork to try to align it correctly.
No big deal 🙂 (There might be a Goland setting to highlight tabs vs spaces?)
pkg/sql/alter_table.go line 1364 at r4 (raw file):
histogram, predicateValue, s.FullStatisticsID,
Will this sometimes be 0? Do we need to do the same interface{} stuff we're doing above for the predicate to insert NULL?
pkg/sql/show_stats.go line 126 at r4 (raw file):
%s histogram, %s
If partialStatsVerActive is false, this will leave a trailing comma after histogram, which will throw a SQL error. So I think the query needs to be something like "avgSize", %s histogram %s FROM and then fullStatisticIDCol needs to be something like ,"fullStatisticID".
(Also indentation is off here 😛)
pkg/sql/catalog/systemschema/system.go line 268 at r3 (raw file):
Previously, faizaanmadhani (Faizaan Madhani) wrote…
I still think it would be better to keep the columns as NULL. It's not very complicated to translate between a tree.DNull datum and the value that we need and we do it consistently for other nullable fields (notably histograms -- I think histogram_id is NULL for multi-column statistics but an id for single column statistics, even though no histogram_id would ever be 0). Additionally, from the users perspective, it makes more sense that the column is NULL rather than 0 because 0 may indicate it references to a full statistic even though no statisticID can be 0.
👍 Good points! Never mind.
pkg/sql/execinfrapb/processors_table_stats.proto line 61 at r4 (raw file):
// statistics. It is the statistic id of the full statistic that this partial // statistic was derived from. optional uint64 full_statistics_id = 8 [(gogoproto.customname) = "FullStatisticsID", (gogoproto.nullable) = false];
nit: I swear this is the last one, could we also make this one singular to match? :sorry_face:
pkg/sql/stats/stats_cache.go line 769 at r4 (raw file):
"avgSize", %s histogram,
Same problem here: I think it needs to be "avgSize", %s histogram %s FROM without the comma after histogram, and the comma will need to be added to fullStatisticIDCol.
pkg/sql/stats/table_statistic.proto line 65 at r3 (raw file):
Previously, faizaanmadhani (Faizaan Madhani) wrote…
So, we can't make it true because doing so throws this error when running
./dev gen protobuf.ERROR: field TableStatisticProto.FullStatisticsId is a native type and in proto3 syntax with nullable=false there exists conflicting implementations when encoding zero values--gogoroach_out: protoc-gen-gogoroach: Plugin failed with status code 1. 2022/12/16 15:38:53 error running protoc: exit status 1However, I removed them and I think this is just true by default because it didn't change the resulting generated go file.
(You might hate me for this) nit: could we also make this one singular? (i.e. "full_statistic_id" and "FullStatisticID" instead of "full_statistics_id" and "FullStatisticsID"). :ducks_flying_shoe:
ba07f2f to
f204793
Compare
This commit adds a column to system.table_statistics used by partial statistics to store an id that references the full statistic that the partial statistics was derived from. For full statistics, the value in this column will be NULL. This commit updates a migration/upgrade that just included the partialPredicate column. Epic: CRDB-19449 Release note (sql change): system.table_statistics now contains a column called fullStatisticsID to store an id referencing the full table statistic the partial statistic was derived from.
f204793 to
aab61b6
Compare
faizaanmadhani
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @rytaft)
pkg/sql/alter_table.go line 1364 at r4 (raw file):
Previously, michae2 (Michael Erickson) wrote…
Will this sometimes be 0? Do we need to do the same
interface{}stuff we're doing above for the predicate to insertNULL?
Yup! Done.
pkg/sql/show_stats.go line 126 at r4 (raw file):
Previously, michae2 (Michael Erickson) wrote…
If
partialStatsVerActiveis false, this will leave a trailing comma afterhistogram,which will throw a SQL error. So I think the query needs to be something like"avgSize", %s histogram %s FROMand thenfullStatisticIDColneeds to be something like,"fullStatisticID".(Also indentation is off here 😛)
Done.
pkg/sql/stats/stats_cache.go line 769 at r4 (raw file):
Previously, michae2 (Michael Erickson) wrote…
Same problem here: I think it needs to be
"avgSize", %s histogram %s FROMwithout the comma after histogram, and the comma will need to be added tofullStatisticIDCol.
Done.
pkg/sql/stats/table_statistic.proto line 65 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
(You might hate me for this) nit: could we also make this one singular? (i.e. "full_statistic_id" and "FullStatisticID" instead of "full_statistics_id" and "FullStatisticsID"). :ducks_flying_shoe:
Done.
michae2
left a comment
There was a problem hiding this comment.
Reviewed 2 of 13 files at r4, 9 of 9 files at r5, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @rytaft)
|
TFTRs!! 🎉 bors r=michae2 |
|
Build failed (retrying...): |
|
Build failed (retrying...): |
|
Build succeeded: |
This commit adds a column to system.table_statistics used by partial statistics to store an id that references the full statistic that the partial statistics was derived from. For full statistics, the value in this column will be NULL.
This commit updates a migration/upgrade that just included the partialPredicate column.
Epic: CRDB-19449
Release note (sql change): system.table_statistics now contains a column called fullStatisticsID to store an id referencing the full table statistic the partial statistic was derived from.