-
Notifications
You must be signed in to change notification settings - Fork 4.1k
opt: histogram upper bounds should not be outside of min/max values for a type #76887
Copy link
Copy link
Closed
Closed
Copy link
Labels
C-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.C-cleanupTech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.T-sql-queriesSQL Queries TeamSQL Queries Team
Description
I've observed INT2 and INT4 column statistics in a statement bundle with histogram upper bounds that were greater than the max INT2 or INT4, and less than the minimum INT2 or INT4. For example:
{
"columns": [
"a"
],
"created_at": "2022-01-01 12:34:56.000000",
"distinct_count": 18,
"histo_buckets": [
{
"distinct_range": 0,
"num_eq": 0,
"num_range": 0,
"upper_bound": "-9223372036854775808"
},
{
"distinct_range": 4.5,
"num_eq": 14212895,
"num_range": 4,
"upper_bound": "0"
},
{
"distinct_range": 7,
"num_eq": 6249,
"num_range": 17024855,
"upper_bound": "8"
},
{
"distinct_range": 4.5,
"num_eq": 0,
"num_range": 4,
"upper_bound": "9223372036854775807"
}
],
"histo_col_type": "INT2",
"histo_version": 1,
"name": "__auto__",
"null_count": 0,
"row_count": 31244007
},
This is problematic because it may lead to incorrect row count estimates, and injecting these stats manually results in an error:
defaultdb> CREATE TABLE t (a INT2);
CREATE TABLE
defaultdb> ALTER TABLE t INJECT STATISTICS '[
{
"columns": [
"a"
],
"created_at": "2022-01-01 12:34:56.000000",
"distinct_count": 18,
"histo_buckets": [
{
"distinct_range": 0,
"num_eq": 0,
"num_range": 0,
"upper_bound": "-9223372036854775808"
},
{
"distinct_range": 4.5,
"num_eq": 14212895,
"num_range": 4,
"upper_bound": "0"
},
{
"distinct_range": 7,
"num_eq": 6249,
"num_range": 17024855,
"upper_bound": "8"
},
{
"distinct_range": 4.5,
"num_eq": 0,
"num_range": 4,
"upper_bound": "9223372036854775807"
}
],
"histo_col_type": "INT2",
"histo_version": 1,
"name": "__auto__",
"null_count": 0,
"row_count": 31244007
}]';
ERROR: integer out of range for type int2
SQLSTATE: 22003
I'm not sure what caused this and I've been unable to reproduce it so far. The statement bundle was from CRDB v21.2.5.
Jira issue: CRDB-13311
Epic CRDB-16930
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
C-bugCode not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.C-cleanupTech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior.T-sql-queriesSQL Queries TeamSQL Queries Team
Type
Projects
Status
Done