Skip to content

sql/stats: merged partial statistics have inaccurate distinct counts #126730

@Uzair5162

Description

@Uzair5162

The distinct count in merged statistics between full and partial statistics doesn't seem to be consistent with the distinct counts of the stats being merged for larger row counts.

Example:

SET enable_create_stats_using_extremes = on;
CREATE TABLE a (a PRIMARY KEY) WITH (sql_stats_automatic_collection_enabled = false) AS SELECT generate_series(0, 10000);
CREATE STATISTICS a_full ON a FROM a;
SHOW STATISTICS FOR TABLE a;

INSERT INTO a VALUES (10001);
CREATE STATISTICS a_partial ON a FROM a USING EXTREMES;
SHOW STATISTICS FOR TABLE a WITH MERGE;

Observe that the full stat estimates a distinct count of 9920, and the partial stat estimates 1 distinct value beyond the previous extremes. However, the merged stat only estimates 9803 (where 9921 is expected).

Jira issue: CRDB-40092

Metadata

Metadata

Assignees

Labels

A-sql-table-statsTable statistics (and their automatic refresh).C-enhancementSolution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)T-sql-queriesSQL Queries Team

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions