This originated from https://github.com/cockroachlabs/support/issues/2179. Basically we a get a RowCount of 1 w/o placeholders and a RowCount of 1/3 the columns distinct count w/ a placeholder. Here's a really straightforward distilled repro courtesy @DrewKimball :
exec-ddl
CREATE TABLE xy (x INT, y INT, INDEX (x) STORING (y));
----
exec-ddl
ALTER TABLE xy INJECT STATISTICS '[
{
"columns": ["x"],
"created_at": "2018-01-01 1:00:00.00000+00:00",
"row_count": 100000,
"distinct_count": 100000
},
{
"columns": ["y"],
"created_at": "2018-01-01 1:00:00.00000+00:00",
"row_count": 100000,
"distinct_count": 2
}
]';
----
opt format=show-all
WITH RECURSIVE cte(a,b) AS (
(SELECT * FROM xy WHERE x = 1)
UNION ALL
(SELECT * FROM cte WHERE False)
)
SELECT * FROM cte;
----
If you replace 1 with $1 the problem becomes apparent. Doesn't matter if you use this simple approach or assign-placeholders-opt query-args=(1) format=show-all.
I've narrowed it down to (sb *statisticsBuilder) applyConstraintSet but haven't completed the investigation. In the customer issue the result is a hash join vs a lookup join which is believed to be causing poor performance.
Jira issue: CRDB-25906
This originated from https://github.com/cockroachlabs/support/issues/2179. Basically we a get a RowCount of 1 w/o placeholders and a RowCount of 1/3 the columns distinct count w/ a placeholder. Here's a really straightforward distilled repro courtesy @DrewKimball :
If you replace 1 with $1 the problem becomes apparent. Doesn't matter if you use this simple approach or
assign-placeholders-opt query-args=(1) format=show-all.I've narrowed it down to
(sb *statisticsBuilder) applyConstraintSetbut haven't completed the investigation. In the customer issue the result is a hash join vs a lookup join which is believed to be causing poor performance.Jira issue: CRDB-25906