-
Notifications
You must be signed in to change notification settings - Fork 4.1k
sql: forward index histogram on an trigram inverted index column causes internal errors #84529
Description
When the optimizer sees a histogram on a normal column of type text (now both forward indexable and inverted indexable with gin_trgm_ops) while trying to build spans for an inverted index, an error is returned due to the type mismatch of the bytes (inverted) histogram it expects and the text (forward) histogram it finds.
This is basically a manifestation of #50655, which covers the fact that histograms don't have a "type" attribute that could help distinguish inverted from forward sketches/histograms, or distinguish different types of inverted histograms from each other.
Here is a simple repro:
demo@127.0.0.1:26257/defaultdb> create table a (a) as select 'hi';
CREATE TABLE AS
demo@127.0.0.1:26257/defaultdb> analyze a;
ANALYZE
demo@127.0.0.1:26257/defaultdb> create inverted index on a(a gin_trgm_ops) ;
CREATE INDEX
demo@127.0.0.1:26257/defaultdb> explain select * from a where a like '%ffooo%';
ERROR: unsupported comparison: string to bytes
SQLSTATE: 42804
If analyze is re-run, all is well again. I think this is because the most recent histogram for the column becomes the inverted column. This raises the question of whether an ordinary forward scan will find the wrong histogram. I wasn't able to easily discover any internal errors in this state, but I wouldn't be surprised if things were going wrong in more subtle ways.
demo@127.0.0.1:26257/defaultdb> analyze a;
ANALYZE
Time: 18ms total (execution 18ms / network 0ms)
demo@127.0.0.1:26257/defaultdb> explain select * from a where a like '%ffooo%';
info
---------------------------------------------------------------------------------------------
distribution: local
vectorized: true
• filter
│ estimated row count: 0
│ filter: a LIKE '%ffooo%'
│
└── • index join
│ estimated row count: 0
│ table: a@a_pkey
│
└── • inverted filter
│ inverted column: a_inverted_key
│ num spans: 3
│
└── • scan
estimated row count: 0 (<0.01% of the table; stats collected 2 seconds ago)
table: a@a_a_idx
spans: 3 spans
(19 rows)
Time: 1ms total (execution 1ms / network 0ms)
Jira issue: CRDB-17708
Metadata
Metadata
Assignees
Labels
Type
Projects
Status