LocalAI version:
v4.3.6-59-gef80a0e8 (ef80a0e)
Environment, CPU architecture, OS, and Version:
Linux x86_64 / Compute Node configuration / PostgreSQL backend with pgvector
Describe the bug
When querying a large database collection (~2.4 million rows) via the /api/agents/collections/{name}/search endpoint using the PostgreSQL backend, the internal query engine constructs a SQL layout that completely blinds the PostgreSQL query planner.
Instead of executing an index-backed nearest-neighbor search, it triggers a full sequential table scan (seqscan). On massive datasets, this calculation takes longer than 10 seconds, consistently hitting LocalAI's internal client-side context deadline and dropping the request with an empty response payload ({"count":0,"results":[]}).
To Reproduce
- Populate a collection table via the PostgreSQL backend with a large dataset (>2,000,000 rows).
- Build a standard
pgvector HNSW index on the embedding column.
- Submit a standard vector search request to the API collection endpoint.
- Check PostgreSQL engine logs to observe a statement cancellation precisely at the 10-second mark.
Expected behavior
The query engine should leverage the underlying pgvector HNSW/IVFFlat index structure to return the nearest neighbors within milliseconds, regardless of the overall row count.
Logs:
Jun 05 08:38:34 compute postgres[1989806]: LOG: execute stmtcache_c2e7e...:
Jun 05 08:38:34 compute postgres[1989806]: SELECT
Jun 05 08:38:34 compute postgres[1989806]: id::text,
Jun 05 08:38:34 compute postgres[1989806]: COALESCE(title, '') as title,
Jun 05 08:38:34 compute postgres[1989806]: content,
Jun 05 08:38:34 compute postgres[1989806]: metadata,
Jun 05 08:38:34 compute postgres[1989806]: (
Jun 05 08:38:34 compute postgres[1989806]: COALESCE((1 - (embedding <=> $3::vector)), 0) * $4
Jun 05 08:38:34 compute postgres[1989806]: ) as similarity
Jun 05 08:38:34 compute postgres[1989806]: FROM documents_literature
Jun 05 08:38:34 compute postgres[1989806]: WHERE embedding IS NOT NULL
Jun 05 08:38:34 compute postgres[1989806]: ORDER BY similarity DESC
Jun 05 08:38:34 compute postgres[1989806]: LIMIT $5
...
Jun 05 08:38:44 compute postgres[1989806]: ERROR: canceling statement due to statement timeout
Additional context:
The mathematical abstraction wrapper used to calculate a combined similarity score inside the SELECT clause completely invalidates pgvector's HNSW indexing path:
- Index Invalidation: PostgreSQL's HNSW implementation can only optimize an unadulterated, raw distance operator sort path directly in the clause:
ORDER BY embedding <=> $3::vector.
- The Scalar Trap: Because the query targets a dynamic scalar alias (
ORDER BY similarity DESC), the query planner cannot walk the HNSW index tree. It is mathematically forced to calculate the distance operator, handle the COALESCE logic, and multiply it by the scalar weight parameter ($4) for all 2.4 million rows sequentially before sorting in-memory.
Verification of Underlying Index Health:
Extracting the target vector out of the query payload and executing a raw database connection via native Go source code completely bypasses the bug and works instantaneously:
// Direct query execution using the raw distance operator natively runs in <10ms
sqlQuery := `
SELECT id::text, COALESCE(title, ''), content
FROM documents_literature
WHERE embedding IS NOT NULL
ORDER BY embedding <=> $1::vector
LIMIT 3;`
Suggested Fix:
Modify the backend query builder so that if a vector search is being executed, the distance operator is passed natively into the ORDER BY clause without mathematical modifiers or scalar transformations that break index parsing.
Note: This technical optimization analysis, logs trace verification, and root-cause mapping were investigated and drafted in collaboration with Google's Gemini.
LocalAI version:
v4.3.6-59-gef80a0e8 (ef80a0e)
Environment, CPU architecture, OS, and Version:
Linux x86_64 / Compute Node configuration / PostgreSQL backend with pgvector
Describe the bug
When querying a large database collection (~2.4 million rows) via the
/api/agents/collections/{name}/searchendpoint using the PostgreSQL backend, the internal query engine constructs a SQL layout that completely blinds the PostgreSQL query planner.Instead of executing an index-backed nearest-neighbor search, it triggers a full sequential table scan (
seqscan). On massive datasets, this calculation takes longer than 10 seconds, consistently hitting LocalAI's internal client-side context deadline and dropping the request with an empty response payload ({"count":0,"results":[]}).To Reproduce
pgvectorHNSW index on theembeddingcolumn.Expected behavior
The query engine should leverage the underlying
pgvectorHNSW/IVFFlat index structure to return the nearest neighbors within milliseconds, regardless of the overall row count.Logs:
Additional context:
The mathematical abstraction wrapper used to calculate a combined
similarityscore inside theSELECTclause completely invalidatespgvector's HNSW indexing path:ORDER BY embedding <=> $3::vector.ORDER BY similarity DESC), the query planner cannot walk the HNSW index tree. It is mathematically forced to calculate the distance operator, handle theCOALESCElogic, and multiply it by the scalar weight parameter ($4) for all 2.4 million rows sequentially before sorting in-memory.Verification of Underlying Index Health:
Extracting the target vector out of the query payload and executing a raw database connection via native Go source code completely bypasses the bug and works instantaneously:
Suggested Fix:
Modify the backend query builder so that if a vector search is being executed, the distance operator is passed natively into the
ORDER BYclause without mathematical modifiers or scalar transformations that break index parsing.Note: This technical optimization analysis, logs trace verification, and root-cause mapping were investigated and drafted in collaboration with Google's Gemini.