Summary
gbrain query (hybrid) returns the same top documents with identical scores for completely unrelated queries — including nonsense input. gbrain search (BM25) is query-responsive and correct. The hybrid/vector path appears to ignore the query.
Environment
- gbrain 0.41.11.1 (also reproduced on 0.35.6.0 before upgrading)
- Engine: postgres
- Embedding: ollama:nomic-embed-text, 768d (gbrain doctor: embedding_provider ✓, embedding_width_consistency OK)
- macOS / bun
Repro
gbrain query "wyckoff accumulation" # top-3 docs A,B,C @ ~0.99/0.98/0.97
gbrain query "banana recipe quantum" # IDENTICAL top-3 A,B,C @ IDENTICAL scores
gbrain query "..." --no-expand # still identical
gbrain search "wyckoff accumulation" # correct, query-responsive
gbrain search "banana recipe quantum" # different/correct results
Different queries produce identical gbrain query results and identical cosine scores, so the query vector used in the hybrid search appears to be fixed, i.e. independent of the input.
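The comparison above can be scripted so it is easy to re-run after an upgrade or a patch. This is a minimal Python sketch, assuming gbrain is on PATH and prints its ranked results to stdout; the helper name outputs_match is hypothetical and not part of gbrain.

```python
import subprocess


def outputs_match(cmd_a, cmd_b):
    """Run two commands and report whether their stdout text is identical."""
    out_a = subprocess.run(cmd_a, capture_output=True, text=True, check=True).stdout
    out_b = subprocess.run(cmd_b, capture_output=True, text=True, check=True).stdout
    return out_a == out_b


if __name__ == "__main__":
    # Assumption: `gbrain query` writes the ranked documents and scores to stdout.
    query_a = ["gbrain", "query", "wyckoff accumulation"]
    query_b = ["gbrain", "query", "banana recipe quantum"]
    # Unrelated queries should not match; True here reproduces the bug.
    print("identical output:", outputs_match(query_a, query_b))
```

If the output echoes the query text itself, the raw comparison will report a difference even when the rankings are the same, so the ranking lines would need to be extracted first.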
Ruled out (investigated)
- Embedding model: ollama nomic-embed-text returns distinct vectors for distinct inputs (cosine ≈0.47 between the two example queries, via both /api/embeddings and /v1/embeddings, with and without input_type).
- Semantic query cache: after gbrain cache clear --yes, fresh (cache-miss) queries are still identical.
- Query expansion: --no-expand still identical.
- BM25 (search): correct and query-responsive, so the data and the BM25 path are fine.
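The embedding-model check in the first bullet can be reproduced with a short script. This is a sketch in Python, assuming Ollama listens on its default local address (http://localhost:11434) and that /api/embeddings accepts a JSON body with model and prompt fields and returns an embedding array. The helper names embed and cosine are hypothetical.

```python
import json
import math
import urllib.request

# Assumption: a local Ollama instance on its default port.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def embed(text, model="nomic-embed-text"):
    """Request one embedding vector from Ollama for `text`."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / norm


if __name__ == "__main__":
    v1 = embed("wyckoff accumulation")
    v2 = embed("banana recipe quantum")
    print(f"dims = {len(v1)}, cosine = {cosine(v1, v2):.2f}")
```

A cosine well below 1.0 (this report measured about 0.47) means the model itself distinguishes the two inputs, which points the problem at how the query vector is produced or used downstream.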
Hypothesis
The query-side embedding (embedQuery → gateway embed([text], {inputType:'query'})) or its use in core/search/hybrid.ts may produce or use a fixed query vector. On every query this warning prints: [ai.gateway] recipe "google" declares an embedding touchpoint without max_batch_tokens. It may be related to query-embedding routing, even though the configured embedding model is ollama:nomic-embed-text.
Happy to provide more detail or test a patch.