Add a new "lookup" query vector builder#141488
Add a new "lookup" query vector builder#141488elasticsearchmachine merged 23 commits intoelastic:mainfrom
Conversation
| client.prepareSearch(index) | ||
| .setQuery(new IdsQueryBuilder().addIds(id)) | ||
| .setRouting(routing) | ||
| .setPreference("_local") | ||
| .setFetchSource(false) | ||
| .storedFields(_NONE_) | ||
| .addDocValueField(path) | ||
| .setSize(1) | ||
| .execute(ActionListener.wrap(searchResponse -> { |
There was a problem hiding this comment.
I thought we could start with plain get but this is the optimised version ;).
I am not sure whether the fact that get uses a different thread pool is a feature or not though.
In this case, we would expect that the get hits cold data so we'd leave more room to parallelise the reads.
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
jimczi
left a comment
There was a problem hiding this comment.
LGTM, we can start with _search to do the optimised lookup.
Let's still plan for a follow up to allow doc values retrieval in get so that we can use the natural api to do the lookup?
| client.prepareSearch(index) | ||
| .setQuery(new IdsQueryBuilder().addIds(id)) | ||
| .setRouting(routing) | ||
| .setPreference("_local") | ||
| .setFetchSource(false) | ||
| .storedFields(_NONE_) | ||
| .addDocValueField(path) | ||
| .setSize(1) | ||
| .execute(ActionListener.wrap(searchResponse -> { |
There was a problem hiding this comment.
I thought we could start with plain get but this is the optimised version ;).
I am not sure whether the fact that get uses a different thread pool is a feature or not though.
In this case, we would expect that the get hits cold data so we'd leave more room to parallelise the reads.
🔍 Preview links for changed docs |
ℹ️ Important: Docs version tagging👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version. We use applies_to tags to mark version-specific features and changes. Expand for a quick overviewWhen to use applies_to tags:✅ At the page level to indicate which products/deployments the content applies to (mandatory) What NOT to do:❌ Don't remove or replace information that applies to an older version 🤔 Need help?
|
Mikep86
left a comment
There was a problem hiding this comment.
Heads up that I'm almost certain that InterceptedInferenceKnnVectorQueryBuilder will need to be updated to handle this new query vector builder. That class was written with the assumption that TextEmbeddingQueryVectorBuilder was the only query vector builder type (see here, for example), which is no longer true.
| } | ||
|
|
||
| // TODO is this what we want to test? | ||
| public void testKnnQueryLookupCcsMinimizeRoundTripsTrue() throws Exception { |
There was a problem hiding this comment.
@Mikep86 here is the sort of thing I am thinking about. Basically, let's send a lookupqueryvectorbuilder that points to the local cluster (mandatory for lookup) and just points to the knn vector field. It does assume there is a value there (or it will throw a not-found).
There was a problem hiding this comment.
These tests look like a good start to me. I'll branch off of this and make some changes to the interception logic to stabilize the tests. Then we can evaluate if we want those interception logic changes to be in a separate PR or merged into this. Does that work?
There was a problem hiding this comment.
I know, this one is on my list to do ASAP. Ranger duty and all...
There was a problem hiding this comment.
@Mikep86 you good! I just know I forget about things.
There was a problem hiding this comment.
I created an issue for this to track my work on it: #142141
There was a problem hiding this comment.
Now that #142254 is merged, LookupQueryVectorBuilder should work with semantic search CCS 🎉
I don't think we need to explicitly test LookupQueryVectorBuilder with CCS here, as we have tests with a generic query vector builder in #142254. As long as LookupQueryVectorBuilder follows the contract of throwing in buildVector on incomplete input or an error, we should be good. WDYT?
docs/reference/query-languages/query-dsl/query-dsl-knn-query.md
Outdated
Show resolved
Hide resolved
| Elasticsearch supports knn queries with a vector that is stored within an index. | ||
|
|
||
| Here is an example utilizing lookup. |
There was a problem hiding this comment.
can you clarify what the lookup is actually doing here? "that looks up the vector in a separate index using the lookup parameter" maybe?
docs/reference/query-languages/query-dsl/query-dsl-knn-query.md
Outdated
Show resolved
Hide resolved
Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
…into vector-look-qvp
This adds a new "lookup" query vector builder: This significantly simplifies the user experience when attempting to find other vectors that are similar to a vector stored within an index. closes: elastic#141069
Adding docs for the new `lookup` query vector builder. Waiting on elastic/elasticsearch#141488 to be merged.
This adds a new "lookup" query vector builder:
This significantly simplifies the user experience when attempting to find other vectors that are similar to a vector stored within an index.
closes: #141069