Add a new "lookup" query vector builder #141488

Merged
elasticsearchmachine merged 23 commits into elastic:main from benwtrent:vector-look-qvp
Feb 28, 2026

benwtrent (Member) commented Jan 28, 2026

This adds a new "lookup" query vector builder:

This significantly simplifies the user experience when attempting to find other vectors that are similar to a vector stored within an index.

closes: #141069
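
For orientation, here is a rough sketch (not the example from the PR description) of how a kNN search section using the new builder might be assembled in Java. The keys `index`, `id`, `path`, and `routing` are inferred from the variable names in the search call reviewed below and are assumptions, as is the `lookup` key itself; `my-index`, `doc-1`, and `my_vector` are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LookupRequestSketch {

    // Assumed shape of the "lookup" builder. The keys mirror the variables in the
    // reviewed search call (index, id, path, routing); they are not confirmed here.
    static Map<String, Object> lookupBuilder(String index, String id, String path, String routing) {
        Map<String, Object> lookup = new LinkedHashMap<>();
        lookup.put("index", index);
        lookup.put("id", id);
        lookup.put("path", path);
        if (routing != null) {
            lookup.put("routing", routing); // assumed to be optional
        }
        Map<String, Object> builder = new LinkedHashMap<>();
        builder.put("lookup", lookup);
        return builder;
    }

    // Wraps a query vector builder in a kNN search section.
    static Map<String, Object> knnSection(String field, int k, int numCandidates, Map<String, Object> builder) {
        Map<String, Object> knn = new LinkedHashMap<>();
        knn.put("field", field);
        knn.put("k", k);
        knn.put("num_candidates", numCandidates);
        knn.put("query_vector_builder", builder);
        return knn;
    }

    public static void main(String[] args) {
        // Placeholder values: find neighbours of the vector stored on document "doc-1".
        Map<String, Object> builder = lookupBuilder("my-index", "doc-1", "my_vector", null);
        System.out.println(knnSection("my_vector", 10, 100, builder));
    }
}
```

The point of the sketch is only that the caller names a stored document and a field path instead of supplying the raw vector.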

Comment on lines +100 to +108
client.prepareSearch(index)
    .setQuery(new IdsQueryBuilder().addIds(id))
    .setRouting(routing)
    .setPreference("_local")
    .setFetchSource(false)
    .storedFields(_NONE_)
    .addDocValueField(path)
    .setSize(1)
    .execute(ActionListener.wrap(searchResponse -> {
Member Author

@jimczi is this what you had in mind?

Contributor

I thought we could start with plain get but this is the optimised version ;).
I am not sure whether the fact that get uses a different thread pool is a feature or not though.
In this case, we would expect that the get hits cold data so we'd leave more room to parallelise the reads.

elasticsearchmachine added the Team:Search Relevance label Jan 28, 2026
elasticsearchmachine (Collaborator)

Pinging @elastic/es-search-relevance (Team:Search Relevance)

jimczi (Contributor) left a comment

LGTM, we can start with _search to do the optimised lookup.
Let's still plan for a follow up to allow doc values retrieval in get so that we can use the natural api to do the lookup?

github-actions bot (Contributor) commented Jan 29, 2026

🔍 Preview links for changed docs

github-actions bot (Contributor)

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

Mikep86 (Contributor) left a comment

Heads up that I'm almost certain that InterceptedInferenceKnnVectorQueryBuilder will need to be updated to handle this new query vector builder. That class was written with the assumption that TextEmbeddingQueryVectorBuilder was the only query vector builder type (see here, for example), which is no longer true.

}

// TODO is this what we want to test?
public void testKnnQueryLookupCcsMinimizeRoundTripsTrue() throws Exception {
Member Author

@Mikep86 here is the sort of thing I am thinking about. Basically, let's send a LookupQueryVectorBuilder that points to the local cluster (mandatory for lookup) and just points to the knn vector field. It assumes there is a value there (otherwise it will throw a not-found).

Contributor

These tests look like a good start to me. I'll branch off of this and make some changes to the interception logic to stabilize the tests. Then we can evaluate if we want those interception logic changes to be in a separate PR or merged into this. Does that work?

Member Author

Works great, thank you @Mikep86!

Member Author

@Mikep86 friendly ping :)

Contributor

I know, this one is on my list to do ASAP. Ranger duty and all...

Member Author

@Mikep86 you're good! I just know I forget about things.

Contributor

I created an issue for this to track my work on it: #142141

Contributor

Now that #142254 is merged, LookupQueryVectorBuilder should work with semantic search CCS 🎉

I don't think we need to explicitly test LookupQueryVectorBuilder with CCS here, as we have tests with a generic query vector builder in #142254. As long as LookupQueryVectorBuilder follows the contract of throwing in buildVector on incomplete input or an error, we should be good. WDYT?
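
The contract mentioned above (throw from buildVector on incomplete input or on an error) can be sketched roughly as follows. This is a simplified, synchronous stand-in: `VectorBuilder`, `InMemoryLookup`, and the in-memory map are hypothetical names for illustration, not the Elasticsearch classes, and the exception types are assumptions.

```java
import java.util.Map;
import java.util.NoSuchElementException;

public class LookupContractSketch {

    /** Hypothetical stand-in for a query vector builder; not the Elasticsearch interface. */
    interface VectorBuilder {
        float[] buildVector();
    }

    /** Hypothetical lookup builder backed by an in-memory map instead of an index. */
    static final class InMemoryLookup implements VectorBuilder {
        private final Map<String, float[]> store; // document id -> stored vector
        private final String id;

        InMemoryLookup(Map<String, float[]> store, String id) {
            this.store = store;
            this.id = id;
        }

        @Override
        public float[] buildVector() {
            // Incomplete input: fail loudly instead of returning null or an empty vector.
            if (id == null || id.isEmpty()) {
                throw new IllegalArgumentException("[id] is required for a lookup");
            }
            float[] vector = store.get(id);
            // Missing document or missing value: surface a not-found style error.
            if (vector == null) {
                throw new NoSuchElementException("no vector found for id [" + id + "]");
            }
            return vector.clone(); // defensive copy so callers cannot mutate the store
        }
    }

    public static void main(String[] args) {
        Map<String, float[]> store = Map.of("doc-1", new float[] {0.1f, 0.2f, 0.3f});
        System.out.println(new InMemoryLookup(store, "doc-1").buildVector().length);
        try {
            new InMemoryLookup(store, "missing").buildVector();
        } catch (NoSuchElementException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

A caller that relies on this behaviour never has to distinguish "no vector" from "empty vector"; either a usable vector comes back or an exception propagates.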

Comment on lines +368 to +370
Elasticsearch supports knn queries with a vector that is stored within an index.

Here is an example utilizing lookup.
Contributor

can you clarify what the lookup is actually doing here? "that looks up the vector in a separate index using the lookup parameter" maybe?

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
benwtrent requested a review from a team as a code owner February 26, 2026 17:55
benwtrent added the auto-merge-without-approval label Feb 26, 2026
elasticsearchmachine merged commit ce8f5f2 into elastic:main Feb 28, 2026
35 checks passed
tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026
benwtrent added a commit to elastic/docs-content that referenced this pull request Mar 5, 2026
Adding docs for the new `lookup` query vector builder. Waiting on
elastic/elasticsearch#141488 to be merged.

Labels

auto-merge-without-approval (Automatically merge pull request when CI checks pass; NB doesn't wait for reviews!)
>enhancement
:Search Relevance/Vectors (Vector search)
Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
v9.4.0

Development

Successfully merging this pull request may close these issues.

Add new "lookup" for query_vector_builder

5 participants