Update semantic search CCS to support generic query vector builders #142254

Merged
Mikep86 merged 28 commits into elastic:main from Mikep86:semantic-search_support-generic-query-vector-builders
Feb 25, 2026

@Mikep86
Contributor

@Mikep86 Mikep86 commented Feb 10, 2026

The current semantic search CCS implementation assumes that the only QueryVectorBuilder implementation is TextEmbeddingQueryVectorBuilder. However, this will soon not be the case. There is already a PR to add a new "lookup" query vector builder, and more variants may be added later.

This PR updates the semantic search CCS implementation to support generic query vector builders. The high-level approach is:

  • Unless handling a special case, assume any given QueryVectorBuilder can be used to build a query vector on the coordinator node. If it cannot, the query vector builder's buildVector method is responsible for throwing an exception.
  • Add logic to intercepted knn queries to rewrite a query vector builder to a query vector on the coordinator node. The resulting query vector is stored in the original query that was intercepted.
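The two steps above can be sketched roughly as follows. This is a simplified illustration, not the Elasticsearch implementation: `VectorBuilderSketch`, `KnnQuerySketch`, and `CoordinatorRewriteSketch` are hypothetical stand-ins, and the registered actions run synchronously here, whereas the real rewrite context runs them asynchronously.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical stand-in for a query vector builder (not the Elasticsearch interface).
interface VectorBuilderSketch {
    // Delivers the vector to the listener, or throws if the vector cannot be built.
    void buildVector(Consumer<float[]> listener);
}

// Hypothetical stand-in for an intercepted knn query.
final class KnnQuerySketch {
    final VectorBuilderSketch builder;       // set until the vector is resolved
    final float[] vector;                    // set once the vector is resolved
    final AtomicReference<float[]> pending;  // filled in by the registered action

    KnnQuerySketch(VectorBuilderSketch builder, float[] vector, AtomicReference<float[]> pending) {
        this.builder = builder;
        this.vector = vector;
        this.pending = pending;
    }

    static KnnQuerySketch withBuilder(VectorBuilderSketch builder) {
        return new KnnQuerySketch(builder, null, null);
    }

    // One rewrite pass; work that must finish before the next pass is added to actions.
    KnnQuerySketch rewrite(List<Runnable> actions) {
        if (pending != null) {
            float[] resolved = pending.get();
            // Swap the builder for the vector once the action has delivered it.
            return resolved == null ? this : new KnnQuerySketch(null, resolved, null);
        }
        if (builder != null) {
            AtomicReference<float[]> holder = new AtomicReference<>();
            actions.add(() -> builder.buildVector(holder::set));
            return new KnnQuerySketch(builder, null, holder);
        }
        return this; // no builder left to rewrite
    }
}

final class CoordinatorRewriteSketch {
    // Rewrites until a pass changes nothing, running registered actions between passes.
    static KnnQuerySketch rewriteFully(KnnQuerySketch query) {
        while (true) {
            List<Runnable> actions = new ArrayList<>();
            KnnQuerySketch rewritten = query.rewrite(actions);
            if (rewritten == query && actions.isEmpty()) {
                return query;
            }
            actions.forEach(Runnable::run);
            query = rewritten;
        }
    }
}
```

In this sketch, a builder that throws from `buildVector` aborts the rewrite with that exception, which mirrors the "the builder is responsible for throwing" rule in the first bullet.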

TextEmbeddingQueryVectorBuilders still need special handling, because they can be used for semantic search even when incomplete. When a TextEmbeddingQueryVectorBuilder does not specify an inference ID, we extract the model text from it and use that to generate the inference results necessary to query the specified semantic_text field(s).
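As a rough illustration of the incomplete-builder case, the sketch below derives one inference request per distinct inference ID from the queried fields. The class name, the field-to-inference-ID map, and the endpoint names in the usage note are assumptions made for the example; the actual resolution logic in InferenceQueryUtils is not shown in this PR description.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class InferredInferenceIdsSketch {
    // Returns inference ID -> model text, with one entry per distinct inference ID
    // configured on the queried fields. Fields without an inference ID are skipped.
    static Map<String, String> inferenceRequests(
        String modelText,
        List<String> queriedFields,
        Map<String, String> fieldInferenceIds
    ) {
        Map<String, String> requests = new LinkedHashMap<>();
        for (String field : queriedFields) {
            String inferenceId = fieldInferenceIds.get(field);
            if (inferenceId != null) {
                // The same model text is sent to every inferred endpoint.
                requests.put(inferenceId, modelText);
            }
        }
        return requests;
    }
}
```

For example, querying three fields where two share `endpoint-a` and one uses `endpoint-b` yields two requests, both carrying the same model text.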

This approach also allows us to simplify the intercepted query logic overall. We can remove the concept of a general "inference ID override" and replace it with custom coordinator node rewrite actions that generate query vectors as necessary.

@Mikep86
Contributor Author

Mikep86 commented Feb 10, 2026

@elasticmachine update branch

Comment on lines +384 to +385
// Other query vector builder types always require validation
queryVectorBuilder.validate();
Member

We don't need this new interface. You still do the instanceof check before it, so it isn't really providing anything.

Contributor Author

Hmmm, I think I see how to do that. Some special handling would be required for TextEmbeddingQueryVectorBuilder, but all other query vector builders would be assumed to be complete and able to build vectors.

Member

The only special handling is that we inject the model into TextEmbeddingQueryVectorBuilder, right? If that's the case, that is the only thing that ever requires special handling. Everything else should be treated as "it should just work and if you don't give us a vector, shame on you".

Contributor Author

@Mikep86 Mikep86 Feb 11, 2026

Essentially yes. The special handling for TextEmbeddingQueryVectorBuilder is that if a model ID isn't specified, we pull the model text from it and then infer the inference ID(s) from the semantic text fields queried.

All other query vector builders should "just work" though, and if they don't, they should throw an error when buildVector is called.
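A minimal sketch of that contract, assuming a hypothetical builder that reads a stored vector from an in-memory map. Neither the interface nor the class below is the Elasticsearch API, and this is not how the planned "lookup" builder works; it only illustrates failing at buildVector time with a descriptive error.

```java
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a query vector builder (not the Elasticsearch interface).
interface FailFastVectorBuilderSketch {
    void buildVector(Consumer<float[]> listener);
}

// Hypothetical builder that resolves its vector from an in-memory store.
final class MapLookupBuilderSketch implements FailFastVectorBuilderSketch {
    private final Map<String, float[]> store;
    private final String key;

    MapLookupBuilderSketch(Map<String, float[]> store, String key) {
        this.store = store;
        this.key = key;
    }

    @Override
    public void buildVector(Consumer<float[]> listener) {
        float[] vector = store.get(key);
        if (vector == null) {
            // The builder, not the caller, explains why no vector could be produced.
            throw new IllegalStateException("no vector stored under key [" + key + "]");
        }
        listener.accept(vector);
    }
}
```

The caller needs no per-type validation step: it invokes `buildVector` and lets any failure surface from the builder itself.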

Member

++ yes, I think this is good. I think the logic is that "rewrite as normal if NOT TextEmbeddingQueryVectorBuilder and if you are, let's handle our special case"

Contributor Author

Done in c3da5be

@Mikep86 Mikep86 marked this pull request as ready for review February 13, 2026 20:20
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance label Feb 13, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

Member

@benwtrent benwtrent left a comment

@Mikep86 it's difficult to parse what is "refactoring" vs. actually fixing the bug. Could you extract the refactoring pieces that don't adjust behavior at all into their own PR? This SEEMS like the new "QueryRewriteAction" things.

@Mikep86
Contributor Author

Mikep86 commented Feb 23, 2026

@benwtrent Sure, but the refactoring pieces would just be moving the QueryRewriteAsyncAction extensions into dedicated files.

@Mikep86 Mikep86 requested a review from benwtrent February 24, 2026 12:53
Comment on lines +198 to +225
    QueryVectorBuilder queryVectorBuilder = originalQuery.queryVectorBuilder();
    if (queryVectorBuilder != null) {
        boolean registerAction = false;
        if (queryVectorBuilder instanceof TextEmbeddingQueryVectorBuilder tevb) {
            // TextEmbeddingQueryVectorBuilder is a special case. If a model ID is set, we register an action to generate
            // the query vector. If not, the model text will be returned via getQuery() so that InferenceQueryUtils can
            // generate the appropriate inference results for the inferred inference ID(s).
            if (tevb.getModelId() != null) {
                registerAction = true;
            }
        } else {
            // We register an action to generate the query vector for all other query vector builders. If they cannot, buildVector()
            // should throw an error indicating why.
            registerAction = true;
        }

        if (registerAction) {
            SetOnce<float[]> newQueryVectorSupplier = new SetOnce<>();
            queryRewriteContext.registerUniqueAsyncAction(
                new QueryVectorBuilderAsyncAction(queryVectorBuilder),
                newQueryVectorSupplier::set
            );
            return new InterceptedInferenceKnnVectorQueryBuilder(queryBuilder, originalQuery, newQueryVectorSupplier);
        }
    }

    return queryBuilder;
}
Member

nice

Member

@benwtrent benwtrent left a comment

Looking at the knn-specific things, this looks good.

I don't know anything about the changes to the InferenceQueryUtils class, but they seem to be fallout from getInferenceIdOverride. I am not sure why we ever needed that interface. Where does the logic reside now? Was it ever needed for the fqdn inference id?

@Mikep86
Contributor Author

Mikep86 commented Feb 24, 2026

@benwtrent

I don't know anything about the changes to the InferenceQueryUtils class, but they seem to be fallout from getInferenceIdOverride. I am not sure why we ever needed that interface. Where does the logic reside now? Was it ever needed for the fqdn inference id?

The inference ID override was how we used to handle the case where the user provided an inference ID to the query vector builder or sparse_vector query. Basically, InferenceQueryUtils would always perform query-time inference for intercepted queries, even for "complete" query vector builders. If the user didn't provide an inference ID, we inferred it from the semantic text field(s) queried. If they did, we set the override.

This has been simplified with this updated implementation. Now, InferenceQueryUtils is only used for query-time inference when inferring the inference ID(s) from semantic_text fields. In cases where the user explicitly provides an inference ID, we handle query-time inference directly in the intercepted query. Thus, we can remove the concept of an inference ID override in InferenceQueryUtils.

@Mikep86
Contributor Author

Mikep86 commented Feb 25, 2026

@elasticmachine update branch

@Mikep86 Mikep86 merged commit 798eb10 into elastic:main Feb 25, 2026
36 checks passed

Labels

>non-issue, :Search Relevance/Vectors, Team:Search Relevance, v9.4.0

4 participants