ESQL - Add semantic_text support for knn function#133806
ESQL - Add semantic_text support for knn function#133806carlosdelest merged 24 commits intoelastic:mainfrom
Conversation
…-semantic-text' into non-issue/esql-knn-support-semantic-text
| import org.junit.ClassRule; | ||
|
|
||
| @ThreadLeakFilters(filters = TestClustersThreadFilter.class) | ||
| public class KnnSemanticTextIT extends KnnSemanticTextTestCase { |
There was a problem hiding this comment.
Added both single and multi node ITs, that extend from a common superclass (used SeamnticMatchTestCase as a template)
There was a problem hiding this comment.
Adds a test inference endpoint for text embedding tasks
There was a problem hiding this comment.
Adds a semantic_text field that uses a dense_vector field
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
…-semantic-text' into non-issue/esql-knn-support-semantic-text
| /** | ||
| * Tests kNN queries on semantic_text fields. Mostly checks errors on the data node that can't be checked in other tests. | ||
| */ | ||
| public class KnnSemanticTextTestCase extends ESRestTestCase { |
There was a problem hiding this comment.
I wonder if we should add test cases here for using knn over 2 different dense endpoints. Note that this may conflict with this open PR: #133675
There was a problem hiding this comment.
I think it's not needed, as the SemanticQueryBuilder is not used on this approach - it's just knn being done over two dense_vector fields. The inference_id mechanism for retrieving the embeddings from the text is not used in this function.
|
|
||
| private TypeResolution resolveField() { | ||
| return isNotNull(field(), sourceText(), FIRST).and(isType(field(), dt -> dt == DENSE_VECTOR, sourceText(), FIRST, "dense_vector")); | ||
| return isNotNull(field(), sourceText(), FIRST).and( |
There was a problem hiding this comment.
should we also update:
and regenerate the docs?
…pport-semantic-text # Conflicts: # x-pack/plugin/esql/qa/testFixtures/src/main/resources/knn-function.csv-spec # x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/vector/Knn.java # x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/VerifierTests.java
🔍 Preview links for changed docs |
ℹ️ Important: Docs version tagging👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version. We use applies_to tags to mark version-specific features and changes. Expand for a quick overviewWhen to use applies_to tags:✅ At the page level to indicate which products/deployments the content applies to (mandatory) What NOT to do:❌ Don't remove or replace information that applies to an older version 🤔 Need help?
|
knnfunction should be used withsemantic_textfields, provided that the inference service is a text embedding task.As
field_capsreturns text as data type forsemantic_textfields, we need to check fortexton the knn verification. The query will fail on the data node in case the underlying field mapping is not adense_vector.Closes #132066