Skip to content

Add dense_vector equality and inequality support in ES|QL#140005

Merged
mromaios merged 14 commits intoelastic:mainfrom
majiayu000:dense-vector-equality-support
Mar 11, 2026
Merged

Add dense_vector equality and inequality support in ES|QL#140005
mromaios merged 14 commits intoelastic:mainfrom
majiayu000:dense-vector-equality-support

Conversation

@majiayu000
Copy link
Copy Markdown
Contributor

Summary

Adds support for binary comparison operators (==, !=) for dense_vector type in ES|QL.

Resolves #139929

Changes

  • Added EqualsDenseVectorEvaluator and NotEqualsDenseVectorEvaluator classes to handle dense_vector comparison
  • Updated Equals.java and NotEquals.java with dense_vector support in evaluatorMap and @param annotations
  • Added DENSE_VECTOR_EQUALITY capability in EsqlCapabilities
  • Updated test classes with dense_vector test cases
  • Added csv-spec tests for dense_vector equality operations

Two dense vectors are equal if they have the same dimensions and all corresponding elements are equal.

Test plan

  • EqualsTests with dense_vector test cases
  • NotEqualsTests with dense_vector test cases
  • EqualsErrorTests updated
  • NotEqualsErrorTests updated
  • dense_vector.csv-spec integration tests

@cla-checker-service
Copy link
Copy Markdown

cla-checker-service bot commented Dec 26, 2025

💚 CLA has been signed

@elasticsearchmachine elasticsearchmachine added v9.4.0 needs:triage Requires assignment of a team area label external-contributor Pull request authored by a developer outside the Elasticsearch team labels Dec 26, 2025
@github-actions
Copy link
Copy Markdown
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

Expand for a quick overview

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

majiayu000 added a commit to majiayu000/elasticsearch that referenced this pull request Dec 26, 2025
Signed-off-by: majiayu000 <1835304752@qq.com>
@nik9000 nik9000 requested a review from carlosdelest December 26, 2025 12:58
@nik9000 nik9000 added :Search Relevance/ES|QL Search functionality in ES|QL and removed needs:triage Requires assignment of a team area label labels Dec 26, 2025
@nik9000
Copy link
Copy Markdown
Member

nik9000 commented Dec 26, 2025

@carlosdelest, could you look at this one?

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Dec 26, 2025
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@majiayu000
Copy link
Copy Markdown
Contributor Author

@elasticmachine recheck CLA

Copy link
Copy Markdown
Member

@carlosdelest carlosdelest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

Some comments about adding test cases - can you please address them @majiayu000 ?

suppliers.add(
new TestCaseSupplier("<dense_vector>, <different dense_vector>", List.of(DataType.DENSE_VECTOR, DataType.DENSE_VECTOR), () -> {
List<Float> left = randomDenseVector();
List<Float> right = randomDenseVector();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice thinking about vectors actually being different! I think it would be better to use other testing constructs for this:

Suggested change
List<Float> right = randomDenseVector();
List<Float> right = randomValueOtherThan(left, EsTestCase::randomDenseVector());

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated in 0fd20d0

List<Float> left = randomDenseVector();
List<Float> right = randomDenseVector();
// Make sure vectors are different
if (left.equals(right) && left.size() > 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated in 0fd20d0

@carlosdelest carlosdelest self-assigned this Jan 7, 2026
majiayu000 added a commit to majiayu000/elasticsearch that referenced this pull request Jan 7, 2026
@majiayu000
Copy link
Copy Markdown
Contributor Author

Hi @carlosdelest, I have updated the PR to address your review comments

Comment on lines +57 to +61
required_capability: dense_vector_field_type_released
required_capability: dense_vector_agg_metric_double_if_version
required_capability: l2_norm_vector_similarity_function
required_capability: generic_vector_format
required_capability: dense_vector_equality
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use only required_capability with the needed capabilities for the test to work - in this case:

Suggested change
required_capability: dense_vector_field_type_released
required_capability: dense_vector_agg_metric_double_if_version
required_capability: l2_norm_vector_similarity_function
required_capability: generic_vector_format
required_capability: dense_vector_equality
required_capability: dense_vector_equality

We can skip checking other capabilities as they are superseded by dense_vector_equality, which is newer

@majiayu000
Copy link
Copy Markdown
Contributor Author

Hi @carlosdelest, thanks for the additional feedback! I have updated the CSV specs to:

  1. Simplify the capabilities to only require dense_vector_equality as suggested.
  2. Add test cases for dimension mismatch (e.g., denseVectorCompareDifferentDimensions) in all 4 spec files.
    Let me know if this looks good!

@carlosdelest
Copy link
Copy Markdown
Member

@elasticsearchmachine test this please

@carlosdelest
Copy link
Copy Markdown
Member

@elasticsearchmachine test this please

@carlosdelest
Copy link
Copy Markdown
Member

@majiayu000 I've added some commits to address some issues. I see that we have some compile errors in EqualsTests and NotEqualsTests, can you please address them?

Also, I liked these versions better when they were testing equality and inequality. Why did you change that to a fixed amount of random tests? 🤔 The original approach was better IMO

@majiayu000
Copy link
Copy Markdown
Contributor Author

@carlosdelest Thanks for the review!

I've fixed the compilation errors in EqualsTests and NotEqualsTests (missing static imports). Regarding the test coverage, you were right — the fixed set of random tests wasn't ideal. I've refactored it to use the TestCaseSupplier.casesCrossProduct approach with a stable set of random vectors. This ensures we test the full cross-product (both equality and inequality scenarios) while keeping the implementation clean.

Ready for another look!

@majiayu000 majiayu000 force-pushed the dense-vector-equality-support branch 2 times, most recently from c89ab28 to f6d34e5 Compare January 9, 2026 14:47
@carlosdelest
Copy link
Copy Markdown
Member

@majiayu000 thanks for iterating on this! I'm afraid that I don't see the changes you mention on the tests. Can you please double check?

Also, please don't force push - it's easier for me to review with the full commit history 👍

@majiayu000
Copy link
Copy Markdown
Contributor Author

majiayu000 commented Jan 10, 2026

@carlosdelest thanks for checking. I've double checked the commit f6d34e5 and it contains the changes for EqualsTests and NotEqualsTests (imports and randomDenseVector logic). I also pushed a new commit cdfa93d with the same refactoring for InsensitiveEqualsTests.

Please double check f6d34e5 for EqualsTests/NotEqualsTests and cdfa93d for InsensitiveEqualsTests. I didn't force push, so the history is preserved.

@mromaios
Copy link
Copy Markdown
Contributor

mromaios commented Mar 2, 2026

Hey @majiayu000, addressed some pending test comments in 0fd20d0 and run the tests, but the csv-spec tests are failing.

From a quick look, it's two issues:

  1. We should wrap the arrays with to_dense_vector() as these operators don't have implicit casting
  2. In some tests where the value in the column is null, the expected one should be null too I think, but it's currently missing from the test: e.g. here

If you can have a look when you get the chance?
Thanks!

@mromaios
Copy link
Copy Markdown
Contributor

mromaios commented Mar 8, 2026

@elasticsearchmachine test this please

@majiayu000 majiayu000 force-pushed the dense-vector-equality-support branch from a3ced57 to a6944a2 Compare March 9, 2026 03:26
@majiayu000
Copy link
Copy Markdown
Contributor Author

@mromaios Both issues fixed — wrapped arrays with to_dense_vector() and added missing null expectations. Rebased on latest main.

@mromaios
Copy link
Copy Markdown
Contributor

mromaios commented Mar 9, 2026

Thanks @majiayu000! I think this looks good now. Took care of some compilation errors after an interface move.

Running CI, (hopefully for the last time 🤞)

@mromaios
Copy link
Copy Markdown
Contributor

mromaios commented Mar 9, 2026

@elasticsearchmachine test this please

@mromaios
Copy link
Copy Markdown
Contributor

mromaios commented Mar 9, 2026

@elasticsearchmachine test this please

@mromaios
Copy link
Copy Markdown
Contributor

mromaios commented Mar 9, 2026

@elasticsearchmachine test this please

@mromaios
Copy link
Copy Markdown
Contributor

@elasticmachine run elasticsearch-ci/pr-upgrade-part-3, elasticsearch-ci/rest-compatibility please

Copy link
Copy Markdown
Member

@carlosdelest carlosdelest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you for your work @majiayu000 and @mromaios !

@mromaios
Copy link
Copy Markdown
Contributor

Thanks again for the contribution @majiayu000 and for the many rounds of reviews @carlosdelest and @kkharbas.
I'll merge this in once we have a green CI.

@mromaios
Copy link
Copy Markdown
Contributor

@elasticmachine run Elasticsearch Serverless Checks, elasticsearch-ci/rest-compatibility please

@mromaios
Copy link
Copy Markdown
Contributor

@elasticmachine run elasticsearch-ci/rest-compatibility please

@mromaios
Copy link
Copy Markdown
Contributor

@elasticsearchmachine test this please

@mromaios mromaios merged commit 187e39f into elastic:main Mar 11, 2026
37 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team :Search Relevance/ES|QL Search functionality in ES|QL Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.4.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ES|QL - dense_vector binary equality and inequality (==, !=) support

6 participants