Support cosine similarity in kNN search #79500

Merged
jtibshirani merged 2 commits into elastic:master from jtibshirani:cosine-similarity
Oct 21, 2021

@jtibshirani
Contributor

This PR adds support for `cosine` similarity:

```
"mappings": {
  "properties": {
    "my_vector": {
      "type": "dense_vector",
      "dims": 128,
      "index": true,
      "similarity": "cosine"
    }
  }
}
```

Unlike `dot_product`, which requires vectors to be of unit length, this
similarity can handle vectors with any magnitude.
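
To illustrate the difference, here is a minimal Python sketch (not code from this PR) of how the two similarities relate. The helper names `dot`, `magnitude`, `normalize`, and `cosine_similarity` are hypothetical, and rejecting zero vectors is an assumption of the sketch rather than documented behavior of the PR.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def magnitude(v):
    """Euclidean (L2) length of a vector."""
    return math.sqrt(dot(v, v))


def normalize(v):
    """Scale a vector to unit length."""
    m = magnitude(v)
    if m == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / m for x in v]


def cosine_similarity(a, b):
    """Dot product divided by the product of the magnitudes."""
    norm = magnitude(a) * magnitude(b)
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / norm


a = [3.0, 4.0]  # magnitude 5, not unit length
b = [4.0, 3.0]  # magnitude 5, not unit length

# Cosine similarity divides out the magnitudes, so scaling an input does not change it.
same_direction = cosine_similarity(a, [6.0, 8.0])

# On unit-length vectors, the dot product gives the same value as cosine similarity.
via_dot = dot(normalize(a), normalize(b))
via_cosine = cosine_similarity(a, b)
```

Under these assumptions, `via_dot` and `via_cosine` agree up to floating-point error, which is why `dot_product` only yields a cosine score when the vectors are already normalized.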

This PR also adds validation around `dot_product` to help catch mistakes. When
indexing vectors, we double-check that each vector has unit length. We also
check that kNN query vectors have unit length.
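
As a rough sketch of what a unit-length check can look like (this is not the validation code from the PR), the Python snippet below compares the squared magnitude against 1 within a tolerance. The function name `check_unit_length` and the tolerance `EPSILON = 1e-4` are assumptions made for illustration.

```python
EPSILON = 1e-4  # hypothetical tolerance, not taken from the PR


def check_unit_length(vector, eps=EPSILON):
    """Raise ValueError unless the vector's squared magnitude is within eps of 1."""
    squared_magnitude = sum(x * x for x in vector)
    if abs(squared_magnitude - 1.0) > eps:
        raise ValueError(
            "dot_product similarity expects unit-length vectors, "
            f"but the squared magnitude was {squared_magnitude}"
        )


# A unit-length vector passes silently.
check_unit_length([0.6, 0.8])

# A vector with magnitude 5 is rejected.
try:
    check_unit_length([3.0, 4.0])
    rejected = False
except ValueError:
    rejected = True
```

Comparing the squared magnitude avoids a square root per vector, which keeps such a check cheap on the indexing path.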

@jtibshirani added the :Search/Search and v8.0.0 labels on Oct 19, 2021
@jtibshirani mentioned this pull request on Oct 19, 2021
@jtibshirani
Contributor Author

I ran benchmarks and confirmed that the new validation for `dot_product` does not significantly affect performance.

@jtibshirani marked this pull request as ready for review on October 19, 2021 17:53
@elasticmachine added the Team:Search label on Oct 19, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@mayya-sharipova self-requested a review on October 20, 2021 01:38
Contributor

@mayya-sharipova left a comment

@jtibshirani Thanks, LGTM!

@jtibshirani merged commit 7f01138 into elastic:master on Oct 21, 2021
@jtibshirani deleted the cosine-similarity branch on October 21, 2021 21:43
lockewritesdocs pushed a commit to lockewritesdocs/elasticsearch that referenced this pull request Oct 28, 2021
@jtibshirani added the :Search Relevance/Vectors label and removed the :Search/Search label on Jul 21, 2022
4 participants