
Avoid extra roundtrips in ANN search #88921

@jtibshirani


We recently folded ANN search into the _search endpoint and deprecated the _knn_search endpoint (#87625). The top k vector results represent the global nearest neighbors across all shards. To support this, we use the DFS phase to collect the top k vector matches per shard, combine them and keep the global top k to pass on to the query phase.
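To make the combine step concrete, here is a minimal sketch of merging per-shard top k results into a global top k. It is an illustration only and not the Elasticsearch implementation: the `Hit` type, the `merge_top_k` function, and the sample scores are all hypothetical names and values made up for this example.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """Hypothetical vector match: which shard it came from, its doc id, its score."""
    shard: int
    doc_id: int
    score: float


def merge_top_k(per_shard_hits: Iterable[List[Hit]], k: int) -> List[Hit]:
    """Combine each shard's local top k and keep the k best hits overall."""
    all_hits = (hit for hits in per_shard_hits for hit in hits)
    # nlargest returns the hits sorted by descending score.
    return heapq.nlargest(k, all_hits, key=lambda h: h.score)


# Made-up per-shard results, as they might look after the DFS phase.
shard_results = [
    [Hit(0, 12, 0.91), Hit(0, 7, 0.40)],
    [Hit(1, 3, 0.88), Hit(1, 9, 0.87)],
]
global_top = merge_top_k(shard_results, k=2)
```

In this sketch the second-best hit of shard 1 is dropped even though it beats shard 0's second hit locally, which is why the merge has to happen before the query phase can use the results.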

This means we perform an extra roundtrip compared to the old _knn_search API. We should optimize some common cases to remove this roundtrip:

  • When there is only ANN search and no query or aggs, could we just use the query phase to compute the top k nearest neighbors and avoid using a DFS phase?
  • When there's a single shard, maybe we could also just use the query phase. We would add in the ANN search as a boolean clause on the main query. This would resolve the TODO noted in a comment on #88002 (Add 'knn' section to search endpoint).
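The two cases above can be sketched as a small decision function. This only restates the proposal in this issue; it does not describe what Elasticsearch actually does, and `SearchRequest`, its fields, and `needs_knn_dfs_roundtrip` are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class SearchRequest:
    """Hypothetical, simplified view of a _search request body."""
    has_knn: bool
    has_query: bool = False
    has_aggs: bool = False


def needs_knn_dfs_roundtrip(request: SearchRequest, num_shards: int) -> bool:
    """Return True if the extra DFS roundtrip for ANN search is still needed."""
    if not request.has_knn:
        # No ANN search, so there are no per-shard vector matches to collect.
        return False
    if num_shards == 1:
        # Single shard: the shard-local top k is already the global top k.
        return False
    if not request.has_query and not request.has_aggs:
        # ANN only: the query phase could compute the nearest neighbors itself.
        return False
    # ANN combined with a query or aggs across several shards.
    return True
```

Under these assumptions, only a multi-shard request that mixes ANN search with a query or aggregations would keep paying for the extra roundtrip.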
