In many of our vector search benchmarks we time pure compute - timing the vector operation when both vectors are in CPU cache. In many scenarios, e.g. HNSW, we score vectors that may not be in cache. In these scenarios it may be better to reflow the bulk scorer to process small batches (say, 4 vectors) at a time, rather than aggressively unrolling per single vector. Doing so improves the memory-level parallelism of the complete bulk operation: cache misses on several vectors can be in flight concurrently instead of stalling the loop one vector at a time. We've already seen this improve float32 vector ops in Lucene.
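A minimal sketch of the idea in Java (class and method names are hypothetical, not Lucene APIs): instead of fully computing each dot product before moving on, score a batch of 4 targets per pass with 4 independent accumulators, so loads from the 4 possibly-uncached vectors can overlap.

```java
final class BatchDotProduct {
    // Naive bulk scorer: one vector at a time. A cache miss on
    // targets[t] stalls the entire inner loop before the next
    // vector's loads can even begin.
    static void scoreOneAtATime(float[] query, float[][] targets, float[] scores) {
        for (int t = 0; t < targets.length; t++) {
            float sum = 0f;
            for (int i = 0; i < query.length; i++) {
                sum += query[i] * targets[t][i];
            }
            scores[t] = sum;
        }
    }

    // Batched bulk scorer: 4 independent accumulators per pass.
    // The loads from v0..v3 carry no data dependence on each other,
    // so misses on the 4 target vectors can be serviced concurrently,
    // improving memory-level parallelism of the bulk operation.
    static void scoreBatched(float[] query, float[][] targets, float[] scores) {
        int t = 0;
        for (; t + 4 <= targets.length; t += 4) {
            float[] v0 = targets[t], v1 = targets[t + 1];
            float[] v2 = targets[t + 2], v3 = targets[t + 3];
            float s0 = 0f, s1 = 0f, s2 = 0f, s3 = 0f;
            for (int i = 0; i < query.length; i++) {
                float q = query[i];
                s0 += q * v0[i];
                s1 += q * v1[i];
                s2 += q * v2[i];
                s3 += q * v3[i];
            }
            scores[t] = s0;
            scores[t + 1] = s1;
            scores[t + 2] = s2;
            scores[t + 3] = s3;
        }
        // Tail: fewer than 4 vectors remain.
        for (; t < targets.length; t++) {
            float sum = 0f;
            for (int i = 0; i < query.length; i++) {
                sum += query[i] * targets[t][i];
            }
            scores[t] = sum;
        }
    }
}
```

Note this is the opposite trade-off from aggressive per-vector unrolling: the unroll factor here spans *vectors*, not elements of one vector, which is what lets the independent memory streams overlap.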