
Add JDK22+ heap-backed native vector scorer suppliers#142812

Open
arup-chauhan wants to merge 2 commits into elastic:main from arup-chauhan:native-scorers-hnsw-build

Conversation

@arup-chauhan

Description

This PR implements an Elasticsearch-first fix for #142379 by enabling native vector scorer suppliers during the array-backed phase of HNSW graph building (JDK 22+), while preserving existing off-heap paths and Lucene fallback behavior.

Context from issue discussion:

  • During initial HNSW build, vectors may come from heap arrays (vectorValue), so the existing index-slice/off-heap native supplier path is not always used.
  • We add a JDK22+ heap-backed MemorySegment supplier path in Elasticsearch first, as requested in the issue thread.
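The heap-backed idea above rests on the JDK 22+ ability to wrap an on-heap array in a `MemorySegment` without copying. A minimal sketch (the scorer wiring itself is not shown; the native-call detail is an assumption based on the JDK 22 FFM API, where heap segments may be passed to downcalls declared with `Linker.Option.critical(true)`):

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class HeapSegmentSketch {
    public static void main(String[] args) {
        // Wrap a heap float[] (e.g. one returned by FloatVectorValues#vectorValue)
        // in a MemorySegment without copying.
        float[] vector = new float[] {1.0f, 2.0f, 3.0f};
        MemorySegment seg = MemorySegment.ofArray(vector);

        // The segment views the array directly: 3 floats * 4 bytes.
        System.out.println(seg.byteSize());                              // 12
        System.out.println(seg.getAtIndex(ValueLayout.JAVA_FLOAT, 1));   // 2.0
    }
}
```

Such a heap segment can then be handed to the native distance functions for the array-backed phase of the HNSW build, avoiding a copy into off-heap memory.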

Changes

  1. Extended the VectorScorerFactory API with array-backed supplier methods:
  • getFloatVectorScorerSupplier(VectorSimilarityType, FloatVectorValues)
  • getByteVectorScorerSupplier(VectorSimilarityType, ByteVectorValues)
  2. Implemented array-backed native suppliers in simdvec:
  • heap float supplier
  • heap byte supplier
  3. Wired the factory implementation:
  • VectorScorerFactoryImpl now returns heap-backed suppliers for array-backed values.
  4. Updated the ES scorer selection path:
  • ES93FlatVectorScorer now tries, in order:
    • the index-slice native supplier (existing behavior)
    • the array-backed native supplier (new behavior)
    • the Lucene fallback supplier
  5. Added test coverage for the new array-backed path:
  • FloatVectorScorerFactoryTests.testArrayBackedRandomSupplier
  • ByteVectorScorerFactoryTests.testArrayBackedRandomSupplier
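The three-step selection in item 4 can be sketched as a simple fallback chain. All names below are hypothetical stand-ins for the real supplier lookups, which return empty when a path is unsupported:

```java
import java.util.Optional;

public class ScorerSelectionSketch {
    // Hypothetical stand-ins: each lookup yields a supplier, or empty if
    // that path is unavailable for the given vector values.
    static Optional<String> indexSliceNativeSupplier() { return Optional.empty(); }
    static Optional<String> arrayBackedNativeSupplier() { return Optional.of("native-heap"); }
    static String luceneFallbackSupplier() { return "lucene"; }

    static String selectSupplier() {
        // Order mirrors the PR description: off-heap native first,
        // then the new heap/array-backed native path, then Lucene.
        return indexSliceNativeSupplier()
            .or(ScorerSelectionSketch::arrayBackedNativeSupplier)
            .orElseGet(ScorerSelectionSketch::luceneFallbackSupplier);
    }

    public static void main(String[] args) {
        System.out.println(selectSupplier()); // "native-heap" in this sketch
    }
}
```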

Behavior / Safety

  • New array-backed native supplier path is explicitly gated to JDK 22+ (Runtime.version().feature() >= 22).
  • If unsupported/incompatible, behavior falls back to existing Lucene scorer path.
  • Existing off-heap/index-slice path remains unchanged.
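The gate described above is the standard runtime feature-version check; a minimal sketch:

```java
public class Jdk22Gate {
    // The array-backed native supplier path is only taken on JDK 22+,
    // per the gating condition quoted above.
    static boolean heapSegmentPathSupported() {
        return Runtime.version().feature() >= 22;
    }

    public static void main(String[] args) {
        System.out.println(heapSegmentPathSupported()
            ? "array-backed native suppliers enabled"
            : "falling back to Lucene scorers");
    }
}
```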

Validation

Ran with runtime JDK 25 (JDK22+ path active):

./gradlew :libs:simdvec:compileMain21Java :server:compileJava
./gradlew :libs:simdvec:test \
  --tests org.elasticsearch.simdvec.FloatVectorScorerFactoryTests.testArrayBackedRandomSupplier \
  --tests org.elasticsearch.simdvec.ByteVectorScorerFactoryTests.testArrayBackedRandomSupplier
./gradlew :server:test \
  --tests org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapperTests.testKnnQuantizedFlatVectorsFormat \
  --tests org.elasticsearch.index.mapper.vectors.DenseVectorFieldMapperTests.testKnnQuantizedHNSWVectorsFormat
./gradlew :libs:simdvec:spotlessApply
./gradlew :libs:simdvec:spotlessJavaCheck :server:spotlessJavaCheck

All above commands completed successfully.

@elasticsearchmachine added the v9.4.0, needs:triage (Requires assignment of a team area label), and external-contributor (Pull request authored by a developer outside the Elasticsearch team) labels on Feb 22, 2026
Member

@benwtrent benwtrent left a comment


  1. please benchmark
  2. bulk scoring actually needs to be bulk scoring

}

@Override
HeapByteVectorScorerSupplier copyInternal() {
Member


why doesn't this just override copy directly?

Author


@thecoop I removed the copyInternal() indirection; each concrete heap supplier now overrides copy() directly.

@thecoop added the :Search Relevance/Vectors (Vector search) label and removed the needs:triage (Requires assignment of a team area label) label on Feb 23, 2026
@elasticsearchmachine added the Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch) label on Feb 23, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@arup-chauhan
Author

  1. please benchmark
  2. bulk scoring actually needs to be bulk scoring

@benwtrent thanks, this is addressed now.

Bulk scoring is now truly bulk. I updated the heap-backed path so bulkScore(...) no longer loops over score(...) one-by-one.

It now packs the selected vectors into a contiguous buffer, calls the native bulk functions, and then applies the similarity-specific normalization step.
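A minimal sketch of that packing step, with all names hypothetical (the real code operates on the supplier's vector values and calls into the simdvec native layer, which is not shown here):

```java
import java.util.Arrays;

public class BulkPackSketch {
    // Gather the selected ordinals into one contiguous buffer so a single
    // native bulk call can score them, instead of looping score(...) per vector.
    static float[] packSelected(float[][] vectors, int[] ordinals, int dims) {
        float[] packed = new float[ordinals.length * dims];
        for (int i = 0; i < ordinals.length; i++) {
            System.arraycopy(vectors[ordinals[i]], 0, packed, i * dims, dims);
        }
        return packed;
    }

    public static void main(String[] args) {
        float[][] vectors = { {1f, 2f}, {3f, 4f}, {5f, 6f} };
        // Score ordinals 2 and 0 against a query: pack them back-to-back first.
        float[] packed = packSelected(vectors, new int[] {2, 0}, 2);
        System.out.println(Arrays.toString(packed)); // [5.0, 6.0, 1.0, 2.0]
    }
}
```

The contiguous buffer is what lets the native side process all candidates in one call; the similarity-specific normalization is then applied to the returned raw scores.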

I also ran a focused indexing benchmark with qa/vector (HNSW, byte vectors, 128 dims, 100k docs):

  • JDK21: 2336 ms
  • JDK25: 1668 ms
  • about 28.6% faster

While running this, I also found a bug in ordinal handling during incremental HNSW build (we were effectively treating values.size() as fixed). I fixed that by checking
ordinals against the current values.size() at runtime.
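A sketch of that fix, with hypothetical names: validate each ordinal against the current size of the growing vector store at call time, rather than a size captured when the supplier was created.

```java
public class OrdinalCheckSketch {
    // During an incremental HNSW build the store grows, so the bound must be
    // re-read on every access instead of being treated as fixed.
    static void checkOrdinal(int ord, int currentSize) {
        if (ord < 0 || ord >= currentSize) {
            throw new IllegalArgumentException(
                "ordinal " + ord + " out of bounds for current size " + currentSize);
        }
    }

    public static void main(String[] args) {
        checkOrdinal(3, 5);              // in range: fine
        try {
            checkOrdinal(5, 5);          // stale/oversized ordinal: rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```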

Signed-off-by: Arup Chauhan <arupchauhan.connect@gmail.com>
Signed-off-by: Arup Chauhan <arupchauhan.connect@gmail.com>
@arup-chauhan force-pushed the native-scorers-hnsw-build branch from ae0440b to 50955d4 on March 2, 2026 at 11:15
@ldematte
Contributor

ldematte commented Mar 2, 2026

Hello @arup-chauhan, thanks for the benchmarks. Can you add some more details on how you run them?
I do not question your numbers, but I got the opposite result when I tried an approach similar to yours: indexing times got much worse (a 50% or more increase relative to the default Lucene implementation).
My benchmarks were different, though. I used float32 and higher dimensions, so the vectors were definitely larger (probably 16x larger than the ones you used), which might explain the big difference. Also, I ran my benchmarks on ARM.

@arup-chauhan
Author

arup-chauhan commented Mar 2, 2026

Hey @ldematte, thanks for checking this.

You’re right that my run is not directly comparable to yours. Here is exactly what I ran:

Command:
./gradlew :qa:vector:checkVec --args="qa/vector/configs/my-config.json"

Config:
{
  "doc_vectors": ["target/knn_data/docs-128d-120k.bvec"],
  "num_docs": 100000,
  "index_type": "hnsw",
  "hnsw_m": 16,
  "hnsw_ef_construction": 200,
  "vector_encoding": "byte",
  "dimensions": -1,
  "reindex": true
}

Results (indexing only):

  • previous run: doc_add_time=1635ms, total_index_time=3753ms
  • JDK 25 run: doc_add_time=351ms, total_index_time=2268ms

Here is my hardware:

  • CPU: Apple M4
  • Arch: arm64
  • Cores: 10 (4P + 6E)
  • Memory: 16 GB
  • Java: 25.0.2 (LTS) and Java 21
  • Branch: native-scorers-hnsw-build

These were byte vectors, 128 dims, 100k docs, and no search/query was executed in this config (search metrics are zero). So this does not cover float32 + higher dimensions, where behavior can differ significantly.

I agree ARM + larger float vectors may change the outcome materially.
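The rough arithmetic behind the "16x larger" estimate in the thread, assuming a 512-dim float32 configuration on the other side (that dimension count is an illustrative assumption, not stated in the discussion):

```java
public class VectorSizeSketch {
    public static void main(String[] args) {
        int byteVec = 128 * Byte.BYTES;    // 128 bytes/vector in this benchmark
        int floatVec = 512 * Float.BYTES;  // 2048 bytes/vector, a plausible float32 config
        System.out.println(floatVec / byteVec); // 16: per-vector size ratio
    }
}
```

Larger per-vector footprints change cache behavior and the cost of any packing/copying step, which could plausibly flip the benchmark outcome.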

@benwtrent
Member

@arup-chauhan isn't it obvious that you need to benchmark with float and byte? you only did byte.


Labels

>enhancement · external-contributor (Pull request authored by a developer outside the Elasticsearch team) · :Search Relevance/Vectors (Vector search) · Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch) · v9.5.0
