Fix bulk scoring to process last batch instead of falling through to scalar tail #145316

Merged
ldematte merged 4 commits into elastic:main from ldematte:native/fix-bulk-last-batch
Mar 31, 2026

@ldematte
Contributor

@ldematte ldematte commented Mar 31, 2026

This PR fixes a small issue in bulk scoring functions where the last batch of vectors was unnecessarily dropped to the single-vector tail loop.

Bulk loops used c + 2 * batches - 1 < count as the loop condition, which exits when there aren't enough vectors for both the current batch AND a next batch to prefetch. This means the last full batch (where there's no next batch to prefetch) was always processed one-by-one in the scalar tail.

This PR changes the loop condition to c + batches - 1 < count (process all full batches) and guards the prefetch with const bool has_next = c + 2 * batches - 1 < count. This pattern was already used in vec_i4_2.cpp (AVX-512 int4); it is now applied consistently everywhere.
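
For readers who want to see the effect of the two loop conditions side by side, here is a minimal sketch. It is not the code from this PR: score_bulk, dot_one, prefetch_hint and PathCounts are hypothetical names, and a plain scalar dot product stands in for the real SIMD kernels. Only the loop condition and the has_next guard are taken from the description above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical scalar kernel standing in for the real SIMD scoring code.
static int32_t dot_one(const int8_t* a, const int8_t* q, size_t dims) {
    int32_t sum = 0;
    for (size_t i = 0; i < dims; ++i) sum += int32_t(a[i]) * int32_t(q[i]);
    return sum;
}

// Illustration only: counts how many vectors each path handled.
struct PathCounts {
    size_t batched = 0;     // vectors scored inside the bulk loop
    size_t tail = 0;        // vectors scored one by one in the tail loop
    size_t prefetched = 0;  // prefetch hints issued for a following batch
};

// Hypothetical prefetch wrapper; a no-op on compilers without the builtin.
static inline void prefetch_hint(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p);
#else
    (void)p;
#endif
}

// process_last_batch == false mimics the old loop condition,
// process_last_batch == true mimics the condition described in this PR.
template <size_t batches>
PathCounts score_bulk(const int8_t* vectors, const int8_t* query, size_t dims,
                      size_t count, int32_t* results, bool process_last_batch) {
    PathCounts counts;
    size_t c = 0;
    // Old: c + 2 * batches - 1 < count (needs a current AND a next batch).
    // New: c + batches - 1 < count     (needs only the current batch).
    const size_t lookahead = process_last_batch ? batches : 2 * batches;
    for (; c + lookahead - 1 < count; c += batches) {
        // Prefetch only when a complete next batch actually exists.
        const bool has_next = c + 2 * batches - 1 < count;
        if (has_next) {
            for (size_t b = 0; b < batches; ++b) {
                prefetch_hint(vectors + (c + batches + b) * dims);
                ++counts.prefetched;
            }
        }
        for (size_t b = 0; b < batches; ++b) {
            results[c + b] = dot_one(vectors + (c + b) * dims, query, dims);
        }
        counts.batched += batches;
    }
    // Tail: whatever did not fit into a full batch.
    for (; c < count; ++c) {
        results[c] = dot_one(vectors + c * dims, query, dims);
        ++counts.tail;
    }
    return counts;
}
```

With 10 vectors and batches = 4, the old condition sends one batch through the bulk loop and six vectors through the tail, while the new condition sends two batches through the bulk loop and leaves only two vectors for the tail. The scores are identical in both cases; only the path taken differs.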

Also fixes > to >= in SIMD stride checks across all files, so that when dims equals exactly the stride length, we use the SIMD path instead of falling through to scalar.
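
The stride check can be illustrated the same way. This is a sketch under assumptions, not the PR's code: STRIDE, DotResult and dot_strided are hypothetical, and the inner lane loop is a stand-in for a single SIMD multiply-accumulate.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stride: number of elements consumed per SIMD step.
constexpr size_t STRIDE = 16;

// Illustration only: the result plus how the work was split.
struct DotResult {
    int32_t value;
    size_t simd_steps;    // full-stride steps taken on the "SIMD" path
    size_t scalar_elems;  // elements handled by the scalar loop
};

// inclusive_check == false mimics the old check (dims > STRIDE),
// inclusive_check == true mimics the fixed check (dims >= STRIDE).
DotResult dot_strided(const int8_t* a, const int8_t* b, size_t dims,
                      bool inclusive_check) {
    int32_t sum = 0;
    size_t i = 0;
    size_t steps = 0;
    const bool use_simd = inclusive_check ? dims >= STRIDE : dims > STRIDE;
    if (use_simd) {
        for (; i + STRIDE <= dims; i += STRIDE) {
            // Stand-in for one SIMD multiply-accumulate over STRIDE lanes.
            for (size_t lane = 0; lane < STRIDE; ++lane) {
                sum += int32_t(a[i + lane]) * int32_t(b[i + lane]);
            }
            ++steps;
        }
    }
    const size_t scalar_elems = dims - i;
    for (; i < dims; ++i) sum += int32_t(a[i]) * int32_t(b[i]);
    return {sum, steps, scalar_elems};
}
```

When dims equals STRIDE exactly, the old check skips the SIMD path and handles all 16 elements in the scalar loop, whereas the inclusive check takes one SIMD step and leaves nothing for the scalar loop. For any other dims the two checks behave the same.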

Relates to #145411

Test plan

  • JDKVectorLibrary*Tests pass locally on Apple Silicon (aarch64)
  • JDKVectorLibraryInt8Tests pass on AMD c8a (x64 AVX-512)

@elasticsearchmachine added the v9.4.0 and needs:triage labels Mar 31, 2026
@ldematte added the :Search Relevance/Vectors and >non-issue labels and removed the needs:triage label Mar 31, 2026
@ldematte requested review from ChrisHegarty and thecoop and removed request for thecoop March 31, 2026 11:38
@elasticsearchmachine added the Team:Search Relevance label Mar 31, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@ldematte ldematte requested a review from a team as a code owner March 31, 2026 11:44
Contributor

@ChrisHegarty ChrisHegarty left a comment

LGTM

@ldematte ldematte enabled auto-merge (squash) March 31, 2026 13:34
@ldematte ldematte merged commit 1b0c52d into elastic:main Mar 31, 2026
35 checks passed
@ldematte ldematte deleted the native/fix-bulk-last-batch branch March 31, 2026 14:01
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 31, 2026
…rics

* upstream/main: (21 commits)
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:external-basic.topSnippetsFunction} elastic#145353
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:external-basic.scoreFunction} elastic#145352
  [DiskBBQ] Fix bug in NeighborQueue#popRawAndAddRaw (elastic#145324)
  Fix dense_vector default index options when using BFLOAT16 (elastic#145202)
  Use checked exceptions in entitlement constructor rules (elastic#145234)
  ESQL: DS: datasource file plugins should not return TEXT types (elastic#145334)
  Plumb DLM error store through to DlmFrozenTransition classes (elastic#145243)
  Make Settings.Builder.remove() fluent (elastic#145294)
  Add FLS tests for METRICS_INFO and TS_INFO (elastic#145211)
  Fix flaky SecurityFeatureResetTests (elastic#145063)
  [DOCS] Fix conflict markers in ESQL processing command list (elastic#145338)
  Skip certain metric assertions on Windows (elastic#144933)
  [ES|QL] Add schema reconciliation for multi-file external sources (elastic#145220)
  Simplify DiskBBQ dynamic visit ratio to linear (elastic#142784)
  ESQL: Disallow unmapped_fields=load with partial non-KEYWORD (elastic#144109)
  [Transform] Track Linked Projects (elastic#144399)
  Fix bulk scoring to process last batch instead of falling through to scalar tail (elastic#145316)
  Clean up TickerScheduleEngineTests (elastic#145303)
  [CI] ShardBulkInferenceActionFilterIT testRestart - Ensuring that secrets-inference index is available after full restart and unmuting test (elastic#145317)
  Add CRUD doc to the DistributedArchitectureGuide (elastic#144710)
  ...
ncordon pushed a commit to ncordon/elasticsearch that referenced this pull request Apr 1, 2026
…scalar tail (elastic#145316)

mromaios pushed a commit to mromaios/elasticsearch that referenced this pull request Apr 9, 2026
…scalar tail (elastic#145316)


Labels

>non-issue, :Search Relevance/Vectors, Team:Search Relevance, v9.4.0

4 participants