Generalize BBQ and INT4 bulk templates with TData parameter #145459

Merged: ChrisHegarty merged 13 commits into elastic:main from ChrisHegarty:bbq_int4_template_refactor on Apr 3, 2026

Conversation

@ChrisHegarty
Contributor

Follow-up to #144634. Extends the TData mapper template generalization to the BBQ (dotd1q4, dotd2q4, dotd4q4) and INT4 (doti4) bulk scoring templates on both amd64 and aarch64 tier-1, which were not covered by the original refactoring.

This is a pure refactor with no functional change. The existing sequential_mapper and offsets_mapper instantiations now explicitly pass int8_t as TData rather than hardcoding const int8_t* in the template signature.
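
For readers unfamiliar with the pattern, here is a minimal sketch of what a bulk template parameterized on TData can look like. It is not the code from this PR: the mapper signatures, the `bulk_score` name, and the scalar `dot_i8` kernel are assumptions made for illustration, and the real templates use SIMD kernels for BBQ and INT4.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mapper: vector i starts i * pitch bytes after the base pointer.
static inline const int8_t* sequential_mapper(const int8_t* base, int32_t i,
                                              const int32_t* /*offsets*/, int32_t pitch) {
    return base + static_cast<int64_t>(i) * pitch;
}

// Hypothetical mapper: vector i starts offsets[i] * pitch bytes after the base pointer.
static inline const int8_t* offsets_mapper(const int8_t* base, int32_t i,
                                           const int32_t* offsets, int32_t pitch) {
    return base + static_cast<int64_t>(offsets[i]) * pitch;
}

// Scalar stand-in for a SIMD dot-product kernel (assumption, for illustration only).
static inline int32_t dot_i8(const int8_t* a, const int8_t* b, int32_t dims) {
    int32_t sum = 0;
    for (int32_t d = 0; d < dims; ++d) {
        sum += static_cast<int32_t>(a[d]) * static_cast<int32_t>(b[d]);
    }
    return sum;
}

// The data pointer type is derived from TData instead of being hardcoded as
// const int8_t*. The mapper turns (data, index) into the address of one vector.
template <typename TData,
          const int8_t* (*mapper)(const TData*, int32_t, const int32_t*, int32_t)>
static inline void bulk_score(const TData* data, const int8_t* query, int32_t dims,
                              int32_t pitch, const int32_t* offsets, int32_t count,
                              float* results) {
    assert(count >= 0 && dims >= 0);
    for (int32_t i = 0; i < count; ++i) {
        const int8_t* vec = mapper(data, i, offsets, pitch);
        results[i] = static_cast<float>(dot_i8(vec, query, dims));
    }
}
```

Under these assumptions, an existing call site would instantiate the template as `bulk_score<int8_t, sequential_mapper>(...)` or `bulk_score<int8_t, offsets_mapper>(...)`, which is the "explicitly pass int8_t as TData" step described above.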

Add a TData template parameter to the BBQ (dotd1q4, dotd2q4, dotd4q4)
and INT4 (doti4) bulk scoring templates on both amd64 and aarch64 tier-1.
This aligns them with call_i8_bulk in vec_1.cpp, which already uses TData
to support sequential_mapper, offsets_mapper, and sparse_mapper through
the same template. No functional change — existing sequential and offsets
instantiations are updated to pass int8_t as TData explicitly.
@ChrisHegarty ChrisHegarty added the >refactoring, :Search Relevance/Vectors, and Team:Search Relevance labels Apr 1, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@ChrisHegarty ChrisHegarty requested a review from a team as a code owner April 1, 2026 15:55
Contributor

@ldematte ldematte left a comment

👍

@ChrisHegarty ChrisHegarty added the test-arm label Apr 2, 2026
@ChrisHegarty ChrisHegarty merged commit 26c4815 into elastic:main Apr 3, 2026
41 checks passed
@ChrisHegarty ChrisHegarty deleted the bbq_int4_template_refactor branch April 3, 2026 10:55
ChrisHegarty added a commit that referenced this pull request Apr 3, 2026
This PR adds BULK_SPARSE native exports for BBQ (d1q4, d2q4, d4q4) and packed INT4 vector dot-product operations on both amd64 and aarch64. This fills out BULK_SPARSE support for these element types, consistent with INT7U and INT8 which already have sparse operations.

Unlike BULK_OFFSETS, which requires all vectors to reside in a single contiguous memory region, BULK_SPARSE accepts an array of independent memory addresses, one per vector. This enables efficient bulk scoring over scatter-gather data, such as when vectors are backed by DirectAccessInput with its 16MiB region boundaries.

The new native functions use the sparse_mapper with the generalized TData bulk templates introduced in #145459. JdkVectorLibrary is updated to enable BULK_SPARSE for BBQ and INT4 with appropriate bounds checking, and Similarities gains corresponding method handles and Java wrapper methods. Unit tests cover contiguous slices, scattered (non-contiguous) allocations, and illegal argument validation for both BBQ (parameterized across d1q4, d2q4, d4q4) and INT4.
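
The difference between the two addressing modes can be sketched in a few lines. This is an illustration under assumptions, not the exported native code: the mapper signatures, the `bulk_dot` name, and the scalar kernel are hypothetical. The point it shows is that a sparse mapper needs TData to be a pointer type, so that the data argument is a table of addresses rather than a single base pointer.

```cpp
#include <cassert>
#include <cstdint>

// BULK_OFFSETS style (assumed shape): one contiguous region, per-vector offsets.
static inline const int8_t* offsets_mapper(const int8_t* base, int32_t i,
                                           const int32_t* offsets, int32_t pitch) {
    return base + static_cast<int64_t>(offsets[i]) * pitch;
}

// BULK_SPARSE style (assumed shape): one independent address per vector.
// Offsets and pitch are unused because each entry already points at a vector.
static inline const int8_t* sparse_mapper(const int8_t* const* addresses, int32_t i,
                                          const int32_t* /*offsets*/, int32_t /*pitch*/) {
    return addresses[i];
}

// Scalar stand-in for the real SIMD kernels (assumption, for illustration only).
static inline int32_t dot_i8(const int8_t* a, const int8_t* b, int32_t dims) {
    int32_t sum = 0;
    for (int32_t d = 0; d < dims; ++d) {
        sum += static_cast<int32_t>(a[d]) * static_cast<int32_t>(b[d]);
    }
    return sum;
}

// With TData = int8_t the data argument is const int8_t*.
// With TData = const int8_t* it becomes const int8_t* const*, an address table.
template <typename TData,
          const int8_t* (*mapper)(const TData*, int32_t, const int32_t*, int32_t)>
static inline void bulk_dot(const TData* data, const int8_t* query, int32_t dims,
                            int32_t pitch, const int32_t* offsets, int32_t count,
                            int32_t* results) {
    assert(count >= 0 && dims >= 0);
    for (int32_t i = 0; i < count; ++i) {
        results[i] = dot_i8(mapper(data, i, offsets, pitch), query, dims);
    }
}
```

In this sketch a sparse call looks like `bulk_dot<const int8_t*, sparse_mapper>(addresses, query, dims, 0, nullptr, count, results)`, where `addresses` may point into unrelated allocations. The same loop body serves both modes, which is why a single generalized template is enough.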
mromaios pushed a commit to mromaios/elasticsearch that referenced this pull request Apr 9, 2026
…145459)

mromaios pushed a commit to mromaios/elasticsearch that referenced this pull request Apr 9, 2026
…5676)


Labels

>refactoring, :Search Relevance/Vectors, Team:Search Relevance, test-arm, v9.4.0

4 participants