Generalize BBQ and INT4 bulk templates with TData parameter #145459
Merged
ChrisHegarty merged 13 commits into elastic:main on Apr 3, 2026
Add a TData template parameter to the BBQ (dotd1q4, dotd2q4, dotd4q4) and INT4 (doti4) bulk scoring templates on both amd64 and aarch64 tier-1. This aligns them with call_i8_bulk in vec_1.cpp, which already uses TData to support sequential_mapper, offsets_mapper, and sparse_mapper through the same template. No functional change — existing sequential and offsets instantiations are updated to pass int8_t as TData explicitly.
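The idea of one bulk template serving several mappers can be sketched as below. This is a simplified illustration, not the code from vec_1.cpp: the mapper signatures, the `dot_bulk` and `dot_scalar` names, and the `pitch` parameter are assumptions made for the example, and a plain scalar int8 dot product stands in for the packed BBQ and INT4 SIMD kernels.

```cpp
#include <cassert>   // only needed by callers that want to assert on results
#include <cstdint>

// Scalar stand-in for the SIMD kernels. The real BBQ and INT4 kernels work
// on packed data and are not reproduced here.
static int32_t dot_scalar(const int8_t* a, const int8_t* b, int32_t dims) {
    int32_t acc = 0;
    for (int32_t i = 0; i < dims; ++i) {
        acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
    }
    return acc;
}

// Hypothetical mapper shape: turn a vector index into a pointer to that vector.
// Sequential layout: vector i starts at base + i * pitch.
static inline const int8_t* sequential_mapper(const int8_t* base, int32_t i,
                                              const int32_t* /*offsets*/,
                                              int32_t pitch) {
    return base + static_cast<int64_t>(i) * pitch;
}

// Offsets layout: vector i starts at base + offsets[i] * pitch.
static inline const int8_t* offsets_mapper(const int8_t* base, int32_t i,
                                           const int32_t* offsets,
                                           int32_t pitch) {
    return base + static_cast<int64_t>(offsets[i]) * pitch;
}

// The bulk template no longer fixes the type of `data` to const int8_t*.
// TData names the element type that `data` points to, and the mapper decides
// how to resolve a vector from it.
template <typename TData,
          const int8_t* (*mapper)(const TData*, int32_t, const int32_t*, int32_t)>
static void dot_bulk(const TData* data, const int8_t* query, int32_t dims,
                     int32_t pitch, const int32_t* offsets, int32_t count,
                     float* results) {
    for (int32_t c = 0; c < count; ++c) {
        const int8_t* vec = mapper(data, c, offsets, pitch);
        results[c] = static_cast<float>(dot_scalar(vec, query, dims));
    }
}
```

Under these assumptions, the existing layouts are instantiated as `dot_bulk<int8_t, sequential_mapper>` and `dot_bulk<int8_t, offsets_mapper>`, which compiles to the same loop as a version with `const int8_t*` written directly in the signature.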
Pinging @elastic/es-search-relevance (Team:Search Relevance)
thecoop approved these changes on Apr 1, 2026
… bbq_int4_template_refactor
ChrisHegarty added a commit that referenced this pull request on Apr 3, 2026
This PR adds BULK_SPARSE native exports for BBQ (d1q4, d2q4, d4q4) and packed INT4 vector dot-product operations on both amd64 and aarch64. This fills out BULK_SPARSE support for these element types, consistent with INT7U and INT8 which already have sparse operations. Unlike BULK_OFFSETS, which requires all vectors to reside in a single contiguous memory region, BULK_SPARSE accepts an array of independent memory addresses, one per vector. This enables efficient bulk scoring over scatter-gather data, such as when vectors are backed by DirectAccessInput with its 16MiB region boundaries. The new native functions use the sparse_mapper with the generalized TData bulk templates introduced in #145459. JdkVectorLibrary is updated to enable BULK_SPARSE for BBQ and INT4 with appropriate bounds checking, and Similarities gains corresponding method handles and Java wrapper methods. Unit tests cover contiguous slices, scattered (non-contiguous) allocations, and illegal argument validation for both BBQ (parameterized across d1q4, d2q4, d4q4) and INT4.
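The difference between the two addressing modes can be illustrated with a small sketch. It is an assumption-laden simplification: `score_bulk`, `dot_scalar`, the mapper signatures, and the `pitch` parameter are hypothetical names chosen for the example, and a scalar int8 dot product replaces the packed BBQ and INT4 kernels. The point it shows is that a sparse mapper takes `TData` as a pointer type, so each vector may live in its own allocation.

```cpp
#include <cassert>   // only needed by callers that want to assert on results
#include <cstdint>

// Scalar stand-in for the native kernels (assumption, not the real kernel).
static int32_t dot_scalar(const int8_t* a, const int8_t* b, int32_t dims) {
    int32_t acc = 0;
    for (int32_t i = 0; i < dims; ++i) {
        acc += static_cast<int32_t>(a[i]) * static_cast<int32_t>(b[i]);
    }
    return acc;
}

// BULK_OFFSETS style: every vector sits inside one contiguous region and is
// located by an offset from a single base pointer.
static inline const int8_t* offsets_mapper(const int8_t* base, int32_t i,
                                           const int32_t* offsets,
                                           int32_t pitch) {
    return base + static_cast<int64_t>(offsets[i]) * pitch;
}

// BULK_SPARSE style: the caller passes one independent address per vector,
// so no common base region is required.
static inline const int8_t* sparse_mapper(const int8_t* const* addresses,
                                          int32_t i,
                                          const int32_t* /*offsets*/,
                                          int32_t /*pitch*/) {
    return addresses[i];
}

// Generic bulk loop. With TData = int8_t, `data` is a base pointer; with
// TData = const int8_t*, `data` is an array of per-vector addresses.
template <typename TData,
          const int8_t* (*mapper)(const TData*, int32_t, const int32_t*, int32_t)>
static void score_bulk(const TData* data, const int8_t* query, int32_t dims,
                       int32_t pitch, const int32_t* offsets, int32_t count,
                       float* results) {
    for (int32_t c = 0; c < count; ++c) {
        const int8_t* vec = mapper(data, c, offsets, pitch);
        results[c] = static_cast<float>(dot_scalar(vec, query, dims));
    }
}
```

In this sketch a sparse call looks like `score_bulk<const int8_t*, sparse_mapper>(addresses, query, dims, 0, nullptr, count, results)`, where `addresses` can mix pointers into different memory regions.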
mromaios pushed a commit to mromaios/elasticsearch that referenced this pull request on Apr 9, 2026
…145459) Follow-up to elastic#144634. Extends the TData mapper template generalization to the BBQ (dotd1q4, dotd2q4, dotd4q4) and INT4 (doti4) bulk scoring templates on both amd64 and aarch64 tier-1, which were not covered by the original refactoring. This is a pure refactor with no functional change. The existing sequential_mapper and offsets_mapper instantiations now explicitly pass int8_t as TData rather than hardcoding const int8_t* in the template signature.
mromaios pushed a commit to mromaios/elasticsearch that referenced this pull request on Apr 9, 2026
…5676) This PR adds BULK_SPARSE native exports for BBQ (d1q4, d2q4, d4q4) and packed INT4 vector dot-product operations on both amd64 and aarch64. This fills out BULK_SPARSE support for these element types, consistent with INT7U and INT8 which already have sparse operations. Unlike BULK_OFFSETS, which requires all vectors to reside in a single contiguous memory region, BULK_SPARSE accepts an array of independent memory addresses, one per vector. This enables efficient bulk scoring over scatter-gather data, such as when vectors are backed by DirectAccessInput with its 16MiB region boundaries. The new native functions use the sparse_mapper with the generalized TData bulk templates introduced in elastic#145459. JdkVectorLibrary is updated to enable BULK_SPARSE for BBQ and INT4 with appropriate bounds checking, and Similarities gains corresponding method handles and Java wrapper methods. Unit tests cover contiguous slices, scattered (non-contiguous) allocations, and illegal argument validation for both BBQ (parameterized across d1q4, d2q4, d4q4) and INT4.
Follow-up to #144634. Extends the TData mapper template generalization to the BBQ (dotd1q4, dotd2q4, dotd4q4) and INT4 (doti4) bulk scoring templates on both amd64 and aarch64 tier-1, which were not covered by the original refactoring.
This is a pure refactor with no functional change. The existing sequential_mapper and offsets_mapper instantiations now explicitly pass int8_t as TData rather than hardcoding const int8_t* in the template signature.