MB-65473: Refactor and Optimize Pre-Filtered Vector Search #41
Merged
CascadingRadium merged 9 commits into master on Apr 1, 2025
…and reduce memory footprint.
This was referenced Mar 25, 2025
metonymic-smokey
previously approved these changes
Mar 27, 2025
Pull Request Overview
This PR refactors and optimizes the pre-filtered vector search implementation, consolidating IVF handling logic and updating related index interface methods. Key changes include:
- Replacing the separate NewSearchParamsIVF function with a unified NewSearchParams that accepts an optional default parameters pointer.
- Introducing the IsIVFIndex method and refactoring cluster-related methods to better reflect IVF-specific operations.
- Adjusting error handling and resource clean-up patterns by enforcing the deletion of allocated resources even on errors.
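The first and third changes above can be illustrated with a minimal, self-contained Go sketch. It is not the go-faiss code: `ivfDefaults`, `searchParams`, `newSearchParams`, the `live` counter, and the `nprobe`/`Nlist` validation rule are all hypothetical stand-ins, and the real constructor wraps C allocations through cgo. The sketch only shows the shape of the pattern: one constructor that takes an optional defaults pointer (nil for a non-IVF index) and releases what it allocated before returning an error.

```go
package main

import (
	"errors"
	"fmt"
)

// ivfDefaults is a hypothetical stand-in for the per-index IVF defaults.
type ivfDefaults struct {
	Nprobe int // default number of clusters to probe
	Nlist  int // total number of clusters in the index
}

// live counts handles that have been allocated but not yet released.
// It exists only so the sketch can demonstrate that nothing leaks.
var live int

// searchParams is a hypothetical handle that owns a resource
// needing an explicit release (a C allocation in a cgo binding).
type searchParams struct {
	nprobe int
	freed  bool
}

// Delete releases the handle. It is safe to call on nil or more than once.
func (p *searchParams) Delete() {
	if p == nil || p.freed {
		return
	}
	p.freed = true
	live--
}

// newSearchParams builds parameters for both index kinds.
// A nil defaults pointer means the index is not IVF.
func newSearchParams(nprobe int, defaults *ivfDefaults) (*searchParams, error) {
	rv := &searchParams{}
	live++
	if defaults == nil {
		return rv, nil // non-IVF: nothing further to configure
	}
	rv.nprobe = defaults.Nprobe
	if nprobe > 0 {
		rv.nprobe = nprobe // caller override
	}
	if rv.nprobe > defaults.Nlist {
		rv.Delete() // release before reporting the error
		return nil, errors.New("nprobe exceeds number of clusters")
	}
	return rv, nil
}

func main() {
	flat, _ := newSearchParams(0, nil)
	defer flat.Delete()

	ivf, err := newSearchParams(4, &ivfDefaults{Nprobe: 8, Nlist: 16})
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	defer ivf.Delete()
	fmt.Println(flat.nprobe, ivf.nprobe)

	// The failing call must not leave a handle behind.
	_, err = newSearchParams(64, &ivfDefaults{Nprobe: 8, Nlist: 16})
	fmt.Println(err != nil, live)
}
```

Passing a pointer rather than a separate constructor keeps a single code path for callers; whether the index is IVF is decided once, by whoever supplies the defaults.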
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| search_params.go | Refactored search parameters construction, unified IVF/non-IVF handling. |
| index.go | Updated IVF index interface methods and adjusted search function calls. |
Comments suppressed due to low confidence (2)
search_params.go:64
- Consider including the returned error code 'c' in the error message to aid debugging (e.g., 'failed to create faiss search params, code: %d').
if c := C.faiss_SearchParameters_new(&rv.sp, sel); c != 0 {
index.go:158
- [nitpick] Consider using a more descriptive variable name than 'rv', such as 'clusterCounts', to improve code readability.
rv := make(map[int64]int64, len(vecIDs))
Thejas-bhat
reviewed
Mar 27, 2025
Thejas-bhat
approved these changes
Mar 27, 2025
Thejas-bhat
requested changes
Mar 28, 2025
abhinavdangeti
approved these changes
Mar 31, 2025
Thejas-bhat
approved these changes
Apr 1, 2025
abhinavdangeti
added a commit
to blevesearch/zapx
that referenced
this pull request
Apr 1, 2025
- Refactor pre-filtered vector search to enhance performance and reduce memory footprint.
- Replace the current bitmap-based cluster selection mechanism with a simpler approach that uses the DirectMap in the IVF index. The IVF index's DirectMap directly maps the vector ID to the cluster it belongs to.
- Make `github.com/bits-and-blooms/bitset` a direct dependency of `zapx` and upgrade it to the latest version.
- Requires blevesearch/go-faiss#41
---------
Co-authored-by: Abhinav Dangeti <abhinav@couchbase.com>
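The DirectMap-based approach described in this commit message can be sketched in plain Go. This is an illustration under stated assumptions, not the zapx or go-faiss implementation: the DirectMap is modeled as a slice indexed by vector ID, and `clusterCounts` is a hypothetical helper name. Given the eligible vector IDs produced by the pre-filter, it tallies how many of them fall into each cluster, with no intermediate bitmap.

```go
package main

import "fmt"

// clusterCounts returns, for each cluster, how many of the given vector IDs
// belong to it. directMap[vecID] holds the cluster ID for that vector; it is
// a hypothetical in-memory stand-in for the IVF index's DirectMap.
func clusterCounts(directMap []int64, vecIDs []int64) map[int64]int64 {
	counts := make(map[int64]int64, len(vecIDs))
	for _, id := range vecIDs {
		if id < 0 || id >= int64(len(directMap)) {
			continue // ID not present in the map; skip it
		}
		counts[directMap[id]]++
	}
	return counts
}

func main() {
	// Vectors 0 and 1 are in cluster 0, vectors 2, 4, 5 in cluster 1,
	// and vector 3 in cluster 2.
	directMap := []int64{0, 0, 1, 2, 1, 1}
	eligible := []int64{1, 2, 4, 5}
	fmt.Println(clusterCounts(directMap, eligible)) // map[0:1 1:3]
}
```

Clusters that contain no eligible vectors never appear in the result, so a caller could skip them entirely when deciding which clusters to probe.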
abhinavdangeti
added a commit
to blevesearch/bleve
that referenced
this pull request
Apr 2, 2025
- Refactor pre-filtered vector search to enhance performance and reduce
memory footprint.
- Replace the current bitmap-based approach for calculating segment
local document numbers with a more direct method, where the local
document numbers are mapped directly to the segment ID during the
execution of the eligible collector.
- Requires:
- blevesearch/bleve_index_api#63
- blevesearch/bleve_index_api#66
- blevesearch/zapx#317
- blevesearch/go-faiss#41
- blevesearch/faiss#49
---------
Co-authored-by: Abhinav Dangeti <abhinav@couchbase.com>
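As a rough illustration of mapping document numbers to segments directly, the following Go sketch assumes that each segment owns a contiguous range of global document numbers starting at a known offset. That layout, and the names `segmentLocal` and `groupBySegment`, are assumptions made for the example; the actual bleve collector may organize this differently.

```go
package main

import (
	"fmt"
	"sort"
)

// segmentLocal converts a global document number into a segment index and a
// segment-local document number. offsets[i] is the first global document
// number of segment i; the slice must be non-empty, ascending, and start at 0.
func segmentLocal(offsets []uint64, global uint64) (int, uint64) {
	// Find the first segment whose offset is greater than global, then step back one.
	seg := sort.Search(len(offsets), func(i int) bool { return offsets[i] > global }) - 1
	return seg, global - offsets[seg]
}

// groupBySegment buckets eligible global document numbers into per-segment
// lists of local document numbers as they are collected.
func groupBySegment(offsets []uint64, eligible []uint64) [][]uint64 {
	perSegment := make([][]uint64, len(offsets))
	for _, g := range eligible {
		seg, local := segmentLocal(offsets, g)
		perSegment[seg] = append(perSegment[seg], local)
	}
	return perSegment
}

func main() {
	offsets := []uint64{0, 100, 250} // three segments
	eligible := []uint64{5, 100, 249, 250, 300}
	fmt.Println(groupBySegment(offsets, eligible)) // [[5] [0 149] [0 50]]
}
```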
CascadingRadium
added a commit
to blevesearch/zapx
that referenced
this pull request
Apr 7, 2025
- Refactor pre-filtered vector search to enhance performance and reduce memory footprint.
- Replace the current bitmap-based cluster selection mechanism with a simpler approach that uses the DirectMap in the IVF index. The IVF index's DirectMap directly maps the vector ID to the cluster it belongs to.
- Make `github.com/bits-and-blooms/bitset` a direct dependency of `zapx` and upgrade it to the latest version.
- Requires blevesearch/go-faiss#41
---------
Co-authored-by: Abhinav Dangeti <abhinav@couchbase.com>
CascadingRadium
added a commit
to blevesearch/bleve
that referenced
this pull request
Apr 7, 2025
- Refactor pre-filtered vector search to enhance performance and reduce
memory footprint.
- Replace the current bitmap-based approach for calculating segment
local document numbers with a more direct method, where the local
document numbers are mapped directly to the segment ID during the
execution of the eligible collector.
- Requires:
- blevesearch/bleve_index_api#63
- blevesearch/bleve_index_api#66
- blevesearch/zapx#317
- blevesearch/go-faiss#41
- blevesearch/faiss#49
---------
Co-authored-by: Abhinav Dangeti <abhinav@couchbase.com>
abhinavdangeti
added a commit
to blevesearch/zapx
that referenced
this pull request
Apr 7, 2025
#320)
- Refactor pre-filtered vector search to enhance performance and reduce memory footprint.
- Replace the current bitmap-based cluster selection mechanism with a simpler approach that uses the DirectMap in the IVF index. The IVF index's DirectMap directly maps the vector ID to the cluster it belongs to.
- Make `github.com/bits-and-blooms/bitset` a direct dependency of `zapx` and upgrade it to the latest version.
- Requires blevesearch/go-faiss#41
---------
Co-authored-by: Abhinav Dangeti <abhinav@couchbase.com>
abhinavdangeti
added a commit
to blevesearch/bleve
that referenced
this pull request
Apr 8, 2025
… (#2175)
- Refactor pre-filtered vector search to enhance performance and reduce
memory footprint.
- Replace the current bitmap-based approach for calculating segment
local document numbers with a more direct method, where the local
document numbers are mapped directly to the segment ID during the
execution of the eligible collector.
- Requires:
- blevesearch/bleve_index_api#67
- blevesearch/zapx#320
- blevesearch/go-faiss#41
- blevesearch/faiss#49
---------
Co-authored-by: Abhinav Dangeti <abhinav@couchbase.com>
- `ObtainClusterVectorCountsFromIVFIndex` API to return cluster vector counts for given vector IDs.
- `SearchClustersFromIVFIndex` API to remove the unused nvecs value and the Nvecs attribute from `defaultSearchParamsIVF`.
- `NewSearchParams` to accept `defaultSearchParamsIVF` instead.