Move filtering logic from RawScorer to FilteredScorer#6245
Conversation
9bf697e to
0586a80
Compare
📝 WalkthroughWalkthroughThis change set performs a comprehensive refactor of scoring and filtering logic throughout the vector search segment, focusing on the consolidation and simplification of scorer construction and usage. The core change is the redesign of the All usages of Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
⏰ Context from checks skipped due to timeout of 90000ms (13)
🔇 Additional comments (6)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
Actionable comments posted: 1
🧹 Nitpick comments (6)
lib/segment/src/index/hnsw_index/tests/mod.rs (1)
41-44: Refactored to separate scoring and filteringThe code now retrieves both a raw scorer and a filterer using
get_scorer_and_filterer, replacing the previous approach withFakeFilterContextandFilteredScorer. This change is consistent with the PR's objective of decoupling scoring and filtering mechanisms.The
clone()onadded_vectormight be unnecessary since the vector is already passed by value. Consider reviewing if this clone operation is required:- let (raw_scorer, filterer) = vector_holder - .get_scorer_and_filterer(added_vector.clone()) - .unwrap(); + let (raw_scorer, filterer) = vector_holder + .get_scorer_and_filterer(added_vector) + .unwrap();lib/segment/src/vector_storage/dense/memmap_dense_vector_storage.rs (1)
677-681: Improved result vector preparation with new_unset constructorThe implementation now uses a more direct approach to prepare the result vector by mapping over query points and using the new
ScoredPointOffset::new_unset()constructor. This eliminates the need for vector resizing and makes the code more readable and efficient.Consider adding a comment explaining that using
new_unset()initializes the score with a placeholder value (0.0) that will be updated byscore_points()to help future developers understand the flow.let mut res = query_points .iter() .map(|&idx| ScoredPointOffset::new_unset(idx)) .collect::<Vec<_>>(); + // The scores are initialized with placeholder values and will be updated by score_points scorer.score_points(&mut res);lib/segment/src/vector_storage/async_raw_scorer.rs (1)
73-88: Refinedscore_pointsAPI
Nowscore_pointsdirectly updates the scores array in place, simplifying the interface. The early return onis_stoppedis good for responsiveness. UsingCellfor in-place mutation in the closure is well-implemented.Consider providing a fallback strategy if
io_uringis not supported, e.g. logging a warning or using the synchronous path, to avoid unexpected panic in production.lib/segment/src/index/hnsw_index/point_filterer.rs (1)
58-69: Fix spelling in documentation.The words performace and indended are misspelled in these lines.
Please apply this diff to correct them:
-/// This method operates on [`ScoredPointOffset`] for performace reasons. -/// The indended usage is to prepare a vector of [`ScoredPointOffset`] +/// This method operates on [`ScoredPointOffset`] for performance reasons. +/// The intended usage is to prepare a vector of [`ScoredPointOffset`]🧰 Tools
🪛 GitHub Check: Check for spelling errors
[failure] 61-61:
performace ==> performance
[failure] 62-62:
indended ==> intended, indented🪛 GitHub Actions: Codespell
[warning] 61-61: codespell: 'performace' is a misspelling. Suggested correction: 'performance'.
lib/segment/benches/hnsw_build_asymptotic.rs (1)
35-35: Handle potential panic on.unwrap().Each of these lines calls
.unwrap(), which will panic ifget_scorer_and_filtererreturns an error. While it may be acceptable in benchmarking code, using.expect("<message>")or an error-handling strategy can provide clearer diagnostics when failures occur.- let (raw_scorer, filterer) = vector_holder.get_scorer_and_filterer(query).unwrap(); + let (raw_scorer, filterer) = vector_holder + .get_scorer_and_filterer(query) + .expect("Failed to retrieve scorer and filterer");Also applies to: 56-56, 63-63, 72-72, 80-80, 94-94, 111-111, 127-127, 143-143
lib/segment/src/index/hnsw_index/graph_layers.rs (1)
425-425: Avoid potential panic on.unwrap().Using
.unwrap()can cause unwanted panics ifget_scorer_and_filtererfails. Consider a more descriptive.expect()or a proper error-handling approach to make failures more explicit.- let (raw_scorer, filterer) = vector_holder.get_scorer_and_filterer(added_vector).unwrap(); + let (raw_scorer, filterer) = vector_holder + .get_scorer_and_filterer(added_vector) + .expect("Failed to retrieve scorer and filterer");
🛑 Comments failed to post (1)
lib/segment/benches/hnsw_build_graph.rs (1)
31-33: 💡 Verification agent
🧩 Analysis chain
Refactored to separate scoring and filtering
The code now retrieves both a raw scorer and a filterer directly from the vector holder, replacing the previous approach of using
FakeFilterContextandFilteredScorer. This change aligns with the PR's objective of decoupling scoring and filtering mechanisms.Consider handling potential errors from
get_scorer_and_filterermore gracefully instead of usingunwrap(), which could cause panics in case of failure:
🏁 Script executed:
#!/bin/bash # Search for other instances of error handling with get_scorer_and_filterer rg "get_scorer_and_filterer.*unwrap" --type rustLength of output: 2698
Enhance Error Handling for
get_scorer_and_filtererThe refactor effectively decouples scoring and filtering by retrieving both components directly from the vector holder. However, our search indicates that using
unwrap()withget_scorer_and_filtereris pervasive across the codebase (including in this file, e.g., inlib/segment/benches/hnsw_build_graph.rson lines 31–33, and elsewhere). This approach could lead to panics if the function ever fails.
- Action Items:
- Review and refactor error handling for
get_scorer_and_filtererboth here and in other affected files (e.g., inlib/segment/src/index/hnsw_index/graph_layers.rsandlib/segment/src/index/hnsw_index/graph_layers_builder.rs).- Consider returning more informative errors or using methods like
expect()with a clear message, or refactor to propagate errors gracefully.
| Some(f) => scores | ||
| .retain(|score| self.simple_filterer.check_vector(score.idx) && f.check(score.idx)), | ||
| } | ||
| if limit != 0 { |
There was a problem hiding this comment.
Shall we put it at the beginning of the function body without the condition?
There was a problem hiding this comment.
Relevant code in the current version of this PR:
qdrant/lib/segment/src/index/hnsw_index/point_scorer.rs
Lines 133 to 141 in e2bce5a
limit == 0 means "don't truncate scores". I also don't like, but this logic was there before this PR. I've added a comment in FilteredScorer::score_points about this.
(maybe limit: usize can be replaced with Option<NonZeroUsize>, but I don't want more bloat in this PR)
Shall we put it at the beginning of the function body without the condition?
No, retain-then-retain may produce a different result than truncate-then-retain.
|
After review in-call:
FIlteredScorer uses RawScorer to combine into a single call If RawScorer is needed without filters, it can be extracted (used from) FilteredScorer
For a separate PR:
|
There was a problem hiding this comment.
Copilot reviewed 23 out of 23 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
lib/segment/src/vector_storage/raw_scorer.rs:941
- [nitpick] Consider renaming this method (or updating its documentation) to emphasize that it fills an existing scores buffer in-place instead of returning a new vector; this would clarify that the scores slice must be pre-initialized (e.g. using ScoredPointOffset::new_unset).
fn score_points(&self, scores: &mut [ScoredPointOffset]) {
lib/segment/src/fixtures/index_fixtures.rs:170
- [nitpick] Consider refactoring the repeated pattern of obtaining a RawScorer and its corresponding PointsFilterer into a dedicated helper function or utility to reduce boilerplate at call sites.
pub fn get_scorer_and_filterer(
| Some(f) => scores | ||
| .retain(|score| self.simple_filterer.check_vector(score.idx) && f.check(score.idx)), | ||
| } | ||
| if limit != 0 { |
There was a problem hiding this comment.
Shall we put it at the beginning of the function body without the condition?
0586a80 to
972a598
Compare
972a598 to
e2bce5a
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (6)
lib/common/common/src/iterator_ext/mod.rs (1)
47-55: Well-implemented benchmarking utility!The
black_boxmethod is a good addition for benchmarking purposes. It correctly appliesstd::hint::black_boxto each item in the iterator to prevent compiler optimizations during benchmarks.If you prefer a more concise style, the closure could be simplified to:
- self.for_each(|p| { - std::hint::black_box(p); - }); + self.for_each(std::hint::black_box);However, the current implementation is explicit and clear, so this is purely optional.
lib/segment/src/index/sparse_index/sparse_vector_index.rs (2)
302-309: Leverage quantized vectors if present for cheaper scoring
FilteredScorer::newaccepts an optionalquantized_vectorsparameter, but we always passNone.
When the underlying storage has a pre‑built quantized representation, using it typically reduces RAM traffic and SIMD cost during scoring.- let scorer = FilteredScorer::new( - query_vector.clone(), - &vector_storage, - None, - None, - deleted_point_bitslice, - vector_query_context.hardware_counter(), - )?; + let scorer = FilteredScorer::new( + query_vector.clone(), + &vector_storage, + vector_storage.quantized_vectors(), // <- returns `Option<&QuantizedVectors>` + None, + deleted_point_bitslice, + vector_query_context.hardware_counter(), + )?;(If
quantized_vectors()is not available onVectorStorageEnum, you can expose it behind a helper.)
This is a pure optimisation; functional behaviour remains identical.
318-323: Passing the iterator by reference is unnecessary
peek_top_itertakes its iterator by value, and&mut IimplementsIteratoronly indirectly.
Simply move the iterator to the call‑site for slightly cleaner code and avoid the extra reference indirection:- let res = scorer.peek_top_iter(&mut filtered_points, top, &is_stopped)?; + let res = scorer.peek_top_iter(filtered_points, top, &is_stopped)?;lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)
52-83: Heavy match‑cascade could be centralisedThe triple‑nested
matchon(datatype, distance)duplicates very similar branches 18 times.
Consider a helper such asfn route<TElement, TMetric>(self, builder: fn(Self) -> OperationResult<Box<dyn RawScorer<'a>>>)or a small macro to collapse repetition.
Less duplication eases future distance additions (e.g.CosineSquared) and reduces the chance of forgetting one of the 18 arms.lib/segment/src/index/hnsw_index/point_scorer.rs (2)
143-152: Slicescores_bufferto avoid scanning unused entries
zip(&*point_ids, &self.scores_buffer)iterates the entire buffer, although only the firstpoint_ids.len()slots are freshly written.
Use the same slice you gave toscore_points:- std::iter::zip(&*point_ids, &self.scores_buffer) + std::iter::zip(&*point_ids, &self.scores_buffer[..point_ids.len()])Saves needless cache traffic when the buffer had previously grown large.
147-152:scores_buffernever shrinks – potential long‑lived memory spikeThe buffer only ever grows; if one request with 1 M points arrives, every subsequent tiny request will still carry a 1 M‑entry buffer.
Consider resetting it when
point_ids.len()falls under a fraction of current capacity, or allocate fresh per call and reuse via thread‑local pool.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (24)
lib/common/common/src/iterator_ext/mod.rs(1 hunks)lib/segment/benches/fixture.rs(2 hunks)lib/segment/benches/hnsw_build_asymptotic.rs(7 hunks)lib/segment/benches/hnsw_build_graph.rs(2 hunks)lib/segment/benches/hnsw_search_graph.rs(4 hunks)lib/segment/benches/scorer_mmap.rs(2 hunks)lib/segment/benches/vector_search.rs(3 hunks)lib/segment/src/fixtures/index_fixtures.rs(2 hunks)lib/segment/src/index/hnsw_index/gpu/gpu_graph_builder.rs(3 hunks)lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs(5 hunks)lib/segment/src/index/hnsw_index/gpu/gpu_vector_storage/tests.rs(1 hunks)lib/segment/src/index/hnsw_index/gpu/mod.rs(3 hunks)lib/segment/src/index/hnsw_index/graph_layers.rs(6 hunks)lib/segment/src/index/hnsw_index/graph_layers_builder.rs(7 hunks)lib/segment/src/index/hnsw_index/hnsw.rs(13 hunks)lib/segment/src/index/hnsw_index/point_scorer.rs(2 hunks)lib/segment/src/index/hnsw_index/tests/mod.rs(2 hunks)lib/segment/src/index/hnsw_index/tests/test_compact_graph_layer.rs(2 hunks)lib/segment/src/index/plain_vector_index.rs(4 hunks)lib/segment/src/index/sparse_index/sparse_vector_index.rs(4 hunks)lib/segment/src/vector_storage/async_raw_scorer.rs(9 hunks)lib/segment/src/vector_storage/dense/memmap_dense_vector_storage.rs(9 hunks)lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs(10 hunks)lib/segment/src/vector_storage/quantized/quantized_vectors.rs(0 hunks)
💤 Files with no reviewable changes (1)
- lib/segment/src/vector_storage/quantized/quantized_vectors.rs
✅ Files skipped from review due to trivial changes (3)
- lib/segment/src/index/hnsw_index/tests/test_compact_graph_layer.rs
- lib/segment/src/index/hnsw_index/graph_layers.rs
- lib/segment/src/index/hnsw_index/graph_layers_builder.rs
🚧 Files skipped from review as they are similar to previous changes (10)
- lib/segment/src/index/hnsw_index/gpu/mod.rs
- lib/segment/benches/hnsw_build_graph.rs
- lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs
- lib/segment/src/index/hnsw_index/tests/mod.rs
- lib/segment/src/vector_storage/dense/memmap_dense_vector_storage.rs
- lib/segment/benches/hnsw_search_graph.rs
- lib/segment/src/fixtures/index_fixtures.rs
- lib/segment/benches/hnsw_build_asymptotic.rs
- lib/segment/src/index/hnsw_index/gpu/gpu_graph_builder.rs
- lib/segment/src/index/hnsw_index/hnsw.rs
🧰 Additional context used
🧬 Code Graph Analysis (3)
lib/segment/benches/vector_search.rs (2)
lib/segment/src/vector_storage/dense/simple_dense_vector_storage.rs (1)
open_simple_dense_vector_storage(92-108)lib/segment/src/index/hnsw_index/point_scorer.rs (1)
new_for_test(99-111)
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)
lib/segment/src/vector_storage/raw_scorer.rs (1)
raw_scorer_from_query_scorer(423-434)
lib/segment/src/index/hnsw_index/point_scorer.rs (4)
lib/segment/src/common/operation_error.rs (1)
check_process_stopped(242-247)lib/segment/src/vector_storage/raw_scorer.rs (2)
check_deleted_condition(736-747)new_raw_scorer(51-121)lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)
raw_scorer(166-180)lib/common/common/src/fixed_length_priority_queue.rs (1)
top(80-82)
🔇 Additional comments (12)
lib/segment/src/index/hnsw_index/gpu/gpu_vector_storage/tests.rs (2)
696-698: Updated interface aligns with refactored RawScorer.The call to
raw_scorernow omits thepoint_deletedparameter, reflecting the architectural change where filtering logic has been moved fromRawScorertoFilteredScorer. This simplifies the scoring interface.
699-701: Interface update correctly implements the filtering logic refactor.The removal of the
point_deletedparameter fromnew_raw_scorer_for_testis consistent with the PR's objective of moving filtering responsibility fromRawScorertoFilteredScorer. This change simplifies scoring component interfaces.lib/segment/benches/vector_search.rs (3)
15-15: Updated imports to align with the refactored scorer architectureThe changes reflect the refactoring where
FilteredScorernow handles both scoring and filtering logic, replacing the previous direct import and usage ofnew_raw_scorer_for_test.Also applies to: 18-18
69-73: Direct usage of FilteredScorer simplifies benchmark codeThe benchmark now uses
FilteredScorer::new_for_testdirectly instead ofnew_raw_scorer_for_test, which aligns with the refactored architecture where filtering is integrated into the scorer component.
92-96: Simplified scorer construction in random access benchmarkConsistent change to use
FilteredScorer::new_for_testdirectly, making the code more straightforward by removing the unwrapping that was previously needed when usingnew_raw_scorer_for_test.lib/segment/src/index/plain_vector_index.rs (3)
9-9: Updated imports to align with refactored scorer architectureThe changes reflect the refactoring where
FilteredScorernow handles both scoring and filtering logic, replacing the previous direct usage ofnew_raw_scorer.Also applies to: 22-22
113-120: FilteredScorer now handles both scoring and filtering in a unified interfaceThe
FilteredScorer::newconstructor now takes additional parameters for filter context that were previously handled separately. This consolidation simplifies the search logic and aligns with the PR objective of moving filtering logic fromRawScorertoFilteredScorer.
141-148: Consistent use of FilteredScorer for unfiltered searchesThe same pattern is applied for unfiltered searches, using
Nonefor filter-related parameters. This unified approach maintains consistency and simplifies the codebase by having a single entry point for scorer creation.lib/segment/benches/scorer_mmap.rs (2)
13-13: Updated imports to align with refactored scorer architectureThe changes reflect the refactoring where
FilteredScorernow handles both scoring and filtering logic, replacing the previous direct import and usage ofnew_raw_scorer_for_test.Also applies to: 16-16
66-70: Direct usage of FilteredScorer simplifies mmap benchmark codeThe benchmark now uses
FilteredScorer::new_for_testdirectly instead ofnew_raw_scorer_for_test, which aligns with the refactored architecture where filtering is integrated into the scorer component.lib/segment/benches/fixture.rs (2)
8-8: Updated imports to align with refactored scorer architectureThe imports are simplified by removing
FakeFilterContextandFilteredScorer, as these are now handled internally byTestRawScorerProducer.
63-63: Simplified scorer acquisition in graph buildingThe code now directly gets a scorer from the vector holder instead of the previous two-step process (creating a raw scorer and then wrapping it with a filtered scorer). This simplification makes the code more readable and aligns with the PR's goal of consolidating filter and scorer functionality.
There was a problem hiding this comment.
Pull Request Overview
This PR refactors the scoring and filtering logic by moving several filtering responsibilities out of RawScorer and into FilteredScorer, resulting in a more streamlined scorer API and improved performance in certain code paths. Key changes include replacing new_raw_scorer calls with FilteredScorer::new (and its variants), updating methods such as peek_top_iter/peek_top_all, and adjusting tests and benchmarks to use the new API.
Reviewed Changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| lib/segment/src/index/sparse_index/sparse_vector_index.rs | Updated scorer construction and method calls to use FilteredScorer instead of RawScorer. |
| lib/segment/src/index/plain_vector_index.rs | Replaced new_raw_scorer with FilteredScorer::new and removed unused filter parameters. |
| lib/segment/src/index/hnsw_index/* | Across multiple files (tests, hnsw.rs, graph_layers, gpu modules), updated scorer construction and related API calls to use the new FilteredScorer interface. |
| lib/segment/src/index/hnsw_index/point_scorer.rs | Modified FilteredScorer to internally filter point lists and return an iterator of scored points rather than a slice. |
| Various benchmarks and fixture files | Adjusted scorer invocations to match the new API, removing legacy FakeFilterContext usage. |
Current version
Currently, we have the following traits/structs:
RawScorerfilters points by deletion flags, then scores them.FilteredScorerfilters points byFilterContext, then pass them toRawScorerto score.Some methods of
FilteredScorersimply call the corresponding methods ofRawScorer.Both
RawScorerandFilteredScorerfilter points, and both of them perform the scoring.In this PR
In this PR, I moved a lot of logic from
RawScorertoFilteredScorer.RawScorer::score_points()doesn't filter points anymore.point_deleted,vec_deletedbitslices are moved toFilteredScorer.Consequence:
FilteredScorer::check_vector()now should be a bit faster.RawScorer.FilterContext, not after.peek_top_iter()andpeek_top_all()methods are moved toFilteredScorer.RawScorerupdated to useFilteredScorerinstead.Removed temporary bufferFilteredScorer::points_bufferNot in this version of PR.
What was in this PR previously?
Minor change: this PR changes the interface of the scorer and the filterer to avoid the need for a temporary buffer.Previously:
PointOffsetTypeand passes it toFilteredScorer.FilteredScorerVec<ScoredPointOffset>(reusable, stored inFilteredScorer)RawScorer::score_pointswith both of these vectors.RawScorer::score_pointsalso performs its own filtering. It returns a number of scored points.FilteredScorerreturns buffer slice truncated by this number to the caller.In this PR:
ScoredPointOffset.Only the
.idxfield should be set; the.scorefield should be set to an arbitrary meaningless value.PointsFiltererfilters this vector in-place usingVec::retain().RawScorer::score_pointsfills the.scorevalues in-place.Why it is no longer in this PR?
In the current version of the PR,
RawScorerImpl::score_pointscallsQueryScorer::score_stored_batch, which accepts two separate slices.Doing the same for
QueryScorerwould be complicated (but possible I guess).Illustrated: if you replace
ids: &[PointOffsetType], scores: &mut [ScoreType]withids_with_scores: &mut [ScoredPointOffset], then you can't just passidstoget_dense_batchin the following code:qdrant/lib/segment/src/vector_storage/query_scorer/metric_query_scorer.rs
Lines 77 to 81 in 23b5f35
Performance
BFB on small (32 element) vectors: 3612 RPS -> 4721 RPS.
No significant changes on 1.5k vectors.
https://github.com/qdrant/vector-db-benchmark/actions/runs/14606832003
Criterion benchmark results
A lot of benchmarks got a huge boost, but the benchmark logic it's not exactly the same, so take it with a grain of salt:
TestRawScorerProducerno longer useFakeFilterContext(so one less dyn call inFilteredScorerfor them).FilteredScorer::score_points()now returns an iterator, not a slice.Results
hnsw_build_graph:hnsw_build_asymptotic:hnsw_search_graph:Performance on glove-100
Made 30k unique batch search requests with 10 searches in each. Times for each batch are in ms.
No filter:
With no-op filter:
Request for no filter:
Request for no-op filter:
Why
The new interface would be useful in incremental HNSW build as we would need to search without filtering.
Filter order
Another potential improvement that this PR would allow described below. Suppose we perform an unfiltered search within a graph that has additional links. See this code: https://github.com/qdrant/qdrant/blob/7abc684361fb81d8b62cf1554d8bf4fb65a706d7/lib/segment/src/index/hnsw_index/graph_layers.rs#L73-L80 This logic here is NOT: "use the first `m` neighbors aka main links"; Instead, the implemented logic is: "use the first `m` non-visited neighbors, i.e. take additional links if the main ones have already been visited". It seems to me that the implemented logic imposes additional unnecessary work. I believe that we should flip filters like this (pseudo code):