Move filtering logic from RawScorer to FilteredScorer by xzfc · Pull Request #6245 · qdrant/qdrant

xzfc · 2025-03-24T19:53:14Z

Current version

Currently, we have the following traits/structs:

RawScorer filters points by deletion flags, then scores them.
FilteredScorer filters points by FilterContext, then pass them to RawScorer to score.
Some methods of FilteredScorer simply call the corresponding methods of RawScorer.

Both RawScorer and FilteredScorer filter points, and both of them perform the scoring.

In this PR

In this PR, I moved a lot of logic from RawScorer to FilteredScorer.

RawScorer::score_points() doesn't filter points anymore.
point_deleted, vec_deleted bitslices are moved to FilteredScorer.
Consequence: FilteredScorer::check_vector() now should be a bit faster.
- It no longer performs a dyn trait call to RawScorer.
- Now these bitslices are checked before checking FilterContext, not after.
peek_top_iter() and peek_top_all() methods are moved to FilteredScorer.
A lot of tests that use RawScorer updated to use FilteredScorer instead.

Removed temporary buffer `FilteredScorer::points_buffer`

Not in this version of PR.

What was in this PR previously?

Minor change: this PR changes the interface of the scorer and the filterer to avoid the need for a temporary buffer.
Previously:

Caller prepares a vector of PointOffsetType and passes it to FilteredScorer.

FilteredScorer

filters in-place this vector (destroying the original values)

allocates a buffer of type Vec<ScoredPointOffset> (reusable, stored in FilteredScorer)

calls RawScorer::score_points with both of these vectors.

RawScorer::score_points also performs its own filtering. It returns a number of scored points.

FilteredScorer returns buffer slice truncated by this number to the caller.

In this PR:

Caller prepares a vector of ScoredPointOffset.
Only the .idx field should be set; the .score field should be set to an arbitrary meaningless value.

PointsFilterer filters this vector in-place using Vec::retain().

RawScorer::score_points fills the .score values in-place.

Why it is no longer in this PR?

In the current version of the PR, RawScorerImpl::score_points calls QueryScorer::score_stored_batch, which accepts two separate slices.

Doing the same for QueryScorer would be complicated (but possible I guess).
Illustrated: if you replace ids: &[PointOffsetType], scores: &mut [ScoreType] with ids_with_scores: &mut [ScoredPointOffset], then you can't just pass ids to get_dense_batch in the following code:

qdrant/lib/segment/src/vector_storage/query_scorer/metric_query_scorer.rs

Lines 77 to 81 in 23b5f35

    
           let mut vectors = [MaybeUninit::uninit(); VECTOR_READ_BATCH_SIZE]; 
        
           let vectors = self 
        
               .vector_storage 
        
               .get_dense_batch(ids, &mut vectors[..ids.len()]);

Performance

BFB on small (32 element) vectors: 3612 RPS -> 4721 RPS.
No significant changes on 1.5k vectors.

https://github.com/qdrant/vector-db-benchmark/actions/runs/14606832003

Criterion benchmark results

A lot of benchmarks got a huge boost, but the benchmark logic it's not exactly the same, so take it with a grain of salt:

Benchmarks that use TestRawScorerProducer no longer use FakeFilterContext (so one less dyn call in FilteredScorer for them).
FilteredScorer::score_points() now returns an iterator, not a slice.

Results

hnsw_build_graph:

Benchmarking hnsw-index-build-group/hnsw_index: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 9.2s.
hnsw-index-build-group/hnsw_index
                        time:   [915.87 ms 920.81 ms 926.33 ms]
                        change: [-3.5310% -2.8506% -2.0601%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) high mild

hnsw_build_asymptotic:

hnsw-index-build-asymptotic/build-n-search-hnsw-5k
                        time:   [43.190 µs 43.231 µs 43.275 µs]
                        change: [-5.8648% -5.5461% -5.1992%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
hnsw-index-build-asymptotic/build-n-search-hnsw-1M
                        time:   [143.51 µs 143.76 µs 144.02 µs]
                        change: [-10.849% -10.650% -10.448%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
hnsw-index-build-asymptotic/build-n-search-hnsw-1M-score-point
                        time:   [68.044 µs 68.103 µs 68.162 µs]
                        change: [-18.860% -18.676% -18.505%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

scoring-vector/score-point
                        time:   [19.259 µs 19.275 µs 19.291 µs]
                        change: [-23.060% -22.902% -22.749%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe
scoring-vector/score-point-10x
                        time:   [19.414 µs 19.448 µs 19.485 µs]
                        change: [-25.670% -25.517% -25.364%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  6 (6.00%) high mild
scoring-vector/score-point-50x
                        time:   [55.111 µs 55.168 µs 55.227 µs]
                        change: [-23.943% -23.653% -23.373%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

scoring-vector/basic-score-point
                        time:   [144.57 µs 144.70 µs 144.82 µs]
                        change: [-0.8233% -0.6519% -0.4308%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe
scoring-vector/basic-score-point-10x
                        time:   [155.73 µs 155.89 µs 156.04 µs]
                        change: [-1.2111% -0.9589% -0.7167%] (p = 0.00 < 0.05)
                        Change within noise threshold.

hnsw_search_graph:

hnsw-search-graph/uncompressed
                        time:   [357.41 µs 358.14 µs 358.86 µs]
                        change: [-17.125% -16.858% -16.586%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) low mild
hnsw-search-graph/compressed
                        time:   [360.08 µs 360.91 µs 361.75 µs]
                        change: [-16.297% -16.001% -15.667%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
hnsw-search-graph/plain time:   [19.076 ms 19.113 ms 19.156 ms]
                        change: [-5.7350% -5.5325% -5.3217%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

Performance on glove-100

Made 30k unique batch search requests with 10 searches in each. Times for each batch are in ms.

No filter:

	before	after
min	10.017	8.731
max	20.559	22.528
mean	12.683	11.020
median	12.380	10.792
p95	15.680	13.445
p99	17.600	15.077

With no-op filter:

	before	after
min	10.724	9.445
max	20.958	22.160
mean	13.595	11.986
median	13.267	11.700
p95	16.967	14.755
p99	19.192	16.051

Request for no filter:

POST collections/{collection}/points/search/batch
{
  "searches": [
    {"vector": <vector>, "limit": 10 },
    … (9 more) …
  ]
}

Request for no-op filter:

POST collections/{collection}/points/search/batch
{
  "searches": [
    {"vector": <vector>, "limit": 10, "filter": {"must_not": {"has_id": [1]}} },
    … (9 more) …
  ]
}

Why

The new interface would be useful in incremental HNSW build as we would need to search without filtering.

Filter order

Another potential improvement that this PR would allow described below. Suppose we perform an unfiltered search within a graph that has additional links. See this code: https://github.com/qdrant/qdrant/blob/7abc684361fb81d8b62cf1554d8bf4fb65a706d7/lib/segment/src/index/hnsw_index/graph_layers.rs#L73-L80 This logic here is NOT: "use the first `m` neighbors aka main links"; Instead, the implemented logic is: "use the first `m` non-visited neighbors, i.e. take additional links if the main ones have already been visited". It seems to me that the implemented logic imposes additional unnecessary work. I believe that we should flip filters like this (pseudo code):

// old
point_ids.filter(is_not_visited).filter(is_not_deleted).take(m)
// new
point_ids.filter(is_not_deleted).take(m).filter(is_not_visited)

coderabbitai · 2025-03-24T19:58:45Z

📝 Walkthrough

Walkthrough

This change set performs a comprehensive refactor of scoring and filtering logic throughout the vector search segment, focusing on the consolidation and simplification of scorer construction and usage. The core change is the redesign of the FilteredScorer abstraction: it now owns its raw scorer, manages both point and vector deletion bit slices, and encapsulates filter context handling via a new BoxCow enum for flexible ownership. The FilteredScorer constructor is expanded to handle quantized and non-quantized vectors, filter contexts, and deletion masks in a unified manner, replacing scattered logic and explicit filter context wrappers.

All usages of FakeFilterContext and manual wrapping of raw scorers with filter contexts are removed across benchmarks, tests, and production code. Scorer acquisition is streamlined—benchmarks, tests, and index implementations now directly obtain a scorer via a single method or constructor. The RawScorer trait and its implementations are simplified, with the removal of deleted-point filtering logic and the refactoring of scoring methods to operate on pre-filtered points. Related changes propagate to quantized scorer builders, vector storage, and index modules, ensuring consistent scorer usage and interface signatures.

Suggested reviewers

timvisee
generall

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e2bce5a and 45a9cc3.

📒 Files selected for processing (1)

lib/segment/src/vector_storage/async_raw_scorer.rs (9 hunks)

⏰ Context from checks skipped due to timeout of 90000ms (13)

GitHub Check: Basic TLS/HTTPS tests
GitHub Check: test-snapshot-operations-s3-minio
GitHub Check: test-low-resources
GitHub Check: test-shard-snapshot-api-s3-minio
GitHub Check: test-consistency
GitHub Check: rust-tests (macos-latest)
GitHub Check: test-consensus-compose
GitHub Check: integration-tests
GitHub Check: rust-tests (windows-latest)
GitHub Check: lint
GitHub Check: storage-compat-test
GitHub Check: integration-tests-consensus
GitHub Check: rust-tests (ubuntu-latest)

🔇 Additional comments (6)

lib/segment/src/vector_storage/async_raw_scorer.rs (6)

8-8: Importing error handling types

Good addition of OperationError and OperationResult imports, which aligns with proper error handling in the refactored code.

24-24: Simplified function call

The constructor call has been simplified by removing deleted points and vectors parameters, which matches the PR objective of moving filtering logic to FilteredScorer.

36-41: Simplified AsyncRawScorerImpl constructor

The constructor has been refactored to only require query_scorer and storage, without deleted points bitslices. This aligns with the PR objective of moving filtering responsibility to FilteredScorer.

88-88: Simplified distance retrieval

Good simplification of getting the distance from storage.

121-121: Improved code reuse with helper function

Good refactoring to use the new helper function async_raw_scorer_from_query_scorer across all query vector types, which reduces code duplication and improves maintainability.

Also applies to: 140-140, 149-149, 159-159, 168-168

174-185: Well-designed helper function

The new helper function centralizes the creation of AsyncRawScorerImpl instances, making the code more maintainable and following the DRY principle. The function has appropriate type constraints and lifetime annotations.

✨ Finishing Touches

📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR.
@coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Caution

Inline review comments failed to post. This is likely due to GitHub's limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.

Actionable comments posted: 1

🧹 Nitpick comments (6)

lib/segment/src/index/hnsw_index/tests/mod.rs (1)
41-44: Refactored to separate scoring and filtering

The code now retrieves both a raw scorer and a filterer using get_scorer_and_filterer, replacing the previous approach with FakeFilterContext and FilteredScorer. This change is consistent with the PR's objective of decoupling scoring and filtering mechanisms.

The clone() on added_vector might be unnecessary since the vector is already passed by value. Consider reviewing if this clone operation is required:
-        let (raw_scorer, filterer) = vector_holder
-            .get_scorer_and_filterer(added_vector.clone())
-            .unwrap();
+        let (raw_scorer, filterer) = vector_holder
+            .get_scorer_and_filterer(added_vector)
+            .unwrap();
lib/segment/src/vector_storage/dense/memmap_dense_vector_storage.rs (1)
677-681: Improved result vector preparation with new_unset constructor

The implementation now uses a more direct approach to prepare the result vector by mapping over query points and using the new ScoredPointOffset::new_unset() constructor. This eliminates the need for vector resizing and makes the code more readable and efficient.

Consider adding a comment explaining that using new_unset() initializes the score with a placeholder value (0.0) that will be updated by score_points() to help future developers understand the flow.
  let mut res = query_points
      .iter()
      .map(|&idx| ScoredPointOffset::new_unset(idx))
      .collect::<Vec<_>>();
+ // The scores are initialized with placeholder values and will be updated by score_points
  scorer.score_points(&mut res);
lib/segment/src/vector_storage/async_raw_scorer.rs (1)

73-88: Refined score_points API
Now score_points directly updates the scores array in place, simplifying the interface. The early return on is_stopped is good for responsiveness. Using Cell for in-place mutation in the closure is well-implemented.

Consider providing a fallback strategy if io_uring is not supported, e.g. logging a warning or using the synchronous path, to avoid unexpected panic in production.
lib/segment/src/index/hnsw_index/point_filterer.rs (1)
58-69: Fix spelling in documentation.

The words performace and indended are misspelled in these lines.

Please apply this diff to correct them:
-/// This method operates on [`ScoredPointOffset`] for performace reasons.
-/// The indended usage is to prepare a vector of [`ScoredPointOffset`]
+/// This method operates on [`ScoredPointOffset`] for performance reasons.
+/// The intended usage is to prepare a vector of [`ScoredPointOffset`]
🧰 Tools

🪛 GitHub Check: Check for spelling errors

[failure] 61-61:
performace ==> performance

[failure] 62-62:
indended ==> intended, indented

🪛 GitHub Actions: Codespell

[warning] 61-61: codespell: 'performace' is a misspelling. Suggested correction: 'performance'.
lib/segment/benches/hnsw_build_asymptotic.rs (1)
35-35: Handle potential panic on .unwrap().

Each of these lines calls .unwrap(), which will panic if get_scorer_and_filterer returns an error. While it may be acceptable in benchmarking code, using .expect("<message>") or an error-handling strategy can provide clearer diagnostics when failures occur.
- let (raw_scorer, filterer) = vector_holder.get_scorer_and_filterer(query).unwrap();
+ let (raw_scorer, filterer) = vector_holder
+     .get_scorer_and_filterer(query)
+     .expect("Failed to retrieve scorer and filterer");
Also applies to: 56-56, 63-63, 72-72, 80-80, 94-94, 111-111, 127-127, 143-143
lib/segment/src/index/hnsw_index/graph_layers.rs (1)
425-425: Avoid potential panic on .unwrap().

Using .unwrap() can cause unwanted panics if get_scorer_and_filterer fails. Consider a more descriptive .expect() or a proper error-handling approach to make failures more explicit.
- let (raw_scorer, filterer) = vector_holder.get_scorer_and_filterer(added_vector).unwrap();
+ let (raw_scorer, filterer) = vector_holder
+     .get_scorer_and_filterer(added_vector)
+     .expect("Failed to retrieve scorer and filterer");

🛑 Comments failed to post (1)

lib/segment/benches/hnsw_build_graph.rs (1)
31-33: 💡 Verification agent
🧩 Analysis chain

Refactored to separate scoring and filtering

The code now retrieves both a raw scorer and a filterer directly from the vector holder, replacing the previous approach of using FakeFilterContext and FilteredScorer. This change aligns with the PR's objective of decoupling scoring and filtering mechanisms.

Consider handling potential errors from get_scorer_and_filterer more gracefully instead of using unwrap(), which could cause panics in case of failure:

🏁 Script executed:
#!/bin/bash
# Search for other instances of error handling with get_scorer_and_filterer
rg "get_scorer_and_filterer.*unwrap" --type rust
Length of output: 2698
Enhance Error Handling for get_scorer_and_filterer

The refactor effectively decouples scoring and filtering by retrieving both components directly from the vector holder. However, our search indicates that using unwrap() with get_scorer_and_filterer is pervasive across the codebase (including in this file, e.g., in lib/segment/benches/hnsw_build_graph.rs on lines 31–33, and elsewhere). This approach could lead to panics if the function ever fails.

Action Items:

Review and refactor error handling for get_scorer_and_filterer both here and in other affected files (e.g., in lib/segment/src/index/hnsw_index/graph_layers.rs and lib/segment/src/index/hnsw_index/graph_layers_builder.rs).

Consider returning more informative errors or using methods like expect() with a clear message, or refactor to propagate errors gracefully.

lib/segment/src/vector_storage/async_raw_scorer.rs

generall · 2025-03-25T10:20:33Z

lib/segment/src/index/hnsw_index/point_filterer.rs

+            Some(f) => scores
+                .retain(|score| self.simple_filterer.check_vector(score.idx) && f.check(score.idx)),
+        }
+        if limit != 0 {


Why this condition is needed?

Shall we put it at the beginning of the function body without the condition?

Relevant code in the current version of this PR:

qdrant/lib/segment/src/index/hnsw_index/point_scorer.rs

Lines 133 to 141 in e2bce5a

pub fn score_points(

&mut self,

point_ids: &mut Vec<PointOffsetType>,

limit: usize,

) -> impl Iterator<Item = ScoredPointOffset> {

point_ids.retain(|point_id| self.check_vector(*point_id));

if limit != 0 {

point_ids.truncate(limit);

}

limit == 0 means "don't truncate scores". I also don't like, but this logic was there before this PR. I've added a comment in FilteredScorer::score_points about this.

(maybe limit: usize can be replaced with Option<NonZeroUsize>, but I don't want more bloat in this PR)

Shall we put it at the beginning of the function body without the condition?

No, retain-then-retain may produce a different result than truncate-then-retain.

lib/segment/src/vector_storage/raw_scorer.rs

generall · 2025-03-25T13:00:36Z

After review in-call:

Try to merge Filterer and RawScorer in the following way:

FIlteredScorer uses RawScorer to combine into a single call

            filterer.filter_scores(&mut scores, limit);
            scorer.score_points(&mut scores);

If RawScorer is needed without filters, it can be extracted (used from) FilteredScorer

is_stopped flag should be moved FilteredScorer
Move peek_top_iter and peek_top_all into FilteredScorer

For a separate PR:

Try to use query_scorer.score_stored_batch in score_points of RawScorerImpl

Copilot

Copilot reviewed 23 out of 23 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

lib/segment/src/vector_storage/raw_scorer.rs:941

[nitpick] Consider renaming this method (or updating its documentation) to emphasize that it fills an existing scores buffer in-place instead of returning a new vector; this would clarify that the scores slice must be pre-initialized (e.g. using ScoredPointOffset::new_unset).

fn score_points(&self, scores: &mut [ScoredPointOffset]) {

lib/segment/src/fixtures/index_fixtures.rs:170

[nitpick] Consider refactoring the repeated pattern of obtaining a RawScorer and its corresponding PointsFilterer into a dedicated helper function or utility to reduce boilerplate at call sites.

pub fn get_scorer_and_filterer(

lib/common/common/src/types.rs

lib/segment/src/index/hnsw_index/point_filterer.rs

timvisee · 2025-04-09T09:09:21Z

lib/segment/src/index/hnsw_index/point_filterer.rs

+            Some(f) => scores
+                .retain(|score| self.simple_filterer.check_vector(score.idx) && f.check(score.idx)),
+        }
+        if limit != 0 {


Shall we put it at the beginning of the function body without the condition?

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (6)

lib/common/common/src/iterator_ext/mod.rs (1)
47-55: Well-implemented benchmarking utility!

The black_box method is a good addition for benchmarking purposes. It correctly applies std::hint::black_box to each item in the iterator to prevent compiler optimizations during benchmarks.

If you prefer a more concise style, the closure could be simplified to:
-    self.for_each(|p| {
-        std::hint::black_box(p);
-    });
+    self.for_each(std::hint::black_box);
However, the current implementation is explicit and clear, so this is purely optional.
lib/segment/src/index/sparse_index/sparse_vector_index.rs (2)
302-309: Leverage quantized vectors if present for cheaper scoring

FilteredScorer::new accepts an optional quantized_vectors parameter, but we always pass None.
When the underlying storage has a pre‑built quantized representation, using it typically reduces RAM traffic and SIMD cost during scoring.
-        let scorer = FilteredScorer::new(
-            query_vector.clone(),
-            &vector_storage,
-            None,
-            None,
-            deleted_point_bitslice,
-            vector_query_context.hardware_counter(),
-        )?;
+        let scorer = FilteredScorer::new(
+            query_vector.clone(),
+            &vector_storage,
+            vector_storage.quantized_vectors(),   // <- returns `Option<&QuantizedVectors>`
+            None,
+            deleted_point_bitslice,
+            vector_query_context.hardware_counter(),
+        )?;
(If quantized_vectors() is not available on VectorStorageEnum, you can expose it behind a helper.)
This is a pure optimisation; functional behaviour remains identical.

318-323: Passing the iterator by reference is unnecessary

peek_top_iter takes its iterator by value, and &mut I implements Iterator only indirectly.
Simply move the iterator to the call‑site for slightly cleaner code and avoid the extra reference indirection:
-                let res = scorer.peek_top_iter(&mut filtered_points, top, &is_stopped)?;
+                let res = scorer.peek_top_iter(filtered_points, top, &is_stopped)?;
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)
52-83: Heavy match‑cascade could be centralised

The triple‑nested match on (datatype, distance) duplicates very similar branches 18 times.
Consider a helper such as
fn route<TElement, TMetric>(self, builder: fn(Self) -> OperationResult<Box<dyn RawScorer<'a>>>)
or a small macro to collapse repetition.
Less duplication eases future distance additions (e.g. CosineSquared) and reduces the chance of forgetting one of the 18 arms.
lib/segment/src/index/hnsw_index/point_scorer.rs (2)
143-152: Slice scores_buffer to avoid scanning unused entries

zip(&*point_ids, &self.scores_buffer) iterates the entire buffer, although only the first point_ids.len() slots are freshly written.
Use the same slice you gave to score_points:
-        std::iter::zip(&*point_ids, &self.scores_buffer)
+        std::iter::zip(&*point_ids, &self.scores_buffer[..point_ids.len()])
Saves needless cache traffic when the buffer had previously grown large.

147-152: scores_buffer never shrinks – potential long‑lived memory spike

The buffer only ever grows; if one request with 1 M points arrives, every subsequent tiny request will still carry a 1 M‑entry buffer.

Consider resetting it when point_ids.len() falls under a fraction of current capacity, or allocate fresh per call and reuse via thread‑local pool.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0586a80 and e2bce5a.

📒 Files selected for processing (24)

lib/common/common/src/iterator_ext/mod.rs (1 hunks)
lib/segment/benches/fixture.rs (2 hunks)
lib/segment/benches/hnsw_build_asymptotic.rs (7 hunks)
lib/segment/benches/hnsw_build_graph.rs (2 hunks)
lib/segment/benches/hnsw_search_graph.rs (4 hunks)
lib/segment/benches/scorer_mmap.rs (2 hunks)
lib/segment/benches/vector_search.rs (3 hunks)
lib/segment/src/fixtures/index_fixtures.rs (2 hunks)
lib/segment/src/index/hnsw_index/gpu/gpu_graph_builder.rs (3 hunks)
lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs (5 hunks)
lib/segment/src/index/hnsw_index/gpu/gpu_vector_storage/tests.rs (1 hunks)
lib/segment/src/index/hnsw_index/gpu/mod.rs (3 hunks)
lib/segment/src/index/hnsw_index/graph_layers.rs (6 hunks)
lib/segment/src/index/hnsw_index/graph_layers_builder.rs (7 hunks)
lib/segment/src/index/hnsw_index/hnsw.rs (13 hunks)
lib/segment/src/index/hnsw_index/point_scorer.rs (2 hunks)
lib/segment/src/index/hnsw_index/tests/mod.rs (2 hunks)
lib/segment/src/index/hnsw_index/tests/test_compact_graph_layer.rs (2 hunks)
lib/segment/src/index/plain_vector_index.rs (4 hunks)
lib/segment/src/index/sparse_index/sparse_vector_index.rs (4 hunks)
lib/segment/src/vector_storage/async_raw_scorer.rs (9 hunks)
lib/segment/src/vector_storage/dense/memmap_dense_vector_storage.rs (9 hunks)
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (10 hunks)
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (0 hunks)

💤 Files with no reviewable changes (1)

lib/segment/src/vector_storage/quantized/quantized_vectors.rs

✅ Files skipped from review due to trivial changes (3)

lib/segment/src/index/hnsw_index/tests/test_compact_graph_layer.rs
lib/segment/src/index/hnsw_index/graph_layers.rs
lib/segment/src/index/hnsw_index/graph_layers_builder.rs

🚧 Files skipped from review as they are similar to previous changes (10)

lib/segment/src/index/hnsw_index/gpu/mod.rs
lib/segment/benches/hnsw_build_graph.rs
lib/segment/src/index/hnsw_index/gpu/gpu_insert_context.rs
lib/segment/src/index/hnsw_index/tests/mod.rs
lib/segment/src/vector_storage/dense/memmap_dense_vector_storage.rs
lib/segment/benches/hnsw_search_graph.rs
lib/segment/src/fixtures/index_fixtures.rs
lib/segment/benches/hnsw_build_asymptotic.rs
lib/segment/src/index/hnsw_index/gpu/gpu_graph_builder.rs
lib/segment/src/index/hnsw_index/hnsw.rs

🧰 Additional context used

🧬 Code Graph Analysis (3)

lib/segment/benches/vector_search.rs (2)

lib/segment/src/vector_storage/dense/simple_dense_vector_storage.rs (1)

open_simple_dense_vector_storage (92-108)

lib/segment/src/index/hnsw_index/point_scorer.rs (1)

new_for_test (99-111)

lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)

lib/segment/src/vector_storage/raw_scorer.rs (1)

raw_scorer_from_query_scorer (423-434)

lib/segment/src/index/hnsw_index/point_scorer.rs (4)

lib/segment/src/common/operation_error.rs (1)

check_process_stopped (242-247)

lib/segment/src/vector_storage/raw_scorer.rs (2)

check_deleted_condition (736-747)

new_raw_scorer (51-121)

lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)

raw_scorer (166-180)

lib/common/common/src/fixed_length_priority_queue.rs (1)

top (80-82)

🔇 Additional comments (12)

lib/segment/src/index/hnsw_index/gpu/gpu_vector_storage/tests.rs (2)

696-698: Updated interface aligns with refactored RawScorer.

The call to raw_scorer now omits the point_deleted parameter, reflecting the architectural change where filtering logic has been moved from RawScorer to FilteredScorer. This simplifies the scoring interface.

699-701: Interface update correctly implements the filtering logic refactor.

The removal of the point_deleted parameter from new_raw_scorer_for_test is consistent with the PR's objective of moving filtering responsibility from RawScorer to FilteredScorer. This change simplifies scoring component interfaces.

lib/segment/benches/vector_search.rs (3)

15-15: Updated imports to align with the refactored scorer architecture

The changes reflect the refactoring where FilteredScorer now handles both scoring and filtering logic, replacing the previous direct import and usage of new_raw_scorer_for_test.

Also applies to: 18-18

69-73: Direct usage of FilteredScorer simplifies benchmark code

The benchmark now uses FilteredScorer::new_for_test directly instead of new_raw_scorer_for_test, which aligns with the refactored architecture where filtering is integrated into the scorer component.

92-96: Simplified scorer construction in random access benchmark

Consistent change to use FilteredScorer::new_for_test directly, making the code more straightforward by removing the unwrapping that was previously needed when using new_raw_scorer_for_test.

lib/segment/src/index/plain_vector_index.rs (3)

9-9: Updated imports to align with refactored scorer architecture

The changes reflect the refactoring where FilteredScorer now handles both scoring and filtering logic, replacing the previous direct usage of new_raw_scorer.

Also applies to: 22-22

113-120: FilteredScorer now handles both scoring and filtering in a unified interface

The FilteredScorer::new constructor now takes additional parameters for filter context that were previously handled separately. This consolidation simplifies the search logic and aligns with the PR objective of moving filtering logic from RawScorer to FilteredScorer.

141-148: Consistent use of FilteredScorer for unfiltered searches

The same pattern is applied for unfiltered searches, using None for filter-related parameters. This unified approach maintains consistency and simplifies the codebase by having a single entry point for scorer creation.

lib/segment/benches/scorer_mmap.rs (2)

13-13: Updated imports to align with refactored scorer architecture

The changes reflect the refactoring where FilteredScorer now handles both scoring and filtering logic, replacing the previous direct import and usage of new_raw_scorer_for_test.

Also applies to: 16-16

66-70: Direct usage of FilteredScorer simplifies mmap benchmark code

The benchmark now uses FilteredScorer::new_for_test directly instead of new_raw_scorer_for_test, which aligns with the refactored architecture where filtering is integrated into the scorer component.

lib/segment/benches/fixture.rs (2)

8-8: Updated imports to align with refactored scorer architecture

The imports are simplified by removing FakeFilterContext and FilteredScorer, as these are now handled internally by TestRawScorerProducer.

63-63: Simplified scorer acquisition in graph building

The code now directly gets a scorer from the vector holder instead of the previous two-step process (creating a raw scorer and then wrapping it with a filtered scorer). This simplification makes the code more readable and aligns with the PR's goal of consolidating filter and scorer functionality.

lib/segment/src/vector_storage/async_raw_scorer.rs

lib/segment/src/index/hnsw_index/point_scorer.rs

Copilot

Pull Request Overview

This PR refactors the scoring and filtering logic by moving several filtering responsibilities out of RawScorer and into FilteredScorer, resulting in a more streamlined scorer API and improved performance in certain code paths. Key changes include replacing new_raw_scorer calls with FilteredScorer::new (and its variants), updating methods such as peek_top_iter/peek_top_all, and adjusting tests and benchmarks to use the new API.

Reviewed Changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
lib/segment/src/index/sparse_index/sparse_vector_index.rs	Updated scorer construction and method calls to use FilteredScorer instead of RawScorer.
lib/segment/src/index/plain_vector_index.rs	Replaced new_raw_scorer with FilteredScorer::new and removed unused filter parameters.
lib/segment/src/index/hnsw_index/*	Across multiple files (tests, hnsw.rs, graph_layers, gpu modules), updated scorer construction and related API calls to use the new FilteredScorer interface.
lib/segment/src/index/hnsw_index/point_scorer.rs	Modified FilteredScorer to internally filter point lists and return an iterator of scored points rather than a slice.
Various benchmarks and fixture files	Adjusted scorer invocations to match the new API, removing legacy FakeFilterContext usage.

xzfc requested review from generall and timvisee March 24, 2025 19:53

xzfc force-pushed the separate-scorer-filterer branch from 9bf697e to 0586a80 Compare March 24, 2025 19:55

coderabbitai bot reviewed Mar 24, 2025

View reviewed changes

coszio self-requested a review March 24, 2025 20:02

generall reviewed Mar 25, 2025

View reviewed changes

lib/segment/src/vector_storage/async_raw_scorer.rs Outdated Show resolved Hide resolved

generall reviewed Mar 25, 2025

View reviewed changes

lib/segment/src/vector_storage/async_raw_scorer.rs Outdated Show resolved Hide resolved

generall reviewed Mar 25, 2025

View reviewed changes

lib/segment/src/vector_storage/raw_scorer.rs Outdated Show resolved Hide resolved

generall reviewed Mar 25, 2025

View reviewed changes

lib/segment/src/vector_storage/raw_scorer.rs Outdated Show resolved Hide resolved

This was referenced Apr 2, 2025

Use MaybeUninit in vector storages and query scorers #6303

Merged

Remove is_stopped from RawScorer #6305

Merged

timvisee requested a review from Copilot April 9, 2025 09:20

Copilot AI reviewed Apr 9, 2025

View reviewed changes

timvisee reviewed Apr 9, 2025

View reviewed changes

xzfc marked this pull request as draft April 14, 2025 10:24

xzfc force-pushed the separate-scorer-filterer branch from 0586a80 to 972a598 Compare April 17, 2025 11:31

xzfc added 2 commits April 22, 2025 12:00

Move filtering logic from RawScorer to FilteredScorer

214d5a0

Fixups

e2bce5a

xzfc force-pushed the separate-scorer-filterer branch from 972a598 to e2bce5a Compare April 22, 2025 19:05

xzfc changed the title ~~Separate scorer and filterer~~ Move filtering logic from RawScorer to FilteredScorer Apr 22, 2025

xzfc marked this pull request as ready for review April 22, 2025 22:45

coderabbitai bot reviewed Apr 22, 2025

View reviewed changes

lib/segment/src/vector_storage/async_raw_scorer.rs Show resolved Hide resolved

lib/segment/src/index/hnsw_index/point_scorer.rs Show resolved Hide resolved

Restore comment

45a9cc3

xzfc requested a review from Copilot April 22, 2025 23:23

Copilot AI reviewed Apr 22, 2025

View reviewed changes

github-actions bot mentioned this pull request Apr 22, 2025

Flaky test multivector_filtrable_hnsw_test::test_multi_filterable_hnsw::case_5_recobestscore_eq #6318

Closed

generall approved these changes Apr 23, 2025

View reviewed changes

xzfc merged commit 44341bb into dev Apr 23, 2025
20 checks passed

xzfc deleted the separate-scorer-filterer branch April 23, 2025 18:23

generall pushed a commit that referenced this pull request May 22, 2025

Move filtering logic from RawScorer to FilteredScorer (#6245)

ce1adf4

This was referenced Jul 21, 2025

UUID index bug #6916

Merged

bump version to v1.15.1 #6934

Merged

QueryScorerBytes #6945

Merged

Update collection-level payload index schema after recovering partial snapshot #6886

Merged

This was referenced Jul 29, 2025

Bump dev version to 1.15.2-dev #6955

Merged

TestRawScorerProducer to use VectorStorageEnum #6965

Merged

coderabbitai bot mentioned this pull request Aug 18, 2025

Optional scorer methods via TBool #6989

Merged

coderabbitai bot mentioned this pull request Nov 14, 2025

Batched scans for dense vectors #7514

Merged

9 tasks

This was referenced Nov 21, 2025

Use batching searcher for sparse vector search #7572

Merged

Fix search in empty HNSW segments #7620

Merged

	let mut vectors = [MaybeUninit::uninit(); VECTOR_READ_BATCH_SIZE];

	let vectors = self
	.vector_storage
	.get_dense_batch(ids, &mut vectors[..ids.len()]);

	pub fn score_points(
	&mut self,
	point_ids: &mut Vec<PointOffsetType>,
	limit: usize,
	) -> impl Iterator<Item = ScoredPointOffset> {
	point_ids.retain(\|point_id\| self.check_vector(*point_id));
	if limit != 0 {
	point_ids.truncate(limit);
	}

Conversation

xzfc commented Mar 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Current version

In this PR

Removed temporary buffer FilteredScorer::points_buffer

Performance

Criterion benchmark results

Performance on glove-100

Why

Filter order

Uh oh!

coderabbitai bot commented Mar 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Suggested reviewers

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

generall Mar 25, 2025

Choose a reason for hiding this comment

Uh oh!

timvisee Apr 9, 2025

Choose a reason for hiding this comment

Uh oh!

xzfc Apr 22, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

generall commented Mar 25, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

timvisee Apr 9, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

xzfc commented Mar 24, 2025 •

edited

Loading

Removed temporary buffer `FilteredScorer::points_buffer`

coderabbitai bot commented Mar 24, 2025 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)