QueryScorerBytes#6945

Merged
xzfc merged 4 commits into dev from hnsw-links-FilteredQuantizedScorer
Jul 31, 2025
Conversation

Member

@xzfc xzfc commented Jul 28, 2025

This PR implements scorers required for the vectors-in-graph feature.

It introduces a new trait, QueryScorerBytes, that scores an encoded query against a vector encoded as &[u8].

The new trait is integrated far enough to provide two variations of FilteredScorer:

  • The old one is the regular FilteredScorer, which scores against a PointOffsetType (the vector is retrieved from the vector storage). No changes in logic here.
  • The new one is FilteredQuantizedScorer, which scores against a vector encoded as &[u8]. Later, these vectors will be stored alongside the graph links.

Supported quantizations: binary, scalar, and product.
Only single vectors are supported, not multi-vectors.
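
A minimal sketch of the idea, for illustration only. The trait name and the score_bytes signature follow this PR; the ScoreType alias being f32, the DotU8Scorer struct, and the score_filtered helper are assumptions made up for this example and are not part of the actual change.

```rust
/// Assumption: the score type is a plain f32 in this sketch.
type ScoreType = f32;

/// Mirrors the trait named in this PR: score a query against raw encoded bytes.
trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
}

/// Hypothetical scorer holding a query that is already encoded as u8 values.
struct DotU8Scorer {
    query: Vec<u8>,
}

impl QueryScorerBytes for DotU8Scorer {
    /// Dot product over u8 components; each product is widened to u32
    /// so a single multiplication cannot overflow.
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType {
        let dot: u32 = self
            .query
            .iter()
            .zip(bytes)
            .map(|(&q, &v)| u32::from(q) * u32::from(v))
            .sum();
        dot as ScoreType
    }
}

/// Hypothetical helper: score only the candidates accepted by `allowed`.
/// Candidates carry their own encoded bytes, so no vector storage lookup is needed.
fn score_filtered<'a, S: QueryScorerBytes>(
    scorer: &S,
    candidates: impl IntoIterator<Item = (u32, &'a [u8])>,
    allowed: impl Fn(u32) -> bool,
) -> Vec<(u32, ScoreType)> {
    candidates
        .into_iter()
        .filter(|(id, _)| allowed(*id))
        .map(|(id, bytes)| (id, scorer.score_bytes(bytes)))
        .collect()
}
```

The point of the sketch is that the scorer never asks a storage for the vector: the caller hands over the bytes, which is what would make storing vectors next to graph links possible.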

@xzfc xzfc added this to the Graph with vectors milestone Jul 28, 2025
@xzfc xzfc requested review from IvanPleshkov and generall July 28, 2025 05:47
Contributor

coderabbitai bot commented Jul 28, 2025

Walkthrough

This change introduces a new trait, EncodedVectorsBytes, which extends the existing EncodedVectors trait to enable scoring directly against raw byte slices. Implementations of this trait are provided for various quantized vector types, refactoring their scoring methods to delegate logic to the new score_point_vs_bytes method. A parallel trait, QueryScorerBytes, is also introduced to allow scoring from byte slices at the scorer level. The scorer builder and quantized vector modules are updated to construct and utilize these new byte-based scorers. Additional refactoring includes modularizing vector data parsing and offset handling, making the scorer infrastructure generic over scorer types, and exposing relevant modules and traits publicly.
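
The delegation described above could look roughly like the following sketch. The trait names and score_point_vs_bytes come from this PR; the simplified trait shapes, the BinaryVectors struct, its fields, and the bit-matching score are assumptions for illustration, and the hardware counter parameter is omitted for brevity.

```rust
/// Simplified stand-in for the existing trait (hardware counter omitted).
trait EncodedVectors {
    type EncodedQuery;
    /// Score a query against the stored vector at `index`.
    fn score_point(&self, query: &Self::EncodedQuery, index: usize) -> f32;
}

/// Extension trait: score against bytes supplied by the caller.
trait EncodedVectorsBytes: EncodedVectors {
    fn score_point_vs_bytes(&self, query: &Self::EncodedQuery, bytes: &[u8]) -> f32;
}

/// Hypothetical binary-quantized storage: vectors packed back to back.
struct BinaryVectors {
    bytes_per_vector: usize,
    data: Vec<u8>,
}

impl EncodedVectors for BinaryVectors {
    type EncodedQuery = Vec<u8>;

    /// Indexed scoring only locates the bytes, then delegates.
    fn score_point(&self, query: &Self::EncodedQuery, index: usize) -> f32 {
        let start = index * self.bytes_per_vector;
        let bytes = &self.data[start..start + self.bytes_per_vector];
        self.score_point_vs_bytes(query, bytes)
    }
}

impl EncodedVectorsBytes for BinaryVectors {
    /// Illustrative score: number of matching bits (total bits minus Hamming distance).
    fn score_point_vs_bytes(&self, query: &Self::EncodedQuery, bytes: &[u8]) -> f32 {
        let differing: u32 = query
            .iter()
            .zip(bytes)
            .map(|(q, b)| (q ^ b).count_ones())
            .sum();
        (bytes.len() * 8) as f32 - differing as f32
    }
}
```

Keeping the scoring logic in one place means the indexed and the byte-based paths cannot drift apart.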

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

chore

Suggested reviewers

  • generall

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 37a207a and ff3a93e.

📒 Files selected for processing (4)
  • lib/segment/src/index/hnsw_index/point_scorer.rs (3 hunks)
  • lib/segment/src/vector_storage/mod.rs (1 hunks)
  • lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (6 hunks)
  • lib/segment/src/vector_storage/quantized/quantized_vectors.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • lib/segment/src/vector_storage/mod.rs
  • lib/segment/src/vector_storage/quantized/quantized_vectors.rs
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: coszio
PR: qdrant/qdrant#6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In the AsyncRawScorerImpl implementation, the unwrap() call on read_vectors_async results is intentional, with an explanatory comment noting that this experimental feature is meant to crash rather than silently fall back to synchronous implementation.
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In AsyncRawScorerImpl, the unwrap() call after read_vectors_async is intentional. The io_uring feature is experimental, and the code is designed to panic rather than silently fall back to a synchronous implementation if it fails, directing users to use the default IO implementation instead.
lib/segment/src/index/hnsw_index/point_scorer.rs (5)

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

Learnt from: xzfc
PR: #6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In AsyncRawScorerImpl, the unwrap() call after read_vectors_async is intentional. The io_uring feature is experimental, and the code is designed to panic rather than silently fall back to a synchronous implementation if it fails, directing users to use the default IO implementation instead.

Learnt from: xzfc
PR: #6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In the AsyncRawScorerImpl implementation, the unwrap() call on read_vectors_async results is intentional, with an explanatory comment noting that this experimental feature is meant to crash rather than silently fall back to synchronous implementation.

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (4)

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:122-127
Timestamp: 2025-07-02T16:42:22.247Z
Learning: In lib/quantization/cpp/sse.c asymmetric binary quantization functions, the argument order in _mm_set_epi32 calls is intentionally designed to match query pointer offsets with corresponding bit shift factors. The "numbers" refer to logical indices that correspond to weighting factors (1, 2, 4, 8, etc. representing bit shifts), not SIMD register positions.

🧬 Code Graph Analysis (1)
lib/segment/src/index/hnsw_index/point_scorer.rs (3)
lib/segment/src/vector_storage/raw_scorer.rs (1)
  • check_deleted_condition (439-450)
lib/segment/src/data_types/query_context.rs (1)
  • hardware_counter (208-210)
lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (1)
  • new (29-55)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: e2e-tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: test-consistency
  • GitHub Check: lint
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (macos-latest)
🔇 Additional comments (10)
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (5)

2-2: LGTM! Import additions are appropriate.

The new imports support the byte-based scoring functionality correctly.

Also applies to: 7-7, 22-22


56-56: LGTM! Method rename improves clarity.

Renaming build to build_raw_scorer makes the method's purpose clearer and distinguishes it from the new byte-based scorer builder.


89-134: LGTM! Well-structured byte-based scorer builder.

The implementation follows the established pattern and correctly handles all supported vector datatypes and distance metrics.


181-214: LGTM! Proper handling of storage variants with documented multi-vector limitation.

The method correctly delegates to new_quantized_scorer_bytes for supported storage types and appropriately excludes multi-quantized storages, which aligns with the PR objectives.


287-356: LGTM! Consistent implementation for byte-based query scorers.

The method properly handles all query vector types and maintains consistency with the existing new_quantized_scorer pattern while correctly returning QueryScorerBytes trait objects.

lib/segment/src/index/hnsw_index/point_scorer.rs (5)

15-15: LGTM! Import addition is necessary.

The QueryScorerBytes import is required for the new quantized scorer functionality.


40-56: LGTM! Clean generic refactoring maintaining backward compatibility.

The type aliases preserve existing API while the generic FilteredScorerImpl enables support for both raw scorers and byte-based scorers.
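
As a rough illustration of this pattern, the sketch below uses one generic struct with two aliases. FilteredScorerImpl, FilteredScorer, FilteredQuantizedScorer, and check_vector are names from this PR; the fields, the simplified traits, the method signatures, and the IdScorer and ByteSum example scorers are assumptions, not the actual code in point_scorer.rs.

```rust
/// Simplified stand-ins for the two scorer traits (assumed shapes).
trait RawScorer {
    fn score_point(&self, point: u32) -> f32;
}
trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> f32;
}

/// One generic struct holds the filtering state; the scorer type varies.
struct FilteredScorerImpl<S> {
    scorer: S,
    deleted: Vec<bool>, // hypothetical filter state
}

/// Aliases keep the old name usable and add the byte-based variant.
type FilteredScorer = FilteredScorerImpl<Box<dyn RawScorer>>;
type FilteredQuantizedScorer = FilteredScorerImpl<Box<dyn QueryScorerBytes>>;

impl<S> FilteredScorerImpl<S> {
    /// Shared filtering logic; unknown points are treated as filtered out.
    fn check_vector(&self, point: u32) -> bool {
        !self.deleted.get(point as usize).copied().unwrap_or(true)
    }
}

impl FilteredScorer {
    fn score_point(&self, point: u32) -> Option<f32> {
        self.check_vector(point).then(|| self.scorer.score_point(point))
    }
}

impl FilteredQuantizedScorer {
    fn score_bytes(&self, point: u32, bytes: &[u8]) -> Option<f32> {
        self.check_vector(point).then(|| self.scorer.score_bytes(bytes))
    }
}

/// Example scorers used only to exercise the sketch.
struct IdScorer;
impl RawScorer for IdScorer {
    fn score_point(&self, point: u32) -> f32 {
        point as f32
    }
}
struct ByteSum;
impl QueryScorerBytes for ByteSum {
    fn score_bytes(&self, bytes: &[u8]) -> f32 {
        bytes.iter().map(|&b| f32::from(b)).sum()
    }
}
```

Existing call sites that name FilteredScorer keep compiling, while the filtering code is written once for both variants.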


58-68: LGTM! Appropriate method placement in generic implementation.

Moving check_vector to the generic implementation is logical since this filtering logic applies to all scorer types.


71-87: LGTM! Well-structured constructor for quantized scorers.

The new_quantized method follows established patterns and properly handles error cases through OperationResult.


106-302: LGTM! Raw scorer implementation preserved correctly.

The existing RawScorer functionality is maintained while adapting to the new generic structure, ensuring backward compatibility.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (4)
lib/quantization/src/encoded_vectors.rs (1)

55-62: Consider adding documentation for the new trait and method.

The EncodedVectorsBytes trait and its score_point_vs_bytes method lack documentation. Adding doc comments would help clarify the purpose and usage of this byte-based scoring interface.

Consider adding documentation like this:

+/// Extension trait for `EncodedVectors` that enables scoring directly against raw byte slices.
+/// This is useful for scoring quantized vectors stored as bytes without requiring indexed access.
 pub trait EncodedVectorsBytes: EncodedVectors {
+    /// Score a query against an encoded vector represented as a raw byte slice.
+    /// 
+    /// # Parameters
+    /// - `query`: The encoded query to score against
+    /// - `bytes`: Raw byte slice representing an encoded vector
+    /// - `hw_counter`: Hardware counter for performance tracking
+    /// 
+    /// # Returns
+    /// Similarity score between the query and the encoded vector
     fn score_point_vs_bytes(
lib/segment/src/vector_storage/query_scorer/mod.rs (1)

40-42: Consider adding documentation for the new trait.

The QueryScorerBytes trait lacks documentation. Adding doc comments would help clarify its purpose and relationship to the existing QueryScorer trait.

Consider adding documentation like this:

+/// Trait for scoring queries directly against raw byte slices representing encoded vectors.
+/// This complements the existing `QueryScorer` trait by enabling scoring on byte-encoded data
+/// without requiring indexed vector storage access.
 pub trait QueryScorerBytes {
+    /// Score a query against an encoded vector represented as a raw byte slice.
+    /// 
+    /// # Parameters
+    /// - `bytes`: Raw byte slice representing an encoded vector
+    /// 
+    /// # Returns
+    /// Similarity score between the query and the encoded vector
     fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)

181-214: Appropriate handling of multi-vector cases.

The method correctly returns an error for multi-vector storage types, which aligns with the PR objectives stating that multi-vector support is not included. Consider adding a TODO comment to track future multi-vector support.

Consider adding a comment to document the limitation:

            QuantizedVectorStorage::ScalarRamMulti(_)
            | QuantizedVectorStorage::ScalarMmapMulti(_)
            | QuantizedVectorStorage::PQRamMulti(_)
            | QuantizedVectorStorage::PQMmapMulti(_)
            | QuantizedVectorStorage::BinaryRamMulti(_)
-            | QuantizedVectorStorage::BinaryMmapMulti(_) => Err(OperationError::WrongMulti),
+            | QuantizedVectorStorage::BinaryMmapMulti(_) => {
+                // TODO: Add multi-vector support for QueryScorerBytes
+                Err(OperationError::WrongMulti)
+            }
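
For readers unfamiliar with the pattern, here is a self-contained sketch of rejecting multi-vector storages up front. OperationError::WrongMulti and the variant naming follow the snippet above; the trimmed-down enum, the BytesScorerKind type, and the bytes_scorer_kind function are hypothetical and exist only for this example.

```rust
#[derive(Debug, PartialEq)]
enum OperationError {
    WrongMulti,
}

/// Trimmed-down stand-in for the storage enum (payloads omitted).
enum QuantizedVectorStorage {
    ScalarRam,
    BinaryMmap,
    ScalarRamMulti,
    BinaryMmapMulti,
}

/// Hypothetical result type: which byte-based scorer would be built.
#[derive(Debug, PartialEq)]
enum BytesScorerKind {
    Scalar,
    Binary,
}

fn bytes_scorer_kind(
    storage: &QuantizedVectorStorage,
) -> Result<BytesScorerKind, OperationError> {
    match storage {
        QuantizedVectorStorage::ScalarRam => Ok(BytesScorerKind::Scalar),
        QuantizedVectorStorage::BinaryMmap => Ok(BytesScorerKind::Binary),
        // Multi-vector storages are not supported by the byte-based scorer yet.
        QuantizedVectorStorage::ScalarRamMulti
        | QuantizedVectorStorage::BinaryMmapMulti => Err(OperationError::WrongMulti),
    }
}
```

Because the match is exhaustive, adding a new storage variant forces a decision on whether it supports byte-based scoring.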
lib/quantization/src/encoded_vectors_u8.rs (1)

495-568: Consider documenting the data size precondition.

The implementation is well-structured with proper CPU feature detection and hardware counter tracking. However, the debug_assert at line 506 only validates in debug builds.

Add a comment about the expected data format:

     ) -> f32 {
         hw_counter
             .cpu_counter()
             .incr_delta(self.metadata.vector_parameters.dim);
 
+        // Expects bytes to contain: [f32 offset][u8 vector data...]
         debug_assert!(bytes.len() >= std::mem::size_of::<f32>() + self.metadata.actual_dim);
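
If a check that also holds in release builds is ever wanted, one option is a checked split of the buffer. The sketch below assumes the [f32 offset][u8 vector data...] layout mentioned in the comment above and native byte order for the offset; the split_offset_and_data function is hypothetical and not part of this PR.

```rust
/// Split an encoded vector into its f32 offset and `actual_dim` data bytes.
/// Returns None when the buffer is too short, in any build profile.
/// Assumption: the offset is stored in native byte order.
fn split_offset_and_data(bytes: &[u8], actual_dim: usize) -> Option<(f32, &[u8])> {
    const OFFSET_SIZE: usize = std::mem::size_of::<f32>();
    if bytes.len() < OFFSET_SIZE + actual_dim {
        return None;
    }
    let (head, tail) = bytes.split_at(OFFSET_SIZE);
    // Copy the header into a fixed-size array so it can be decoded as f32.
    let mut raw = [0u8; OFFSET_SIZE];
    raw.copy_from_slice(head);
    Some((f32::from_ne_bytes(raw), &tail[..actual_dim]))
}
```

Whether the extra branch is acceptable on a hot scoring path is a separate question; the debug_assert keeps release builds free of it.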
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9bd506d and 37a207a.

📒 Files selected for processing (12)
  • lib/quantization/src/encoded_vectors.rs (1 hunks)
  • lib/quantization/src/encoded_vectors_binary.rs (3 hunks)
  • lib/quantization/src/encoded_vectors_pq.rs (6 hunks)
  • lib/quantization/src/encoded_vectors_u8.rs (10 hunks)
  • lib/quantization/src/lib.rs (1 hunks)
  • lib/segment/src/index/hnsw_index/point_scorer.rs (3 hunks)
  • lib/segment/src/vector_storage/mod.rs (1 hunks)
  • lib/segment/src/vector_storage/quantized/quantized_custom_query_scorer.rs (2 hunks)
  • lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (2 hunks)
  • lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (6 hunks)
  • lib/segment/src/vector_storage/quantized/quantized_vectors.rs (3 hunks)
  • lib/segment/src/vector_storage/query_scorer/mod.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: coszio
PR: qdrant/qdrant#6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In the AsyncRawScorerImpl implementation, the unwrap() call on read_vectors_async results is intentional, with an explanatory comment noting that this experimental feature is meant to crash rather than silently fall back to synchronous implementation.
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In AsyncRawScorerImpl, the unwrap() call after read_vectors_async is intentional. The io_uring feature is experimental, and the code is designed to panic rather than silently fall back to a synchronous implementation if it fails, directing users to use the default IO implementation instead.
lib/segment/src/vector_storage/mod.rs (2)

Learnt from: timvisee
PR: #6503
File: lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs:60-74
Timestamp: 2025-05-12T12:54:27.872Z
Learning: In the Qdrant codebase, using pub(super) visibility is preferred when fields need to be accessed by sibling modules (particularly for index type conversions), as it provides the necessary access without bloating the interface with numerous getters.

Learnt from: xzfc
PR: #6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In AsyncRawScorerImpl, the unwrap() call after read_vectors_async is intentional. The io_uring feature is experimental, and the code is designed to panic rather than silently fall back to a synchronous implementation if it fails, directing users to use the default IO implementation instead.

lib/quantization/src/encoded_vectors.rs (2)

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

lib/segment/src/vector_storage/query_scorer/mod.rs (2)

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (2)

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

lib/quantization/src/lib.rs (2)

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: timvisee
PR: #6503
File: lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs:60-74
Timestamp: 2025-05-12T12:54:27.872Z
Learning: In the Qdrant codebase, using pub(super) visibility is preferred when fields need to be accessed by sibling modules (particularly for index type conversions), as it provides the necessary access without bloating the interface with numerous getters.

lib/segment/src/vector_storage/quantized/quantized_custom_query_scorer.rs (2)

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

lib/quantization/src/encoded_vectors_binary.rs (7)

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:122-127
Timestamp: 2025-07-02T16:42:22.247Z
Learning: In lib/quantization/cpp/sse.c asymmetric binary quantization functions, the argument order in _mm_set_epi32 calls is intentionally designed to match query pointer offsets with corresponding bit shift factors. The "numbers" refer to logical indices that correspond to weighting factors (1, 2, 4, 8, etc. representing bit shifts), not SIMD register positions.

Learnt from: coszio
PR: #6565
File: lib/posting_list/src/builder.rs:63-67
Timestamp: 2025-05-26T14:47:23.505Z
Learning: In the posting_list crate's PostingChunk struct, avoid adding extra fields like storing computed chunk_bits to prevent struct bloat and maintain mmap compatibility. The bit calculations from offsets are inexpensive compared to the memory and compatibility benefits of keeping the struct minimal.

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:428-431
Timestamp: 2025-07-02T17:08:10.839Z
Learning: In lib/quantization/cpp/ SIMD functions, IvanPleshkov prefers to avoid memcpy in favor of direct pointer casts for type punning, prioritizing potential compiler optimization over strict aliasing rule compliance in performance-critical quantization code.

Learnt from: coszio
PR: #6528
File: lib/posting_list/src/view.rs:118-118
Timestamp: 2025-05-19T14:40:20.068Z
Learning: In the bitpacking crate, the BitPacker::decompress_strictly_sorted function takes an Option as its first parameter, which means using checked_sub(1) without unwrapping is intentional and correct.

lib/segment/src/vector_storage/quantized/quantized_vectors.rs (5)

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

Learnt from: xzfc
PR: #6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In AsyncRawScorerImpl, the unwrap() call after read_vectors_async is intentional. The io_uring feature is experimental, and the code is designed to panic rather than silently fall back to a synchronous implementation if it fails, directing users to use the default IO implementation instead.

Learnt from: xzfc
PR: #6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In the AsyncRawScorerImpl implementation, the unwrap() call on read_vectors_async results is intentional, with an explanatory comment noting that this experimental feature is meant to crash rather than silently fall back to synchronous implementation.

lib/segment/src/index/hnsw_index/point_scorer.rs (5)

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: xzfc
PR: #6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In AsyncRawScorerImpl, the unwrap() call after read_vectors_async is intentional. The io_uring feature is experimental, and the code is designed to panic rather than silently fall back to a synchronous implementation if it fails, directing users to use the default IO implementation instead.

Learnt from: xzfc
PR: #6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In the AsyncRawScorerImpl implementation, the unwrap() call on read_vectors_async results is intentional, with an explanatory comment noting that this experimental feature is meant to crash rather than silently fall back to synchronous implementation.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (4)

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:122-127
Timestamp: 2025-07-02T16:42:22.247Z
Learning: In lib/quantization/cpp/sse.c asymmetric binary quantization functions, the argument order in _mm_set_epi32 calls is intentionally designed to match query pointer offsets with corresponding bit shift factors. The "numbers" refer to logical indices that correspond to weighting factors (1, 2, 4, 8, etc. representing bit shifts), not SIMD register positions.

lib/quantization/src/encoded_vectors_u8.rs (5)

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:428-431
Timestamp: 2025-07-02T17:08:10.839Z
Learning: In lib/quantization/cpp/ SIMD functions, IvanPleshkov prefers to avoid memcpy in favor of direct pointer casts for type punning, prioritizing potential compiler optimization over strict aliasing rule compliance in performance-critical quantization code.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:122-127
Timestamp: 2025-07-02T16:42:22.247Z
Learning: In lib/quantization/cpp/sse.c asymmetric binary quantization functions, the argument order in _mm_set_epi32 calls is intentionally designed to match query pointer offsets with corresponding bit shift factors. The "numbers" refer to logical indices that correspond to weighting factors (1, 2, 4, 8, etc. representing bit shifts), not SIMD register positions.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.

lib/quantization/src/encoded_vectors_pq.rs (4)

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:122-127
Timestamp: 2025-07-02T16:42:22.247Z
Learning: In lib/quantization/cpp/sse.c asymmetric binary quantization functions, the argument order in _mm_set_epi32 calls is intentionally designed to match query pointer offsets with corresponding bit shift factors. The "numbers" refer to logical indices that correspond to weighting factors (1, 2, 4, 8, etc. representing bit shifts), not SIMD register positions.

Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:428-431
Timestamp: 2025-07-02T17:08:10.839Z
Learning: In lib/quantization/cpp/ SIMD functions, IvanPleshkov prefers to avoid memcpy in favor of direct pointer casts for type punning, prioritizing potential compiler optimization over strict aliasing rule compliance in performance-critical quantization code.

Learnt from: coszio
PR: #6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::() because it guarantees memory layout equals byte representation, making serialization safe and correct.

🧬 Code Graph Analysis (2)
lib/quantization/src/encoded_vectors.rs (1)
lib/segment/src/vector_storage/query_scorer/metric_query_scorer.rs (3)
  • score (93-96)
  • score_internal (98-103)
  • score_stored (69-73)
lib/segment/src/vector_storage/query_scorer/mod.rs (5)
lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (1)
  • score_bytes (105-108)
lib/segment/src/vector_storage/quantized/quantized_custom_query_scorer.rs (1)
  • score_bytes (121-126)
lib/segment/src/vector_storage/query/mod.rs (1)
  • Query (29-33)
lib/segment/src/vector_storage/query_scorer/custom_query_scorer.rs (1)
  • function (49-55)
lib/segment/src/vector_storage/raw_scorer.rs (1)
  • RawScorer (31-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: lint
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: e2e-tests
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests
🔇 Additional comments (23)
lib/segment/src/vector_storage/mod.rs (1)

22-22: LGTM! Module visibility change enables access to new trait.

Making the query_scorer module public is necessary to expose the new QueryScorerBytes trait to external consumers. This change appropriately supports the byte-based scoring functionality being introduced.

lib/quantization/src/lib.rs (1)

14-14: LGTM! Proper public re-export of the new trait.

Adding EncodedVectorsBytes to the public re-exports follows the established pattern and correctly exposes the new trait at the crate root level for external consumers.

lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (2)

10-10: LGTM! Proper import of the new trait.

The import of QueryScorerBytes is correctly added to support the new trait implementation.


101-109: LGTM! Clean implementation of byte-based scoring.

The QueryScorerBytes implementation correctly:

  • Uses appropriate trait bounds requiring EncodedVectorsBytes
  • Delegates to the underlying score_point_vs_bytes method
  • Properly passes through the hardware counter for performance tracking
  • Follows the established delegation pattern used in other scorer methods
lib/segment/src/vector_storage/quantized/quantized_custom_query_scorer.rs (1)

113-127: LGTM! Clean implementation of byte-based scoring.

The QueryScorerBytes implementation correctly extends the custom query scorer to support scoring against raw byte slices, maintaining consistency with the existing pattern.

lib/quantization/src/encoded_vectors_binary.rs (2)

834-845: Clean refactoring of scoring logic.

The extraction of scoring logic to score_point_vs_bytes properly separates data retrieval from computation, enabling reuse while maintaining the original behavior.


884-913: Well-structured trait implementation for byte-based scoring.

The EncodedVectorsBytes implementation correctly preserves all scoring logic while operating on raw bytes. The hardware counter tracking and query type dispatching are properly maintained.
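The refactoring described in these two comments can be sketched as follows. This is a minimal illustration, not the PR's code: `ToyEncoded`, `ToyQuery`, and the bit-matching score are hypothetical stand-ins, and hardware counter tracking is omitted. Only the method names `score_point`, `score_point_vs_bytes`, and `get_vector_data` mirror names that appear in this review.

```rust
/// Toy stand-in for a quantized storage: fixed-size encoded vectors in one buffer.
pub struct ToyEncoded {
    data: Vec<u8>,
    vector_size: usize,
}

/// Toy stand-in for an encoded query.
pub struct ToyQuery(pub Vec<u8>);

impl ToyEncoded {
    pub fn new(data: Vec<u8>, vector_size: usize) -> Self {
        assert!(vector_size > 0 && data.len() % vector_size == 0);
        Self { data, vector_size }
    }

    /// Data retrieval only: locate the bytes of vector `i` in the buffer.
    fn get_vector_data(&self, i: usize) -> &[u8] {
        &self.data[i * self.vector_size..(i + 1) * self.vector_size]
    }

    /// Computation only: score the query against caller-provided bytes.
    /// The score here is the number of matching bits (an assumption for the demo).
    pub fn score_point_vs_bytes(&self, query: &ToyQuery, bytes: &[u8]) -> f32 {
        query
            .0
            .iter()
            .zip(bytes)
            .map(|(q, b)| (!(q ^ b)).count_ones() as f32)
            .sum()
    }

    /// Index-based entry point, reduced to a thin wrapper over the byte-based one.
    pub fn score_point(&self, query: &ToyQuery, i: usize) -> f32 {
        self.score_point_vs_bytes(query, self.get_vector_data(i))
    }
}
```

With this split, a caller that already holds the encoded bytes (for example, bytes stored next to graph links) can call `score_point_vs_bytes` directly and skip the storage lookup.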

lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)

231-245: Good addition of byte-based scorer API.

The query_scorer_bytes method provides a clean, consistent API for creating scorers that operate on raw byte slices, properly mirroring the existing raw_scorer pattern.

lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (2)

89-134: Clean implementation of byte-based scorer builder.

The method properly handles all datatype and distance metric combinations, maintaining consistency with the existing builder pattern.


287-356: Well-structured helper for byte-based scorer construction.

The method properly handles all query vector types and correctly constructs the appropriate scorer instances for byte-based operations.

lib/segment/src/index/hnsw_index/point_scorer.rs (4)

15-15: LGTM!

The import is properly placed and follows the existing import organization pattern.


40-56: Well-structured generic refactoring.

The generic struct with type aliases maintains backward compatibility while enabling support for both RawScorer and QueryScorerBytes. Good use of the type system.
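A rough sketch of the "generic struct plus type aliases" shape is shown below. The name `GenericFilteredScorer`, the `deleted` flag vector, and the toy scorers are assumptions made for illustration; the actual struct in point_scorer.rs may be named and laid out differently.

```rust
pub type PointOffsetType = u32;

pub trait RawScorer {
    fn score_point(&self, point: PointOffsetType) -> f32;
}

pub trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> f32;
}

/// Hypothetical generic core shared by both scorer flavours.
pub struct GenericFilteredScorer<S> {
    scorer: S,
    deleted: Vec<bool>,
}

/// The aliases keep the old name usable while adding the byte-based variant.
pub type FilteredScorer = GenericFilteredScorer<Box<dyn RawScorer>>;
pub type FilteredQuantizedScorer = GenericFilteredScorer<Box<dyn QueryScorerBytes>>;

impl<S> GenericFilteredScorer<S> {
    pub fn new(scorer: S, deleted: Vec<bool>) -> Self {
        Self { scorer, deleted }
    }

    /// Common filtering logic lives once, in the generic impl.
    /// Unknown point offsets are treated as deleted.
    pub fn check_vector(&self, point: PointOffsetType) -> bool {
        !self.deleted.get(point as usize).copied().unwrap_or(true)
    }
}

impl FilteredScorer {
    pub fn score_point(&self, point: PointOffsetType) -> Option<f32> {
        self.check_vector(point).then(|| self.scorer.score_point(point))
    }
}

impl FilteredQuantizedScorer {
    pub fn score_bytes(&self, point: PointOffsetType, bytes: &[u8]) -> Option<f32> {
        self.check_vector(point).then(|| self.scorer.score_bytes(bytes))
    }
}

/// Toy scorers used only to exercise the sketch.
pub struct ByOffset;
impl RawScorer for ByOffset {
    fn score_point(&self, point: PointOffsetType) -> f32 {
        point as f32
    }
}

pub struct ByLen;
impl QueryScorerBytes for ByLen {
    fn score_bytes(&self, bytes: &[u8]) -> f32 {
        bytes.len() as f32
    }
}
```

Because each alias names a concrete instantiation, flavour-specific methods can be written in separate inherent impl blocks while `check_vector` is shared.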


58-68: LGTM!

Good refactoring to move the common check_vector method to the generic implementation, avoiding code duplication.


106-106: LGTM!

The implementation signature is correctly updated to work with the new generic structure.

lib/quantization/src/encoded_vectors_pq.rs (4)

16-18: LGTM!

Import changes are appropriate for the new trait implementation.


346-346: Good separation of concerns.

Refactoring the scoring methods to accept byte slices instead of indices properly separates data retrieval from scoring logic.

Also applies to: 381-381, 413-413


502-507: LGTM!

Clean refactoring that maintains the existing interface while delegating to the new byte-based scoring method.


567-590: Well-implemented trait with proper CPU feature detection.

The implementation correctly centralizes CPU feature detection and properly tracks hardware counter metrics. Good fallback pattern to simple implementation when SIMD features are unavailable.
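The "detect once, fall back to the simple implementation" pattern mentioned here can be sketched like this. The function names and the byte-sum workload are hypothetical and stand in for the real PQ scoring kernels; the feature-gated variant below contains no hand-written intrinsics and only shows the dispatch shape.

```rust
/// Portable fallback used when no SIMD feature is detected.
fn sum_simple(bytes: &[u8]) -> u32 {
    bytes.iter().map(|&b| u32::from(b)).sum()
}

/// Variant compiled with SSE4.1 enabled. A real kernel would use intrinsics;
/// here the body is plain Rust that the compiler is free to vectorize.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.1")]
unsafe fn sum_sse41(bytes: &[u8]) -> u32 {
    bytes.iter().map(|&b| u32::from(b)).sum()
}

/// Single entry point: CPU feature detection happens here and nowhere else.
pub fn sum_bytes(bytes: &[u8]) -> u32 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse4.1") {
            // SAFETY: the runtime check above guarantees SSE4.1 is available.
            return unsafe { sum_sse41(bytes) };
        }
    }
    sum_simple(bytes)
}
```

Both paths must return the same result for the same input, so the fallback doubles as a reference implementation in tests.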

lib/quantization/src/encoded_vectors_u8.rs (5)

11-11: LGTM!

Import addition is appropriate for the new trait implementation.


256-263: LGTM!

Clean refactoring that properly delegates to the new helper method.


378-384: LGTM!

Correct implementation that properly calculates data size and delegates to the trait method.


386-461: LGTM!

The refactored implementation maintains correct offset calculations and hardware counter tracking while using the new helper methods.


463-493: LGTM!

Both methods are correctly implemented with proper size calculations and offset handling for different distance types.

Comment on lines +247 to 254
+    fn parse_vec_data(data: &[u8]) -> (f32, *const u8) {
+        debug_assert!(data.len() >= std::mem::size_of::<f32>());
         unsafe {
-            let vector_data_size = self.quantized_vector_size();
-            let v_ptr = self
-                .encoded_vectors
-                .get_vector_data(i as usize, vector_data_size)
-                .as_ptr();
-            let vector_offset = *v_ptr.cast::<f32>();
-            (vector_offset, v_ptr.add(std::mem::size_of::<f32>()))
+            let offset = data.as_ptr().cast::<f32>().read_unaligned();
+            let v_ptr = data.as_ptr().add(std::mem::size_of::<f32>());
+            (offset, v_ptr)
         }
     }
Contributor

🛠️ Refactor suggestion

Add safety documentation for the unsafe operations.

The method uses unsafe pointer operations but lacks safety documentation. The debug_assert only validates in debug builds.

Add safety documentation:

     #[inline]
+    /// Parse vector data to extract offset and data pointer.
+    ///
+    /// # Safety
+    /// Caller must ensure that `data` has at least `size_of::<f32>()` bytes.
     fn parse_vec_data(data: &[u8]) -> (f32, *const u8) {
         debug_assert!(data.len() >= std::mem::size_of::<f32>());

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In lib/quantization/src/encoded_vectors_u8.rs around lines 247 to 254, the
parse_vec_data function uses unsafe pointer operations without safety
documentation. Add a safety comment above the function explaining the
assumptions and invariants required for the unsafe code to be valid, such as the
minimum length of the data slice, alignment considerations, and that the data
pointer must be valid for reads of the required size.
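For comparison, the invariant discussed in this comment (at least `size_of::<f32>()` bytes, no alignment requirement) can also be enforced at run time without `unsafe`. The sketch below is an alternative shape, not the code merged in the PR: `parse_vec_data_checked` is a hypothetical name, and it returns the remaining bytes as a slice instead of a raw pointer.

```rust
/// Size of the `f32` header that precedes the quantized payload.
const HEADER_SIZE: usize = std::mem::size_of::<f32>();

/// Splits `data` into the leading `f32` offset and the remaining payload.
/// Returns `None` when the slice is too short to contain the header.
pub fn parse_vec_data_checked(data: &[u8]) -> Option<(f32, &[u8])> {
    if data.len() < HEADER_SIZE {
        return None;
    }
    let (head, rest) = data.split_at(HEADER_SIZE);
    // Building the array by value makes the read alignment-independent,
    // which mirrors what `read_unaligned` achieves in the unsafe version.
    let offset = f32::from_ne_bytes([head[0], head[1], head[2], head[3]]);
    Some((offset, rest))
}
```

The cost is a length check in release builds, which is why performance-critical code may still prefer the `debug_assert!` plus documented safety contract.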

Comment on lines +89 to +103
    pub fn score_points(
        &self,
        points: &mut Vec<(PointOffsetType, &[u8])>,
        limit: usize,
    ) -> impl Iterator<Item = ScoredPointOffset> {
        points.retain(|(point_id, _)| self.check_vector(*point_id));
        if limit != 0 {
            points.truncate(limit);
        }

        points.iter().map(|&(idx, bytes)| ScoredPointOffset {
            idx,
            score: self.raw_scorer.score_bytes(bytes),
        })
    }
Contributor

🛠️ Refactor suggestion

Document the mutation of the input parameter.

The score_points method mutates the points parameter by calling retain. This side effect should be clearly documented in the method's documentation to prevent unexpected behavior for callers.

Add documentation above the method:

+    /// Score filtered points using byte-based scoring.
+    ///
+    /// # Arguments
+    /// * `points` - Vector of point IDs with their byte representations.
+    ///   **Warning**: This vector will be mutated (filtered) during execution.
+    /// * `limit` - Maximum number of points to score after filtering. 0 means no limit.
     pub fn score_points(
         &self,
         points: &mut Vec<(PointOffsetType, &[u8])>,
         limit: usize,
     ) -> impl Iterator<Item = ScoredPointOffset> {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    pub fn score_points(
-        &self,
-        points: &mut Vec<(PointOffsetType, &[u8])>,
-        limit: usize,
-    ) -> impl Iterator<Item = ScoredPointOffset> {
-        points.retain(|(point_id, _)| self.check_vector(*point_id));
-        if limit != 0 {
-            points.truncate(limit);
-        }
-        points.iter().map(|&(idx, bytes)| ScoredPointOffset {
-            idx,
-            score: self.raw_scorer.score_bytes(bytes),
-        })
-    }
+    /// Score filtered points using byte-based scoring.
+    ///
+    /// # Arguments
+    /// * `points` - Vector of point IDs with their byte representations.
+    ///   **Warning**: This vector will be mutated (filtered) during execution.
+    /// * `limit` - Maximum number of points to score after filtering. 0 means no limit.
+    pub fn score_points(
+        &self,
+        points: &mut Vec<(PointOffsetType, &[u8])>,
+        limit: usize,
+    ) -> impl Iterator<Item = ScoredPointOffset> {
+        points.retain(|(point_id, _)| self.check_vector(*point_id));
+        if limit != 0 {
+            points.truncate(limit);
+        }
+        points.iter().map(|&(idx, bytes)| ScoredPointOffset {
+            idx,
+            score: self.raw_scorer.score_bytes(bytes),
+        })
+    }
🤖 Prompt for AI Agents
In lib/segment/src/index/hnsw_index/point_scorer.rs around lines 89 to 103, the
score_points method mutates its input parameter points by retaining only certain
elements, which is a side effect that should be clearly documented. Add a doc
comment above the method explaining that the points vector is modified in place
by filtering out points that do not pass the check_vector condition, so callers
are aware of this mutation.
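The side effect flagged in this comment is easy to see in a standalone sketch. The free function below is a simplified, hypothetical variant of `score_points`: it takes the filter and the scorer as closures and returns a `Vec` instead of an iterator, but it keeps the same `retain` / `truncate` sequence on the caller's vector.

```rust
pub type PointOffsetType = u32;

#[derive(Debug, PartialEq)]
pub struct ScoredPointOffset {
    pub idx: PointOffsetType,
    pub score: f32,
}

/// Filters `points` in place, optionally truncates to `limit` (0 = no limit),
/// then scores whatever is left.
pub fn score_points(
    points: &mut Vec<(PointOffsetType, &[u8])>,
    limit: usize,
    check_vector: impl Fn(PointOffsetType) -> bool,
    score_bytes: impl Fn(&[u8]) -> f32,
) -> Vec<ScoredPointOffset> {
    // In-place filtering: rejected points are removed from the caller's vector.
    points.retain(|(point_id, _)| check_vector(*point_id));
    if limit != 0 {
        // Truncation also happens on the caller's vector.
        points.truncate(limit);
    }
    points
        .iter()
        .map(|&(idx, bytes)| ScoredPointOffset {
            idx,
            score: score_bytes(bytes),
        })
        .collect()
}
```

After the call, the caller's `points` contains only the entries that were scored, so any later code that expects the original candidate list has to keep its own copy.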

Comment on lines +40 to +42
pub trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
}
Member Author

We have a bunch of methods in the QueryScorer trait, but not all scorers implement all of the methods; some just panic! for unsupported methods. I don't like this approach, and for this new method I took a different one.

Instead of adding a new, potentially panicking method, I've added a new trait. Not all scorers implement this trait, but those that do don't panic.

Not completely happy with this approach either, but let's see how it goes.
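The trade-off described in this comment can be sketched with toy types. Everything below except the `QueryScorerBytes` trait signature is hypothetical (`QueryScorer` is reduced to one method, and both scorer structs are made up): the point is only that a separate trait turns "unsupported" into a compile-time error instead of a run-time `panic!`.

```rust
pub type ScoreType = f32;

/// Reduced stand-in for the existing trait.
pub trait QueryScorer {
    fn score_stored(&self, idx: u32) -> ScoreType;
}

/// The new, separate capability: only scorers that support it implement it.
pub trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
}

/// Toy scorer that supports both index-based and byte-based scoring.
pub struct QuantizedToyScorer {
    pub query: Vec<u8>,
    pub storage: Vec<Vec<u8>>,
}

impl QueryScorerBytes for QuantizedToyScorer {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType {
        // Dot product over raw bytes, chosen only to keep the demo small.
        self.query
            .iter()
            .zip(bytes)
            .map(|(&q, &b)| f32::from(q) * f32::from(b))
            .sum()
    }
}

impl QueryScorer for QuantizedToyScorer {
    fn score_stored(&self, idx: u32) -> ScoreType {
        self.score_bytes(&self.storage[idx as usize])
    }
}

/// Toy scorer without byte-based support: it simply does not implement the trait.
pub struct MultiVectorToyScorer;

impl QueryScorer for MultiVectorToyScorer {
    fn score_stored(&self, _idx: u32) -> ScoreType {
        0.0
    }
}

/// Callers state the capability they need in the bound.
pub fn score_neighbors<S: QueryScorerBytes>(scorer: &S, neighbors: &[&[u8]]) -> Vec<ScoreType> {
    neighbors.iter().map(|bytes| scorer.score_bytes(bytes)).collect()
}
```

Passing `&MultiVectorToyScorer` to `score_neighbors` is rejected by the compiler because the trait bound is not satisfied, whereas a panicking default method would only fail when that code path is actually executed.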

Contributor

The basic idea here is to use the custom provided data instead of VectorStorage. I propose to make it generic and implement this trait also for regular dense vectors.

Contributor

@IvanPleshkov IvanPleshkov Jul 28, 2025

Not necessary, just an idea if it's easy to implement, purely for the art of the Qdrant design =)
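As a rough illustration of the idea floated in this thread, a byte-based scorer for regular dense vectors might look like the sketch below. This is not part of the PR: `DenseDotScorer` is hypothetical, the stored bytes are assumed to be native-endian `f32` values laid out back to back, and dot product is assumed as the metric.

```rust
pub type ScoreType = f32;

pub trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
}

/// Hypothetical dense scorer: the query is kept as plain `f32` values.
pub struct DenseDotScorer {
    pub query: Vec<f32>,
}

impl QueryScorerBytes for DenseDotScorer {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType {
        // The caller is expected to pass exactly one `f32` per query dimension.
        debug_assert_eq!(bytes.len(), self.query.len() * std::mem::size_of::<f32>());
        self.query
            .iter()
            .zip(bytes.chunks_exact(std::mem::size_of::<f32>()))
            // Decode each 4-byte chunk by value, so no alignment is required.
            .map(|(q, c)| q * f32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
            .sum()
    }
}
```

A production version would more likely reinterpret the bytes through a checked cast (the learnings above mention zerocopy for this kind of work) than decode chunk by chunk.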

            QuantizedVectorStorage::BinaryMmap(storage) => {
                self.new_quantized_scorer2::<TElement, TMetric>(storage)
            }
            QuantizedVectorStorage::ScalarRamMulti(_)
Contributor

Why do we skip multivector support? Is it temporary? For multivectors, even if we need to make additional disk reads, the upcoming feature may be more efficient than searching in mmap vector storage using regular HNSW.

})
}

pub fn score_points(
Member

Is this function used? I can't find any references

}
}

fn new_quantized_scorer2<TElement, TMetric>(
Member

What's the difference from new_quantized_scorer? (maybe it requires a comment and better naming)

Member

@generall generall left a comment

LGTM, please rename the function for the new quantized scorer.

@xzfc xzfc force-pushed the hnsw-links-FilteredQuantizedScorer branch from 37a207a to ff3a93e Compare July 31, 2025 00:18
@xzfc xzfc merged commit 83005cd into dev Jul 31, 2025
16 checks passed
@xzfc xzfc deleted the hnsw-links-FilteredQuantizedScorer branch July 31, 2025 03:05
timvisee pushed a commit that referenced this pull request Aug 11, 2025
* Add EncodedVectorsBytes::score_point_vs_bytes

* Add QueryScorerBytes::score_bytes

* QuantizedVectors::query_scorer_bytes

* Split FilteredScorer into FilteredScorer and FilteredQuantizedScorer
@coderabbitai coderabbitai bot mentioned this pull request Oct 8, 2025