📝 Walkthrough
This change introduces a new trait,
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 2
🧹 Nitpick comments (4)
lib/quantization/src/encoded_vectors.rs (1)
55-62: Consider adding documentation for the new trait and method.
The EncodedVectorsBytes trait and its score_point_vs_bytes method lack documentation. Adding doc comments would help clarify the purpose and usage of this byte-based scoring interface.
Consider adding documentation like this:
+/// Extension trait for `EncodedVectors` that enables scoring directly against raw byte slices.
+/// This is useful for scoring quantized vectors stored as bytes without requiring indexed access.
 pub trait EncodedVectorsBytes: EncodedVectors {
+    /// Score a query against an encoded vector represented as a raw byte slice.
+    ///
+    /// # Parameters
+    /// - `query`: The encoded query to score against
+    /// - `bytes`: Raw byte slice representing an encoded vector
+    /// - `hw_counter`: Hardware counter for performance tracking
+    ///
+    /// # Returns
+    /// Similarity score between the query and the encoded vector
     fn score_point_vs_bytes(
lib/segment/src/vector_storage/query_scorer/mod.rs (1)
40-42: Consider adding documentation for the new trait.
The QueryScorerBytes trait lacks documentation. Adding doc comments would help clarify its purpose and relationship to the existing QueryScorer trait.
Consider adding documentation like this:
+/// Trait for scoring queries directly against raw byte slices representing encoded vectors.
+/// This complements the existing `QueryScorer` trait by enabling scoring on byte-encoded data
+/// without requiring indexed vector storage access.
 pub trait QueryScorerBytes {
+    /// Score a query against an encoded vector represented as a raw byte slice.
+    ///
+    /// # Parameters
+    /// - `bytes`: Raw byte slice representing an encoded vector
+    ///
+    /// # Returns
+    /// Similarity score between the query and the encoded vector
     fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)
181-214: Appropriate handling of multi-vector cases.
The method correctly returns an error for multi-vector storage types, which aligns with the PR objectives stating that multi-vector support is not included. Consider adding a TODO comment to track future multi-vector support.
Consider adding a comment to document the limitation:
             QuantizedVectorStorage::ScalarRamMulti(_)
             | QuantizedVectorStorage::ScalarMmapMulti(_)
             | QuantizedVectorStorage::PQRamMulti(_)
             | QuantizedVectorStorage::PQMmapMulti(_)
             | QuantizedVectorStorage::BinaryRamMulti(_)
-            | QuantizedVectorStorage::BinaryMmapMulti(_) => Err(OperationError::WrongMulti),
+            | QuantizedVectorStorage::BinaryMmapMulti(_) => {
+                // TODO: Add multi-vector support for QueryScorerBytes
+                Err(OperationError::WrongMulti)
+            }
lib/quantization/src/encoded_vectors_u8.rs (1)
495-568: Consider documenting the data size precondition.
The implementation is well-structured with proper CPU feature detection and hardware counter tracking. However, the debug_assert at line 506 only validates in debug builds.
Add a comment about the expected data format:
     ) -> f32 {
         hw_counter
             .cpu_counter()
             .incr_delta(self.metadata.vector_parameters.dim);
+        // Expects bytes to contain: [f32 offset][u8 vector data...]
         debug_assert!(bytes.len() >= std::mem::size_of::<f32>() + self.metadata.actual_dim);
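The layout named in the suggested comment, an f32 offset followed by the u8 vector data, could also be validated in release builds if that were ever wanted. A minimal sketch, assuming a little-endian offset and a hypothetical helper name `split_offset_and_data` (neither is confirmed by the PR):

```rust
/// Splits an encoded vector into its leading `f32` offset and `dim` quantized bytes.
/// Hypothetical helper; the little-endian encoding of the offset is an assumption.
fn split_offset_and_data(bytes: &[u8], dim: usize) -> Option<(f32, &[u8])> {
    const HEADER: usize = std::mem::size_of::<f32>();
    // Unlike `debug_assert!`, this check also runs in release builds.
    if bytes.len() < HEADER + dim {
        return None;
    }
    let (header, rest) = bytes.split_at(HEADER);
    let mut buf = [0u8; HEADER];
    buf.copy_from_slice(header);
    Some((f32::from_le_bytes(buf), &rest[..dim]))
}
```

Returning `Option` trades a branch per call for a defined failure mode; on a hot scoring path the PR's `debug_assert!` may well be the better choice.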
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
lib/quantization/src/encoded_vectors.rs (1 hunk)
lib/quantization/src/encoded_vectors_binary.rs (3 hunks)
lib/quantization/src/encoded_vectors_pq.rs (6 hunks)
lib/quantization/src/encoded_vectors_u8.rs (10 hunks)
lib/quantization/src/lib.rs (1 hunk)
lib/segment/src/index/hnsw_index/point_scorer.rs (3 hunks)
lib/segment/src/vector_storage/mod.rs (1 hunk)
lib/segment/src/vector_storage/quantized/quantized_custom_query_scorer.rs (2 hunks)
lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (2 hunks)
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (6 hunks)
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (3 hunks)
lib/segment/src/vector_storage/query_scorer/mod.rs (1 hunk)
🧰 Additional context used
🧠 Learnings (13)
📓 Common learnings
Learnt from: coszio
PR: qdrant/qdrant#6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In the AsyncRawScorerImpl implementation, the unwrap() call on read_vectors_async results is intentional, with an explanatory comment noting that this experimental feature is meant to crash rather than silently fall back to synchronous implementation.
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In AsyncRawScorerImpl, the unwrap() call after read_vectors_async is intentional. The io_uring feature is experimental, and the code is designed to panic rather than silently fall back to a synchronous implementation if it fails, directing users to use the default IO implementation instead.
lib/segment/src/vector_storage/mod.rs (2)
Learnt from: timvisee
PR: #6503
File: lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs:60-74
Timestamp: 2025-05-12T12:54:27.872Z
Learning: In the Qdrant codebase, using pub(super) visibility is preferred when fields need to be accessed by sibling modules (particularly for index type conversions), as it provides the necessary access without bloating the interface with numerous getters.
lib/quantization/src/encoded_vectors.rs (2)
Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/src/encoded_vectors_binary.rs:810-810
Timestamp: 2025-07-02T17:10:13.847Z
Learning: In the Qdrant quantization codebase, hardware counters (hw_counter.cpu_counter()) are used to measure vector data access from storage, not computational cost. With asymmetric binary quantization where query vectors can be longer than storage vectors, the counter should still track vector_data.len() to maintain consistent measurement of storage access patterns, not query processing overhead.
lib/segment/src/vector_storage/query_scorer/mod.rs (2)
Learnt from: generall
PR: #6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using id.to_string().parse().unwrap() is acceptable for testing purposes and doesn't need to match production code's id_tracker.internal_id() approach. Test code can use mock implementations that serve the testing goals.
lib/quantization/src/encoded_vectors_binary.rs (7)
Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:122-127
Timestamp: 2025-07-02T16:42:22.247Z
Learning: In lib/quantization/cpp/sse.c asymmetric binary quantization functions, the argument order in _mm_set_epi32 calls is intentionally designed to match query pointer offsets with corresponding bit shift factors. The "numbers" refer to logical indices that correspond to weighting factors (1, 2, 4, 8, etc. representing bit shifts), not SIMD register positions.
Learnt from: coszio
PR: #6565
File: lib/posting_list/src/builder.rs:63-67
Timestamp: 2025-05-26T14:47:23.505Z
Learning: In the posting_list crate's PostingChunk struct, avoid adding extra fields like storing computed chunk_bits to prevent struct bloat and maintain mmap compatibility. The bit calculations from offsets are inexpensive compared to the memory and compatibility benefits of keeping the struct minimal.
Learnt from: IvanPleshkov
PR: #6728
File: lib/quantization/cpp/sse.c:428-431
Timestamp: 2025-07-02T17:08:10.839Z
Learning: In lib/quantization/cpp/ SIMD functions, IvanPleshkov prefers to avoid memcpy in favor of direct pointer casts for type punning, prioritizing potential compiler optimization over strict aliasing rule compliance in performance-critical quantization code.
Learnt from: coszio
PR: #6528
File: lib/posting_list/src/view.rs:118-118
Timestamp: 2025-05-19T14:40:20.068Z
Learning: In the bitpacking crate, the BitPacker::decompress_strictly_sorted function takes an Option as its first parameter, which means using checked_sub(1) without unwrapping is intentional and correct.
🧬 Code Graph Analysis (2)
lib/quantization/src/encoded_vectors.rs (1)
lib/segment/src/vector_storage/query_scorer/metric_query_scorer.rs (3)
score (93-96)
score_internal (98-103)
score_stored (69-73)
lib/segment/src/vector_storage/query_scorer/mod.rs (5)
lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (1)
score_bytes (105-108)
lib/segment/src/vector_storage/quantized/quantized_custom_query_scorer.rs (1)
score_bytes (121-126)
lib/segment/src/vector_storage/query/mod.rs (1)
Query (29-33)
lib/segment/src/vector_storage/query_scorer/custom_query_scorer.rs (1)
function (49-55)
lib/segment/src/vector_storage/raw_scorer.rs (1)
RawScorer (31-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: lint
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: storage-compat-test
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: e2e-tests
- GitHub Check: test-consistency
- GitHub Check: integration-tests-consensus
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests
🔇 Additional comments (23)
lib/segment/src/vector_storage/mod.rs (1)
22-22: LGTM! Module visibility change enables access to new trait.
Making the query_scorer module public is necessary to expose the new QueryScorerBytes trait to external consumers. This change appropriately supports the byte-based scoring functionality being introduced.
lib/quantization/src/lib.rs (1)
14-14: LGTM! Proper public re-export of the new trait.
Adding EncodedVectorsBytes to the public re-exports follows the established pattern and correctly exposes the new trait at the crate root level for external consumers.
lib/segment/src/vector_storage/quantized/quantized_query_scorer.rs (2)
10-10: LGTM! Proper import of the new trait.
The import of QueryScorerBytes is correctly added to support the new trait implementation.
101-109: LGTM! Clean implementation of byte-based scoring.
The QueryScorerBytes implementation correctly:
- Uses appropriate trait bounds requiring EncodedVectorsBytes
- Delegates to the underlying score_point_vs_bytes method
- Properly passes through the hardware counter for performance tracking
- Follows the established delegation pattern used in other scorer methods
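The delegation pattern listed above can be sketched with simplified stand-ins. The trait names mirror the PR, but the signatures, `HwCounter`, `QuantizedScorer`, and `DotU8` below are illustrative assumptions rather than the actual Qdrant types:

```rust
use std::cell::Cell;

type ScoreType = f32;

/// Stand-in for the hardware counter; it only counts bytes read.
#[derive(Default)]
struct HwCounter {
    bytes_read: Cell<usize>,
}

/// Simplified stand-in for `EncodedVectorsBytes`.
trait EncodedVectorsBytes {
    type Query;
    fn score_point_vs_bytes(&self, query: &Self::Query, bytes: &[u8], hw: &HwCounter) -> ScoreType;
}

/// Simplified stand-in for `QueryScorerBytes`.
trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
}

/// Scorer that owns the query and borrows the quantized storage.
struct QuantizedScorer<'a, E: EncodedVectorsBytes> {
    query: E::Query,
    storage: &'a E,
    hw: HwCounter,
}

impl<'a, E: EncodedVectorsBytes> QueryScorerBytes for QuantizedScorer<'a, E> {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType {
        // Pure delegation: the scorer forwards the query, bytes, and counter.
        self.storage.score_point_vs_bytes(&self.query, bytes, &self.hw)
    }
}

/// Toy encoding: one u8 per dimension, scored by dot product.
struct DotU8;

impl EncodedVectorsBytes for DotU8 {
    type Query = Vec<u8>;
    fn score_point_vs_bytes(&self, query: &Self::Query, bytes: &[u8], hw: &HwCounter) -> ScoreType {
        // Record how many stored bytes were touched.
        hw.bytes_read.set(hw.bytes_read.get() + bytes.len());
        query.iter().zip(bytes).map(|(&q, &b)| q as f32 * b as f32).sum()
    }
}
```

Because the scorer only needs the `EncodedVectorsBytes` bound, the same wrapper would work for any encoding that can score against a byte slice.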
lib/segment/src/vector_storage/quantized/quantized_custom_query_scorer.rs (1)
113-127: LGTM! Clean implementation of byte-based scoring.
The QueryScorerBytes implementation correctly extends the custom query scorer to support scoring against raw byte slices, maintaining consistency with the existing pattern.
lib/quantization/src/encoded_vectors_binary.rs (2)
834-845: Clean refactoring of scoring logic.
The extraction of scoring logic to score_point_vs_bytes properly separates data retrieval from computation, enabling reuse while maintaining the original behavior.
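To illustrate that separation, here is a toy sketch in which the indexed entry point only fetches bytes and hands them to a byte-based scorer. `BinaryStorage` and its matching-bits score are assumptions made for the example, not the PR's binary quantization code:

```rust
/// Toy storage: fixed-size binary-quantized vectors packed back to back.
struct BinaryStorage {
    data: Vec<u8>,
    /// Bytes per encoded vector.
    vector_size: usize,
}

impl BinaryStorage {
    /// Data retrieval only: locate the bytes of vector `i`.
    fn get_vector_data(&self, i: usize) -> &[u8] {
        &self.data[i * self.vector_size..(i + 1) * self.vector_size]
    }

    /// Computation only: similarity is the number of matching bits.
    fn score_point_vs_bytes(&self, query: &[u8], bytes: &[u8]) -> f32 {
        let differing: u32 = query
            .iter()
            .zip(bytes)
            .map(|(q, b)| (q ^ b).count_ones())
            .sum();
        (self.vector_size * 8) as f32 - differing as f32
    }

    /// The indexed entry point keeps its signature and delegates.
    fn score_point(&self, query: &[u8], i: usize) -> f32 {
        self.score_point_vs_bytes(query, self.get_vector_data(i))
    }
}
```

With this split, a caller that already holds the encoded bytes (for example, bytes stored inline elsewhere) can call `score_point_vs_bytes` directly and skip the index lookup.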
884-913: Well-structured trait implementation for byte-based scoring.
The EncodedVectorsBytes implementation correctly preserves all scoring logic while operating on raw bytes. The hardware counter tracking and query type dispatching are properly maintained.
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)
231-245: Good addition of byte-based scorer API.
The query_scorer_bytes method provides a clean, consistent API for creating scorers that operate on raw byte slices, properly mirroring the existing raw_scorer pattern.
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (2)
89-134: Clean implementation of byte-based scorer builder.
The method properly handles all datatype and distance metric combinations, maintaining consistency with the existing builder pattern.
287-356: Well-structured helper for byte-based scorer construction.
The method properly handles all query vector types and correctly constructs the appropriate scorer instances for byte-based operations.
lib/segment/src/index/hnsw_index/point_scorer.rs (4)
15-15: LGTM!
The import is properly placed and follows the existing import organization pattern.
40-56: Well-structured generic refactoring.
The generic struct with type aliases maintains backward compatibility while enabling support for both RawScorer and QueryScorerBytes. Good use of the type system.
58-68: LGTM!
Good refactoring to move the common check_vector method to the generic implementation, avoiding code duplication.
106-106: LGTM!
The implementation signature is correctly updated to work with the new generic structure.
lib/quantization/src/encoded_vectors_pq.rs (4)
16-18: LGTM!
Import changes are appropriate for the new trait implementation.
346-346: Good separation of concerns.
Refactoring the scoring methods to accept byte slices instead of indices properly separates data retrieval from scoring logic.
Also applies to: 381-381, 413-413
502-507: LGTM!
Clean refactoring that maintains the existing interface while delegating to the new byte-based scoring method.
567-590: Well-implemented trait with proper CPU feature detection.
The implementation correctly centralizes CPU feature detection and properly tracks hardware counter metrics. Good fallback pattern to simple implementation when SIMD features are unavailable.
lib/quantization/src/encoded_vectors_u8.rs (5)
11-11: LGTM!
Import addition is appropriate for the new trait implementation.
256-263: LGTM!
Clean refactoring that properly delegates to the new helper method.
378-384: LGTM!
Correct implementation that properly calculates data size and delegates to the trait method.
386-461: LGTM!
The refactored implementation maintains correct offset calculations and hardware counter tracking while using the new helper methods.
463-493: LGTM!
Both methods are correctly implemented with proper size calculations and offset handling for different distance types.
+fn parse_vec_data(data: &[u8]) -> (f32, *const u8) {
+    debug_assert!(data.len() >= std::mem::size_of::<f32>());
     unsafe {
-        let vector_data_size = self.quantized_vector_size();
-        let v_ptr = self
-            .encoded_vectors
-            .get_vector_data(i as usize, vector_data_size)
-            .as_ptr();
-        let vector_offset = *v_ptr.cast::<f32>();
-        (vector_offset, v_ptr.add(std::mem::size_of::<f32>()))
+        let offset = data.as_ptr().cast::<f32>().read_unaligned();
+        let v_ptr = data.as_ptr().add(std::mem::size_of::<f32>());
+        (offset, v_ptr)
     }
 }
🛠️ Refactor suggestion
Add safety documentation for the unsafe operations.
The method uses unsafe pointer operations but lacks safety documentation. The debug_assert only validates in debug builds.
Add safety documentation:
#[inline]
+ /// Parse vector data to extract offset and data pointer.
+ ///
+ /// # Safety
+ /// Caller must ensure that `data` has at least `size_of::<f32>()` bytes.
fn parse_vec_data(data: &[u8]) -> (f32, *const u8) {
debug_assert!(data.len() >= std::mem::size_of::<f32>());
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In lib/quantization/src/encoded_vectors_u8.rs around lines 247 to 254, the
parse_vec_data function uses unsafe pointer operations without safety
documentation. Add a safety comment above the function explaining the
assumptions and invariants required for the unsafe code to be valid, such as the
minimum length of the data slice, alignment considerations, and that the data
pointer must be valid for reads of the required size.
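A minimal sketch of one alternative, assuming it is acceptable to return a slice instead of a raw pointer: checking the length at runtime and copying the header bytes keeps the check active in release builds and avoids any alignment requirement. The name `parse_vec_data_checked` is hypothetical and is not part of this PR.

```rust
use std::mem::size_of;

/// Hypothetical safe variant: splits encoded vector data into its leading
/// `f32` offset and the remaining quantized bytes.
/// Returns `None` when the slice is shorter than one `f32`.
fn parse_vec_data_checked(data: &[u8]) -> Option<(f32, &[u8])> {
    if data.len() < size_of::<f32>() {
        return None;
    }
    let (head, tail) = data.split_at(size_of::<f32>());
    // Copy the header into a fixed-size array; native byte order matches
    // what an unaligned pointer read of the same bytes would produce.
    let mut buf = [0u8; 4];
    buf.copy_from_slice(head);
    Some((f32::from_ne_bytes(buf), tail))
}

fn main() {
    let mut data = 1.5f32.to_ne_bytes().to_vec();
    data.extend_from_slice(&[7, 8, 9]);
    let parsed = parse_vec_data_checked(&data);
    println!("{:?}", parsed);
    // A slice that is too short is rejected instead of being read out of bounds.
    println!("{:?}", parse_vec_data_checked(&[0u8; 3]));
}
```

Whether the extra branch is affordable on the scoring hot path would need benchmarking; the unsafe version with a documented contract may still be the better fit there.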
pub fn score_points(
    &self,
    points: &mut Vec<(PointOffsetType, &[u8])>,
    limit: usize,
) -> impl Iterator<Item = ScoredPointOffset> {
    points.retain(|(point_id, _)| self.check_vector(*point_id));
    if limit != 0 {
        points.truncate(limit);
    }

    points.iter().map(|&(idx, bytes)| ScoredPointOffset {
        idx,
        score: self.raw_scorer.score_bytes(bytes),
    })
}
🛠️ Refactor suggestion
Document the mutation of the input parameter.
The score_points method mutates the points parameter by calling retain. This side effect should be clearly documented in the method's documentation to prevent unexpected behavior for callers.
Add documentation above the method:
+ /// Score filtered points using byte-based scoring.
+ ///
+ /// # Arguments
+ /// * `points` - Vector of point IDs with their byte representations.
+ /// **Warning**: This vector will be mutated (filtered) during execution.
+ /// * `limit` - Maximum number of points to score after filtering. 0 means no limit.
pub fn score_points(
&self,
points: &mut Vec<(PointOffsetType, &[u8])>,
limit: usize,
) -> impl Iterator<Item = ScoredPointOffset> {
🤖 Prompt for AI Agents
In lib/segment/src/index/hnsw_index/point_scorer.rs around lines 89 to 103, the
score_points method mutates its input parameter points by retaining only certain
elements, which is a side effect that should be clearly documented. Add a doc
comment above the method explaining that the points vector is modified in place
by filtering out points that do not pass the check_vector condition, so callers
are aware of this mutation.
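To illustrate the side effect being discussed, here is a simplified, self-contained sketch. It is not the PR's code: `check_vector` and `score_bytes` are hypothetical free-function stand-ins, and the result is collected into a `Vec` rather than returned as an iterator to keep lifetimes out of the way.

```rust
type PointOffsetType = u32;
type ScoreType = f32;

#[derive(Debug, PartialEq)]
struct ScoredPointOffset {
    idx: PointOffsetType,
    score: ScoreType,
}

// Hypothetical stand-in for `check_vector`: only even ids pass the filter.
fn check_vector(point_id: PointOffsetType) -> bool {
    point_id % 2 == 0
}

// Hypothetical stand-in for `score_bytes`: sums the bytes.
fn score_bytes(bytes: &[u8]) -> ScoreType {
    bytes.iter().map(|&b| ScoreType::from(b)).sum()
}

fn score_points(
    points: &mut Vec<(PointOffsetType, &[u8])>,
    limit: usize,
) -> Vec<ScoredPointOffset> {
    // Both calls below modify the caller's vector in place.
    points.retain(|(point_id, _)| check_vector(*point_id));
    if limit != 0 {
        points.truncate(limit);
    }
    points
        .iter()
        .map(|&(idx, bytes)| ScoredPointOffset { idx, score: score_bytes(bytes) })
        .collect()
}

fn main() {
    let (a, b, c) = ([1u8, 2], [3u8], [4u8, 4]);
    let mut points: Vec<(PointOffsetType, &[u8])> =
        vec![(0, &a[..]), (1, &b[..]), (2, &c[..])];
    let scored = score_points(&mut points, 0);
    // The caller's vector no longer contains the filtered-out point 1.
    println!("{} points left, scores: {:?}", points.len(), scored);
}
```

A caller that needs the original list afterwards would have to clone it before the call, which is the kind of surprise the doc comment is meant to prevent.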
pub trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
}
We have a bunch of methods in the QueryScorer trait, but not all scorers implement all of them; some just panic! on unsupported methods. I don't like this approach, so for this new method I took a different one.
Instead of adding a new, potentially panicking method, I've added a new trait. Not all scorers implement this trait, but those that do don't panic.
I'm not completely happy with this approach either, but let's see how it goes.
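A rough sketch of the trade-off, using toy types rather than the real scorers: with a separate trait, code that needs byte scoring states it as a bound, so passing an unsupported scorer is a compile error instead of a runtime panic. `DotScorer`, `StoredOnlyScorer`, `score_stored` and `score_all` are hypothetical names for illustration only.

```rust
type ScoreType = f32;
type PointOffsetType = u32;

// Simplified stand-in for the existing trait.
trait QueryScorer {
    fn score_stored(&self, idx: PointOffsetType) -> ScoreType;
}

// The new capability lives in its own trait.
trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
}

// Hypothetical scorer that supports both stored and byte-based scoring.
struct DotScorer {
    query: Vec<u8>,
    storage: Vec<Vec<u8>>,
}

impl DotScorer {
    fn dot(&self, other: &[u8]) -> ScoreType {
        self.query
            .iter()
            .zip(other)
            .map(|(&q, &v)| ScoreType::from(q) * ScoreType::from(v))
            .sum()
    }
}

impl QueryScorer for DotScorer {
    fn score_stored(&self, idx: PointOffsetType) -> ScoreType {
        self.dot(&self.storage[idx as usize])
    }
}

impl QueryScorerBytes for DotScorer {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType {
        self.dot(bytes)
    }
}

// Hypothetical scorer without byte support: it simply does not implement
// `QueryScorerBytes`, so there is nothing that could panic.
struct StoredOnlyScorer;

impl QueryScorer for StoredOnlyScorer {
    fn score_stored(&self, _idx: PointOffsetType) -> ScoreType {
        0.0
    }
}

// Callers that need byte scoring ask for it in the bound.
fn score_all<S: QueryScorerBytes>(scorer: &S, vectors: &[&[u8]]) -> Vec<ScoreType> {
    vectors.iter().map(|v| scorer.score_bytes(v)).collect()
}

fn main() {
    let scorer = DotScorer { query: vec![1, 2, 3], storage: vec![vec![1, 1, 1]] };
    println!("{:?}", score_all(&scorer, &[&[4u8, 5, 6][..]]));
    println!("{}", StoredOnlyScorer.score_stored(0));
    // score_all(&StoredOnlyScorer, &[]); // would not compile: trait bound not satisfied
}
```

The cost is an extra trait to thread through generic code, which is presumably part of why the approach still feels unfinished.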
The basic idea here is to use custom, caller-provided data instead of VectorStorage. I propose making it generic and implementing this trait for regular dense vectors as well.
Not necessary, just an idea if it's easy to implement, for the art of the qdrant design =)
QuantizedVectorStorage::BinaryMmap(storage) => {
    self.new_quantized_scorer2::<TElement, TMetric>(storage)
}
QuantizedVectorStorage::ScalarRamMulti(_)
Why do we skip multivector support? Is it temporary? For multivectors, even if we need to make additional disk reads, the upcoming feature may be more efficient than searching mmap vector storage with regular HNSW.
    })
}

pub fn score_points(
Is this function used? I can't find any references
    }
}

fn new_quantized_scorer2<TElement, TMetric>(
What's the difference from new_quantized_scorer? (Maybe it needs a comment and better naming.)
generall
left a comment
LGTM, please rename the function for the new quantized scorer.
37a207a to ff3a93e
* Add EncodedVectorsBytes::score_point_vs_bytes
* Add QueryScorerBytes::score_bytes
* QuantizedVectors::query_scorer_bytes
* Split FilteredScorer into FilteredScorer and FilteredQuantizedScorer
This PR implements scorers required for the vectors-in-graph feature.
It introduces a new trait QueryScorerBytes that allows scoring an encoded query against a vector encoded as &[u8].
This new trait is integrated enough to provide two variations of FilteredScorer:
- FilteredScorer, which is used to score against a PointOffsetType (retrieved from the vector storage). No changes in logic here.
- FilteredQuantizedScorer, which scores against a vector encoded as &[u8]. Later these vectors would be stored alongside graph links.
Supported quantizations: binary, scalar, and product.
Only single vectors are supported, not multi-vectors.
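As a loose illustration of how the two variations could share their filtering logic, here is a sketch in the spirit of the "generic struct with type aliases" mentioned in the review. It is an assumption about the shape, not the merged code: `FilteredScorerImpl`, the `deleted` flag list, and the toy scorers are all hypothetical.

```rust
type PointOffsetType = u32;
type ScoreType = f32;

trait RawScorer {
    fn score_point(&self, idx: PointOffsetType) -> ScoreType;
}

trait QueryScorerBytes {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType;
}

// Hypothetical generic wrapper: one struct, parameterized by the scorer kind.
struct FilteredScorerImpl<S> {
    scorer: S,
    deleted: Vec<bool>,
}

type FilteredScorer = FilteredScorerImpl<Box<dyn RawScorer>>;
type FilteredQuantizedScorer = FilteredScorerImpl<Box<dyn QueryScorerBytes>>;

impl<S> FilteredScorerImpl<S> {
    // Shared by both variations: unknown or deleted points are rejected.
    fn check_vector(&self, idx: PointOffsetType) -> bool {
        self.deleted.get(idx as usize) == Some(&false)
    }
}

impl FilteredScorer {
    // Scores a point that is looked up by its offset.
    fn score_point(&self, idx: PointOffsetType) -> Option<ScoreType> {
        if self.check_vector(idx) { Some(self.scorer.score_point(idx)) } else { None }
    }
}

impl FilteredQuantizedScorer {
    // Scores bytes supplied by the caller, e.g. read next to graph links.
    fn score_bytes(&self, idx: PointOffsetType, bytes: &[u8]) -> Option<ScoreType> {
        if self.check_vector(idx) { Some(self.scorer.score_bytes(bytes)) } else { None }
    }
}

// Toy scorers so the sketch runs on its own.
struct IdAsScore;
impl RawScorer for IdAsScore {
    fn score_point(&self, idx: PointOffsetType) -> ScoreType {
        idx as ScoreType
    }
}

struct ByteSum;
impl QueryScorerBytes for ByteSum {
    fn score_bytes(&self, bytes: &[u8]) -> ScoreType {
        bytes.iter().map(|&b| ScoreType::from(b)).sum()
    }
}

fn main() {
    let by_id = FilteredScorer { scorer: Box::new(IdAsScore), deleted: vec![false, true] };
    let by_bytes = FilteredQuantizedScorer { scorer: Box::new(ByteSum), deleted: vec![false, true] };
    println!("{:?} {:?}", by_id.score_point(0), by_id.score_point(1));
    println!("{:?}", by_bytes.score_bytes(0, &[1, 2, 3]));
}
```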