Chunked mmap as a quantization storage#7116
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)
176-180: Include BinaryChunkedMmap in default_rescoring.
default_rescoring()returns true for Binary RAM/MMAP but omits the new BinaryChunkedMmap single-vector variant. Unless intentionally excluded, this likely should be included to preserve behavior parity.- matches!( - self.storage_impl, - QuantizedVectorStorage::BinaryRam(_) | QuantizedVectorStorage::BinaryMmap(_) - ) + matches!( + self.storage_impl, + QuantizedVectorStorage::BinaryRam(_) + | QuantizedVectorStorage::BinaryMmap(_) + | QuantizedVectorStorage::BinaryChunkedMmap(_) + )
🧹 Nitpick comments (3)
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (1)
197-205: Flusher error mapping is OK, minor style simplification possible.Logic mirrors other storages and properly maps domain errors into
io::Error. Optional: use the same minimal form used elsewhere for consistency.- Box::new(move || { - flusher().map_err(|e| { - std::io::Error::other(format!("Failed to flush multivector offsets storage: {e}")) - })?; - Ok(()) - }) + Box::new(move || flusher().map_err(|e| { + std::io::Error::other(format!("Failed to flush multivector offsets storage: {e}")) + }))lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs (2)
14-22: Upsert error conversion is fine; optional size sanity check.Mapping domain errors to
io::Error::othermatches trait expectations. If you want extra safety, you could assert stored vector length consistency when replacing an existing entry (whenid < self.len()), but givenEncodedStoragelacks size metadata and encoders enforce sizes upstream, this is optional.
32-39: Flusher closure can be simplified (optional).Same outcome with a slightly cleaner closure.
- Box::new(move || { - Ok(flusher().map_err(|e| { - std::io::Error::other(format!("Failed to flush quantization storage: {e}")) - })?) - }) + Box::new(move || flusher().map_err(|e| { + std::io::Error::other(format!("Failed to flush quantization storage: {e}")) + }))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
lib/segment/src/vector_storage/quantized/mod.rs(1 hunks)lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs(1 hunks)lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs(2 hunks)lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs(1 hunks)lib/segment/src/vector_storage/quantized/quantized_vectors.rs(12 hunks)
🧰 Additional context used
🧠 Learnings (12)
📓 Common learnings
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6986
File: lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs:46-55
Timestamp: 2025-08-11T00:37:34.100Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs`, the `ChunkedVectors` used in `QuantizedRamStorage` is intentionally designed to be non-persistent during updates. The `push_vector` method only updates the in-memory vectors without flushing to disk, and this is expected behavior rather than a bug.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:857-857
Timestamp: 2025-08-15T11:41:10.629Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the multivector offset storage has two different patterns: for RAM storage, offsets are collected into Vec<MultivectorOffset> and used directly; for MMAP storage, offsets are consumed to create a file via create_offsets_file_from_iter, then the file is loaded back as MultivectorOffsetsStorageMmap. The direct consumption of offsets iterator in the MMAP case is intentional.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6986
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:81-84
Timestamp: 2025-08-11T07:57:01.399Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the BitsStoreType parameter difference between single-vector and multi-vector Binary quantization is intentional: single-vector storage uses `EncodedVectorsBin<u128, ...>` to enable 128-bit SIMD/popcount optimizations for longer vectors, while multi-vector storage uses `EncodedVectorsBin<u8, ...>` because multivectors are typically shorter and benefit from byte-granular storage.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7113
File: lib/quantization/src/encoded_vectors_u8.rs:202-211
Timestamp: 2025-08-21T13:43:52.139Z
Learning: The `quantized_vector_size()` method is available on the `EncodedVectors` trait, not on the `EncodedStorage` trait. The `EncodedStorage` trait provides methods like `get_vector_data`, `vectors_count`, `is_on_disk`, `upsert_vector`, `flusher`, `files`, and `immutable_files`.
📚 Learning: 2025-08-15T11:42:00.297Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.
Applied to files:
lib/segment/src/vector_storage/quantized/mod.rslib/segment/src/vector_storage/quantized/quantized_multivector_storage.rslib/segment/src/vector_storage/quantized/quantized_scorer_builder.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-08-11T00:37:34.100Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6986
File: lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs:46-55
Timestamp: 2025-08-11T00:37:34.100Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs`, the `ChunkedVectors` used in `QuantizedRamStorage` is intentionally designed to be non-persistent during updates. The `push_vector` method only updates the in-memory vectors without flushing to disk, and this is expected behavior rather than a bug.
Applied to files:
lib/segment/src/vector_storage/quantized/mod.rslib/segment/src/vector_storage/quantized/quantized_multivector_storage.rslib/segment/src/vector_storage/quantized/quantized_scorer_builder.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-08-15T11:41:10.629Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:857-857
Timestamp: 2025-08-15T11:41:10.629Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the multivector offset storage has two different patterns: for RAM storage, offsets are collected into Vec<MultivectorOffset> and used directly; for MMAP storage, offsets are consumed to create a file via create_offsets_file_from_iter, then the file is loaded back as MultivectorOffsetsStorageMmap. The direct consumption of offsets iterator in the MMAP case is intentional.
Applied to files:
lib/segment/src/vector_storage/quantized/mod.rslib/segment/src/vector_storage/quantized/quantized_multivector_storage.rslib/segment/src/vector_storage/quantized/quantized_scorer_builder.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-08-11T07:57:01.399Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6986
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:81-84
Timestamp: 2025-08-11T07:57:01.399Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the BitsStoreType parameter difference between single-vector and multi-vector Binary quantization is intentional: single-vector storage uses `EncodedVectorsBin<u128, ...>` to enable 128-bit SIMD/popcount optimizations for longer vectors, while multi-vector storage uses `EncodedVectorsBin<u8, ...>` because multivectors are typically shorter and benefit from byte-granular storage.
Applied to files:
lib/segment/src/vector_storage/quantized/mod.rslib/segment/src/vector_storage/quantized/quantized_multivector_storage.rslib/segment/src/vector_storage/quantized/quantized_scorer_builder.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-08-18T10:56:43.707Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs:340-347
Timestamp: 2025-08-18T10:56:43.707Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs, the create_offsets_file_from_iter function intentionally requires paths to have a parent directory and returns an error if path.parent() is None. This was a conscious design decision to ensure proper path validation.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-08-21T13:43:52.139Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7113
File: lib/quantization/src/encoded_vectors_u8.rs:202-211
Timestamp: 2025-08-21T13:43:52.139Z
Learning: The `quantized_vector_size()` method is available on the `EncodedVectors` trait, not on the `EncodedStorage` trait. The `EncodedStorage` trait provides methods like `get_vector_data`, `vectors_count`, `is_on_disk`, `upsert_vector`, `flusher`, `files`, and `immutable_files`.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-08-21T13:45:05.888Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7113
File: lib/quantization/src/encoded_vectors_binary.rs:515-525
Timestamp: 2025-08-21T13:45:05.888Z
Learning: The `EncodedStorage` trait in `lib/quantization/src/encoded_storage.rs` does not have a `quantized_vector_size()` method. To validate vector sizes from storage, use `get_vector_data(index).len()` to get the actual size of stored vectors.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-08-14T11:31:21.777Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7048
File: lib/quantization/src/encoded_storage.rs:61-79
Timestamp: 2025-08-14T11:31:21.777Z
Learning: In test storage implementations (like TestEncodedStorage in lib/quantization/src/encoded_storage.rs), IvanPleshkov prefers to keep the code simple rather than adding complex overflow protection, since test storage is not used in production and can be allowed to panic on edge cases.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-08-21T13:46:10.824Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7113
File: lib/quantization/src/encoded_vectors_u8.rs:202-211
Timestamp: 2025-08-21T13:46:10.824Z
Learning: In `EncodedStorage` trait implementations, calling `get_vector_data(0)` for validation purposes is not safe because the storage may be empty (vectors_count() == 0), which would cause issues when trying to access the first vector.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rslib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-04-07T23:31:22.515Z
Learnt from: generall
PR: qdrant/qdrant#6323
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:12-16
Timestamp: 2025-04-07T23:31:22.515Z
Learning: The `populate()` method in `QuantizedMmapStorage` correctly returns void instead of a `Result` because it directly implements the `Madviseable` trait which defines `populate(&self)` without a return type. Adding an unnecessary `Ok(())` return would trigger Clippy warnings about unnecessary Result wrapping.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-04-07T23:31:22.515Z
Learnt from: generall
PR: qdrant/qdrant#6323
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:12-16
Timestamp: 2025-04-07T23:31:22.515Z
Learning: The `populate()` method in `QuantizedMmapStorage` does not return a `Result` because the underlying `mmap.populate()` method doesn't return a Result either. Adding an unnecessary `Ok(())` would cause Clippy to complain.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rs
🧬 Code graph analysis (3)
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (2)
lib/segment/src/vector_storage/chunked_mmap_vectors.rs (12)
get(270-272)get(361-363)len(178-180)len(351-353)flusher(320-332)flusher(380-382)insert(198-205)insert(394-401)files(334-342)files(370-372)immutable_files(344-346)immutable_files(375-377)lib/segment/src/vector_storage/chunked_vector_storage.rs (1)
insert(35-40)
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs (4)
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (21)
std(533-533)upsert_vector(425-468)is_on_disk(371-373)vectors_count(346-348)vectors_count(470-472)flusher(43-43)flusher(107-109)flusher(171-174)flusher(197-205)flusher(474-482)new(252-264)files(45-45)files(111-113)files(176-178)files(217-219)files(484-488)immutable_files(47-47)immutable_files(115-117)immutable_files(180-182)immutable_files(221-223)immutable_files(490-494)lib/segment/src/vector_storage/quantized/quantized_vectors.rs (5)
upsert_vector(1366-1428)is_on_disk(131-152)flusher(1342-1364)files(359-382)immutable_files(384-407)lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)
new(35-53)lib/segment/src/vector_storage/chunked_vectors.rs (1)
quantization(172-223)
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (9)
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (2)
is_on_disk(371-373)storage(248-250)lib/quantization/src/encoded_vectors.rs (1)
is_on_disk(40-40)lib/quantization/src/encoded_vectors_pq.rs (3)
is_on_disk(495-497)storage(50-52)get_quantized_vector(462-464)lib/quantization/src/encoded_vectors_binary.rs (13)
is_on_disk(854-856)storage(403-405)get_quantized_vector(802-804)vector(217-217)vector(225-225)vector(238-238)vector(246-246)vector(324-324)vector(332-332)vector(345-345)vector(353-353)vector(366-366)vector(374-374)lib/quantization/src/encoded_vectors_u8.rs (3)
is_on_disk(373-375)storage(42-44)get_quantized_vector(315-317)lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs (2)
is_on_disk(59-61)build(99-116)lib/segment/src/vector_storage/vector_storage_base.rs (3)
is_on_disk(48-48)is_on_disk(722-766)populate(532-577)lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs (2)
build(106-114)populate(19-21)lib/segment/src/vector_storage/chunked_mmap_vectors.rs (1)
populate(447-452)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: e2e-tests
- GitHub Check: integration-tests
- GitHub Check: test-consistency
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: storage-compat-test
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: lint
🔇 Additional comments (11)
lib/segment/src/vector_storage/quantized/mod.rs (1)
1-1: Module wiring looks correct.Private module added and ordered consistently with the rest. No API surface changes.
lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs (1)
100-147: New storage variants are correctly dispatched in scorer builder.All six ChunkedMmap variants route to the same scorer constructors as their RAM/MMAP peers. This maintains parity and avoids code paths divergence.
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (1)
225-229: Unreachable load is acceptable as a temporary guard.Given the current architecture does not support loading
ChunkedMmapVectors<MultivectorOffset>directly from a single file, keeping thisunreachable!with a comment is fine until the migration in #7113 lands.lib/segment/src/vector_storage/quantized/quantized_vectors.rs (8)
104-108: Binary multivector uses u8 bitstore with ChunkedMmap — correct.The
BinaryChunkedMmapMultialias keeps the intentionalu8bitstore for multivectors (vsu128for single), consistent with prior design and perf rationale.
112-128: New ChunkedMmap variants are added consistently across the enum.Variant naming and parameterization match existing RAM/MMAP patterns and downstream usage sites (files, flusher, populate, etc.).
1299-1330: Populate calls cover quantized data; offsets remain unpopulated (by design?).For ChunkedMmap multivector storages,
populate()is invoked for the quantized data viaChunkedVectorStorage::populate(...), but offsets (also chunked mmap) are not populated. The offsets trait doesn’t exposepopulate, so this mirrors existing MMAP behavior for offsets. If you expect sequential scans over offsets on startup, consider a follow-up to either:
- add an (optional)
populate()toMultivectorOffsetsStorage, or- explicitly populate offsets where the concrete storage type is known.
359-382: Files list inclusion looks complete for new variants.All six ChunkedMmap variants are correctly delegated in
files(). Same applies toimmutable_files()below.
1346-1362: Flusher wiring includes all new variants.Delegation is exhaustive and consistent with existing variants.
1379-1426: Upsert dispatch covers all new variants.Dense and multivector upserts route appropriately; no gaps found.
212-231: Layout retrieval correctly errors for all multivector variants.New multivector ChunkedMmap variants are included in the error branch; single-vector variants delegate to their storages’
layout()— good.
28-30: Imports for ChunkedMmap are scoped right.Brings in both
ChunkedMmapVectorsandChunkedVectorStoragefor populate paths; matches usage below.
| fn get_vector_data(&self, index: PointOffsetType) -> &[u8] { | ||
| ChunkedVectorStorage::get(self, index as VectorOffsetType).unwrap_or_default() | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Don’t silently return an empty slice on out-of-bounds vector access.
Returning [] hides logic errors and can cascade into incorrect scoring or layout checks. Since the trait can’t return Result, assert in debug and panic in release for out-of-bounds to match MMAP/RAM behaviors.
- fn get_vector_data(&self, index: PointOffsetType) -> &[u8] {
- ChunkedVectorStorage::get(self, index as VectorOffsetType).unwrap_or_default()
- }
+ fn get_vector_data(&self, index: PointOffsetType) -> &[u8] {
+ let opt = ChunkedVectorStorage::get(self, index as VectorOffsetType);
+ debug_assert!(
+ opt.is_some() || (index as usize) >= self.len(),
+ "quantized vector index {} out of bounds (len = {})",
+ index,
+ self.len()
+ );
+ opt.unwrap_or_else(|| {
+ panic!(
+ "quantized vector index {} out of bounds (len = {})",
+ index,
+ self.len()
+ )
+ })
+ }🤖 Prompt for AI Agents
In lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
around lines 10 to 12, replace the current silent fallback (unwrap_or_default)
with a debug assertion and a release panic: add a debug_assert that the
requested index is within bounds, then call
ChunkedVectorStorage::get(...).expect(...) (or unwrap()) with a clear error
message so out-of-bounds access panics in release and asserts in debug rather
than returning an empty slice.
| fn get_offset(&self, idx: PointOffsetType) -> MultivectorOffset { | ||
| ChunkedVectorStorage::get(self, idx as VectorOffsetType) | ||
| .and_then(|offsets| offsets.first()) | ||
| .cloned() | ||
| .unwrap_or_default() | ||
| } |
There was a problem hiding this comment.
🛠️ Refactor suggestion
Avoid silently defaulting to zero offsets; align with RAM/MMAP semantics.
Current implementation returns MultivectorOffset::default() when an offset is missing. RAM/MMAP variants index directly and panic on out-of-bounds, which surfaces bugs early. Returning a default here can mask data corruption and yield incorrect scoring (effectively a zero-length inner set).
Recommend asserting presence and panicking if absent to match existing semantics.
- fn get_offset(&self, idx: PointOffsetType) -> MultivectorOffset {
- ChunkedVectorStorage::get(self, idx as VectorOffsetType)
- .and_then(|offsets| offsets.first())
- .cloned()
- .unwrap_or_default()
- }
+ fn get_offset(&self, idx: PointOffsetType) -> MultivectorOffset {
+ ChunkedVectorStorage::get(self, idx as VectorOffsetType)
+ .and_then(|offsets| offsets.first().cloned())
+ .unwrap_or_else(|| {
+ panic!(
+ "Multivector offset for idx {} not found (len = {})",
+ idx,
+ ChunkedVectorStorage::len(self)
+ )
+ })
+ }Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
around lines 186 to 191, the current get_offset silently returns
MultivectorOffset::default() when no offsets are found which can mask
out-of-bounds/data corruption; change this to fail fast to match RAM/MMAP
semantics by replacing the unwrap_or_default with a panic/assert (e.g. use
expect or unwrap with a clear message) that includes the requested idx (and/or
VectorOffsetType) so missing offsets cause an immediate panic and surface the
bug.
c473a0f to
c46759c
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)
175-180: Include BinaryChunkedMmap in default rescoring decision.Binary storage default-rescores for RAM/MMAP; excluding
BinaryChunkedMmapchanges behavior for the new backend only. Add it for parity.pub fn default_rescoring(&self) -> bool { matches!( self.storage_impl, - QuantizedVectorStorage::BinaryRam(_) | QuantizedVectorStorage::BinaryMmap(_) + QuantizedVectorStorage::BinaryRam(_) + | QuantizedVectorStorage::BinaryMmap(_) + | QuantizedVectorStorage::BinaryChunkedMmap(_) ) }
♻️ Duplicate comments (1)
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs (1)
10-12: Don’t return an empty slice on OOB; assert in debug and panic in release.Returning
[]hides logic errors and will cascade into bad scorer behavior. Match existing MMAP/RAM semantics: assert in debug, panic in release with a clear message.Apply:
- fn get_vector_data(&self, index: PointOffsetType) -> &[u8] { - ChunkedVectorStorage::get(self, index as VectorOffsetType).unwrap_or_default() - } + fn get_vector_data(&self, index: PointOffsetType) -> &[u8] { + let opt = ChunkedVectorStorage::get(self, index as VectorOffsetType); + debug_assert!( + opt.is_some() || (index as usize) >= self.len(), + "quantized vector index {} out of bounds (len = {})", + index, + self.len() + ); + opt.unwrap_or_else(|| { + panic!( + "quantized vector index {} out of bounds (len = {})", + index, + self.len() + ) + }) + }
🧹 Nitpick comments (3)
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs (2)
32-39: Flusher closure: simplify and align style with multivector offsets storage.Current
Ok(flusher().map_err(...)?)is harder to read. Mirror the pattern used in multivector offsets flusher for consistency.fn flusher(&self) -> MmapFlusher { let flusher = self.flusher(); Box::new(move || { - Ok(flusher().map_err(|e| { - std::io::Error::other(format!("Failed to flush quantization storage: {e}")) - })?) + flusher().map_err(|e| { + std::io::Error::other(format!("Failed to flush quantization storage: {e}")) + })?; + Ok(()) }) }
14-22: Include context in upsert errors (id, expected vs. actual lengths if available).Mapping to
io::Error::otheris fine, but adding minimal context greatly helps diagnosing partial chunk flush issues during upserts.- ChunkedVectorStorage::insert(self, id as VectorOffsetType, vector, hw_counter) - .map_err(std::io::Error::other) + ChunkedVectorStorage::insert(self, id as VectorOffsetType, vector, hw_counter) + .map_err(|err| std::io::Error::other(format!("upsert id={id}: {err}")))Note: if
ChunkedVectorStorageexposes vector-length metadata, consider a debug_assert onvector.len()to catch mismatches early in debug builds.lib/segment/src/vector_storage/quantized/quantized_vectors.rs (1)
754-1010: Note: load/create still target RAM/MMAP backends only (no chunked).This matches the PR description (compile-time wiring first, load/create to follow). Consider adding a brief TODO in the builder paths to avoid future regressions when chunked loaders arrive.
Would you like a follow-up patch that adds TODOs near
create_*/load_*noting that chunked mmap backends are only constructed by external builders today?Also applies to: 1121-1395
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
lib/segment/src/index/hnsw_index/gpu/gpu_vector_storage/mod.rs(6 hunks)lib/segment/src/vector_storage/quantized/mod.rs(1 hunks)lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs(1 hunks)lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs(2 hunks)lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs(1 hunks)lib/segment/src/vector_storage/quantized/quantized_vectors.rs(12 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- lib/segment/src/vector_storage/quantized/quantized_scorer_builder.rs
- lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs
- lib/segment/src/index/hnsw_index/gpu/gpu_vector_storage/mod.rs
- lib/segment/src/vector_storage/quantized/mod.rs
🧰 Additional context used
🧠 Learnings (14)
📓 Common learnings
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:857-857
Timestamp: 2025-08-15T11:41:10.629Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the multivector offset storage has two different patterns: for RAM storage, offsets are collected into Vec<MultivectorOffset> and used directly; for MMAP storage, offsets are consumed to create a file via create_offsets_file_from_iter, then the file is loaded back as MultivectorOffsetsStorageMmap. The direct consumption of offsets iterator in the MMAP case is intentional.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6986
File: lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs:46-55
Timestamp: 2025-08-11T00:37:34.100Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs`, the `ChunkedVectors` used in `QuantizedRamStorage` is intentionally designed to be non-persistent during updates. The `push_vector` method only updates the in-memory vectors without flushing to disk, and this is expected behavior rather than a bug.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6986
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:81-84
Timestamp: 2025-08-11T07:57:01.399Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the BitsStoreType parameter difference between single-vector and multi-vector Binary quantization is intentional: single-vector storage uses `EncodedVectorsBin<u128, ...>` to enable 128-bit SIMD/popcount optimizations for longer vectors, while multi-vector storage uses `EncodedVectorsBin<u8, ...>` because multivectors are typically shorter and benefit from byte-granular storage.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7113
File: lib/quantization/src/encoded_vectors_u8.rs:202-211
Timestamp: 2025-08-21T13:43:52.139Z
Learning: The `quantized_vector_size()` method is available on the `EncodedVectors` trait, not on the `EncodedStorage` trait. The `EncodedStorage` trait provides methods like `get_vector_data`, `vectors_count`, `is_on_disk`, `upsert_vector`, `flusher`, `files`, and `immutable_files`.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7113
File: lib/quantization/src/encoded_vectors_binary.rs:515-525
Timestamp: 2025-08-21T13:45:05.888Z
Learning: The `EncodedStorage` trait in `lib/quantization/src/encoded_storage.rs` does not have a `quantized_vector_size()` method. To validate vector sizes from storage, use `get_vector_data(index).len()` to get the actual size of stored vectors.
📚 Learning: 2025-08-15T11:42:00.297Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
📚 Learning: 2025-08-15T11:41:10.629Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:857-857
Timestamp: 2025-08-15T11:41:10.629Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the multivector offset storage has two different patterns: for RAM storage, offsets are collected into Vec<MultivectorOffset> and used directly; for MMAP storage, offsets are consumed to create a file via create_offsets_file_from_iter, then the file is loaded back as MultivectorOffsetsStorageMmap. The direct consumption of offsets iterator in the MMAP case is intentional.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
📚 Learning: 2025-08-11T00:37:34.100Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6986
File: lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs:46-55
Timestamp: 2025-08-11T00:37:34.100Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs`, the `ChunkedVectors` used in `QuantizedRamStorage` is intentionally designed to be non-persistent during updates. The `push_vector` method only updates the in-memory vectors without flushing to disk, and this is expected behavior rather than a bug.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
📚 Learning: 2025-08-11T07:57:01.399Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6986
File: lib/segment/src/vector_storage/quantized/quantized_vectors.rs:81-84
Timestamp: 2025-08-11T07:57:01.399Z
Learning: In `lib/segment/src/vector_storage/quantized/quantized_vectors.rs`, the BitsStoreType parameter difference between single-vector and multi-vector Binary quantization is intentional: single-vector storage uses `EncodedVectorsBin<u128, ...>` to enable 128-bit SIMD/popcount optimizations for longer vectors, while multi-vector storage uses `EncodedVectorsBin<u8, ...>` because multivectors are typically shorter and benefit from byte-granular storage.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
📚 Learning: 2025-08-21T13:45:05.888Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7113
File: lib/quantization/src/encoded_vectors_binary.rs:515-525
Timestamp: 2025-08-21T13:45:05.888Z
Learning: The `EncodedStorage` trait in `lib/quantization/src/encoded_storage.rs` does not have a `quantized_vector_size()` method. To validate vector sizes from storage, use `get_vector_data(index).len()` to get the actual size of stored vectors.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
📚 Learning: 2025-08-21T13:43:52.139Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7113
File: lib/quantization/src/encoded_vectors_u8.rs:202-211
Timestamp: 2025-08-21T13:43:52.139Z
Learning: The `quantized_vector_size()` method is available on the `EncodedVectors` trait, not on the `EncodedStorage` trait. The `EncodedStorage` trait provides methods like `get_vector_data`, `vectors_count`, `is_on_disk`, `upsert_vector`, `flusher`, `files`, and `immutable_files`.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
📚 Learning: 2025-08-14T11:31:21.777Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7048
File: lib/quantization/src/encoded_storage.rs:61-79
Timestamp: 2025-08-14T11:31:21.777Z
Learning: In test storage implementations (like TestEncodedStorage in lib/quantization/src/encoded_storage.rs), IvanPleshkov prefers to keep the code simple rather than adding complex overflow protection, since test storage is not used in production and can be allowed to panic on edge cases.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
📚 Learning: 2025-08-21T13:46:10.824Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7113
File: lib/quantization/src/encoded_vectors_u8.rs:202-211
Timestamp: 2025-08-21T13:46:10.824Z
Learning: In `EncodedStorage` trait implementations, calling `get_vector_data(0)` for validation purposes is not safe because the storage may be empty (vectors_count() == 0), which would cause issues when trying to access the first vector.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rslib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
📚 Learning: 2025-08-18T10:56:43.707Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs:340-347
Timestamp: 2025-08-18T10:56:43.707Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs, the create_offsets_file_from_iter function intentionally requires paths to have a parent directory and returns an error if path.parent() is None. This was a conscious design decision to ensure proper path validation.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-04-07T23:31:22.515Z
Learnt from: generall
PR: qdrant/qdrant#6323
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:12-16
Timestamp: 2025-04-07T23:31:22.515Z
Learning: The `populate()` method in `QuantizedMmapStorage` correctly returns void instead of a `Result` because it directly implements the `Madviseable` trait which defines `populate(&self)` without a return type. Adding an unnecessary `Ok(())` return would trigger Clippy warnings about unnecessary Result wrapping.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-04-07T23:31:22.515Z
Learnt from: generall
PR: qdrant/qdrant#6323
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:12-16
Timestamp: 2025-04-07T23:31:22.515Z
Learning: The `populate()` method in `QuantizedMmapStorage` does not return a `Result` because the underlying `mmap.populate()` method doesn't return a Result either. Adding an unnecessary `Ok(())` would cause Clippy to complain.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_vectors.rs
📚 Learning: 2025-03-29T17:43:28.174Z
Learnt from: generall
PR: qdrant/qdrant#6274
File: lib/segment/src/vector_storage/sparse/mmap_sparse_vector_storage.rs:233-233
Timestamp: 2025-03-29T17:43:28.174Z
Learning: In the qdrant codebase, `debug_assert!` is preferred over runtime checks that would panic in release builds to ensure production stability, even when validating conditions like vector sorting.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
📚 Learning: 2025-08-06T09:56:59.311Z
Learnt from: xzfc
PR: qdrant/qdrant#6982
File: lib/segment/src/index/hnsw_index/graph_links/view.rs:217-220
Timestamp: 2025-08-06T09:56:59.311Z
Learning: In Qdrant's GraphLinksView implementation (lib/segment/src/index/hnsw_index/graph_links/view.rs), methods like links() and links_with_vectors() intentionally use unwrap() and can panic for performance reasons, maintaining consistency across similar methods. Error handling improvements are considered secondary to feature completion and require benchmarking before implementation.
Applied to files:
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs
🧬 Code graph analysis (2)
lib/segment/src/vector_storage/quantized/quantized_vectors.rs (6)
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (2)
is_on_disk(340-342)storage(245-247)lib/quantization/src/encoded_vectors_u8.rs (3)
is_on_disk(367-369)storage(42-44)get_quantized_vector(326-328)lib/segment/src/vector_storage/quantized/quantized_ram_storage.rs (2)
is_on_disk(61-63)build(101-118)lib/segment/src/types.rs (6)
is_on_disk(555-560)is_on_disk(1262-1271)is_on_disk(1418-1423)is_on_disk(1480-1488)is_on_disk(1891-1902)is_on_disk(2013-2018)lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs (2)
build(108-116)populate(19-21)lib/segment/src/vector_storage/chunked_mmap_vectors.rs (1)
populate(447-452)
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs (2)
lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs (10)
std(502-502)upsert_vector(394-437)vectors_count(326-328)vectors_count(439-441)flusher(41-41)flusher(107-109)flusher(173-176)flusher(199-207)flusher(443-451)new(249-261)lib/segment/src/vector_storage/chunked_vectors.rs (1)
quantization(172-223)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: lint
- GitHub Check: e2e-tests
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-consistency
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: storage-compat-test
🔇 Additional comments (15)
lib/segment/src/vector_storage/quantized/quantized_chunked_mmap_storage.rs (2)
24-31: LGTM: diskness and count delegation are correct.
is_on_disk()hardcoded to true andvectors_count()delegating tolen()match the storage characteristics.
41-47: LGTM: file enumeration delegates to underlying storage.Delegating
files()/immutable_files()keeps the surface consistent with other storage backends.lib/segment/src/vector_storage/quantized/quantized_vectors.rs (13)
28-29: LGTM: imports are minimal and scoped.Only the traits actually used for chunked mmap integration are imported.
74-77: LGTM: multivector alias wiring for ScalarChunkedMmapMulti.Type parameters line up: inner encoded vectors on
ChunkedMmapVectors<u8>and offsets onChunkedMmapVectors<MultivectorOffset>.
89-92: LGTM: PQChunkedMmapMulti alias consistent with PQ single-vector types.Uses
EncodedVectorsPQ<ChunkedMmapVectors<u8>>for inner vectors and chunked offsets; matches RAM/MMAP patterns.
104-107: LGTM: BinaryChunkedMmapMulti uses u8 bits store (intentional).Matches the established pattern where multi-binary uses
u8for byte-granular benefits; single-vector binary remainsu128for SIMD.
112-128: New storage variants: parity looks complete across enum.All three single-vector and three multi-vector chunked mmap variants are added. Keep an eye on downstream exhaustiveness (scorers, populate, files), which appears handled below.
210-231: LGTM: layout retrieval covers all single-vector chunked variants.Multivector variants correctly return an error.
237-255: LGTM: get_quantized_vector dispatch extended to new chunked variants.Panic on multivector access remains intentional and unchanged.
300-343: LGTM: internal scorer dispatch covers all new variants (single and multi).Uniformly reuses the helper
buildwithEncodedVectorsbound; no gaps.
363-383: LGTM: files() now includes chunked variants.Downstream cache clearing and snapshot logic will pick these up automatically.
389-408: LGTM: immutable_files() parity for chunked variants.Matches
files()coverage; config path is appended once after the match.
1474-1506: Populate: correct use of ChunkedVectorStorage::populate with?.MMAP uses
.populate()(no Result); chunked uses trait helper returningOperationResult. Good error propagation.
1519-1539: LGTM: flusher switch extended to new variants.Consistent with existing pattern; closure mapping to
OperationErrorremains unchanged.
1548-1601: LGTM: upsert paths wired for all new variants.Dense vs. multi dispatch preserved; no accidental fall-throughs.
This PR adds implementation of quantization storage for
ChunkedMmapwith persistupsertandflushimplementation.Also this PR adds
ChunkedMmapvariant as a quantization storage.This PR does not have unit test because
ChunkedMmapdoes not have any usage andQuantizedVectorsdoes not have a create/load mechanism. This PR is mostly about compilation correctness; unit tests are the next step