Conversation
Force-pushed from a3d2681 to 6663b69
Force-pushed from 7ada4e5 to 31f5893
```rust
pub fn wipe(self) -> OperationResult<()> {
    let files = self.files();
    let Self { path, .. } = self;
```
The deconstruction is semantically equivalent, but feels a bit hacky compared to an explicit drop.
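For illustration, a minimal sketch of the two variants being compared, using a hypothetical `Store` type (not the actual Qdrant struct): destructuring `self` to move `path` out versus cloning what is needed and then dropping `self` explicitly. Both release the struct's resources before the caller deletes files; the second makes the intent visible.

```rust
use std::path::PathBuf;

// Hypothetical stand-in for the index struct; not the real Qdrant type.
struct Store {
    path: PathBuf,
}

impl Store {
    fn files(&self) -> Vec<PathBuf> {
        vec![self.path.join("data.bin")]
    }

    // Variant 1: destructuring moves `path` out; the rest of `self` is dropped here.
    fn wipe_destructure(self) -> Vec<PathBuf> {
        let files = self.files();
        let Self { path } = self; // `self` is consumed implicitly by the pattern
        let _ = path;
        files
    }

    // Variant 2: clone what is needed, then drop `self` explicitly.
    fn wipe_explicit_drop(self) -> Vec<PathBuf> {
        let files = self.files();
        let path = self.path.clone();
        drop(self); // releases resources (e.g. mmap handles) before file deletion
        let _ = path;
        files
    }
}

fn main() {
    let a = Store { path: PathBuf::from("/tmp/idx") };
    let b = Store { path: PathBuf::from("/tmp/idx") };
    // Both variants observe the same files and release `self` before returning.
    assert_eq!(a.wipe_destructure(), b.wipe_explicit_drop());
    println!("ok");
}
```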
```rust
// Wait for all background flush operations to finish, and cancel future flushes
_ = self.flushing_lock.lock();
```
I don't think this works. When we obtain this lock, the Arcs are still alive, so a flusher thread can still upgrade, lock, and flush after this drop function if the timing is right.
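A minimal sketch of the timing issue described here, under the assumption that the flusher holds a `Weak` to the lock: briefly taking and releasing the lock in the drop path does not stop a later flush, because the strong `Arc` is still alive and `Weak::upgrade` keeps succeeding until it is gone.

```rust
use std::sync::{Arc, Mutex, Weak};

fn main() {
    let flushing_lock = Arc::new(Mutex::new(()));
    let weak: Weak<Mutex<()>> = Arc::downgrade(&flushing_lock);

    // Simulates the drop path: take the lock, then release it as the guard
    // goes out of scope. The Arc itself is still alive at this point.
    {
        let _guard = flushing_lock.lock().unwrap();
    }

    // A flusher thread holding only a Weak can still upgrade and lock
    // afterwards, because only the guard was dropped, not the Arc.
    let upgraded = weak.upgrade();
    assert!(upgraded.is_some(), "nothing stopped the flusher");

    // Only once the last strong reference is gone does upgrade fail.
    drop(flushing_lock);
    drop(upgraded);
    assert!(weak.upgrade().is_none());
    println!("ok");
}
```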
Thanks for catching this. I will revert these usages in favor of a lock over a bool, to properly signal that the drop occurred.
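A sketch of the "lock over a bool" idea, with hypothetical names (`Owner`, `dropped`) rather than the actual Qdrant implementation: the drop path takes the lock (waiting out any in-flight flush) and sets the flag, and every flush checks the flag under the same lock before doing work, so flushes after drop are refused.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical owner type signalling its own drop through a Mutex<bool>.
struct Owner {
    dropped: Arc<Mutex<bool>>,
}

impl Owner {
    fn flusher(&self) -> impl Fn() -> Result<(), String> {
        let dropped = Arc::clone(&self.dropped);
        move || {
            // Holding the lock for the whole flush excludes a concurrent drop.
            let guard = dropped.lock().unwrap();
            if *guard {
                return Err("aborted: owner was dropped".to_string());
            }
            // ... perform the actual flush here while the guard is held ...
            Ok(())
        }
    }
}

impl Drop for Owner {
    fn drop(&mut self) {
        // Blocks until any in-flight flush releases the lock, then marks
        // the owner as dropped so future flushes abort.
        *self.dropped.lock().unwrap() = true;
    }
}

fn main() {
    let owner = Owner { dropped: Arc::new(Mutex::new(false)) };
    let flush = owner.flusher();
    assert!(flush().is_ok());
    drop(owner);
    assert!(flush().is_err()); // flushes after drop are refused
    println!("ok");
}
```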
Force-pushed from 31f5893 to 7ebd324
Actionable comments posted: 2
📜 Review details

Configuration used: CodeRabbit UI. Review profile: CHILL. Plan: Pro.

📒 Files selected for processing (8):

- lib/gridstore/src/gridstore.rs (1 hunk)
- lib/segment/src/common/flags/buffered_dynamic_flags.rs (3 hunks)
- lib/segment/src/common/mmap_bitslice_buffered_update_wrapper.rs (3 hunks)
- lib/segment/src/index/field_index/bool_index/mutable_bool_index.rs (1 hunk)
- lib/segment/src/index/field_index/full_text_index/mmap_text_index.rs (1 hunk)
- lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs (1 hunk)
- lib/segment/src/index/field_index/map_index/mmap_map_index.rs (1 hunk)
- lib/segment/src/index/field_index/numeric_index/mmap_numeric_index.rs (1 hunk)
🔇 Additional comments (10)

lib/gridstore/src/gridstore.rs (1)

576-579: LGTM! The updated comment clearly explains the purpose of using weak references in the flusher pattern: detecting whether the `Gridstore` instance has been dropped, to abort the flush gracefully.

lib/segment/src/common/mmap_bitslice_buffered_update_wrapper.rs (2)

18-31: LGTM on the flushing_lock field and initialization. The weak reference pattern combined with the flushing lock provides proper lifecycle management for concurrent flushes.

73-101: LGTM on the flusher implementation. The flusher correctly:

- Uses weak references to detect if the instance has been dropped
- Acquires the flushing lock with `let _flushing_lock_guard = flushing_lock.lock()`, which keeps the guard alive for the duration of the flush

lib/segment/src/common/flags/buffered_dynamic_flags.rs (2)

24-35: LGTM on the flushing_lock field and initialization. Consistent with the pattern used in `MmapBitSliceBufferedUpdateWrapper`.

52-93: LGTM on the flusher implementation. The flusher correctly uses `let _flushing_lock_guard = flushing_lock.lock()` to keep the guard alive during the flush operation.

lib/segment/src/index/field_index/numeric_index/mmap_numeric_index.rs (1)

200-210: LGTM on the wipe() implementation. The updated sequence correctly ensures mmap handles are released before file deletion:

- Collect files and clone the path while `self` is valid
- Drop `self` to release mmap resources
- Delete files and directory

This prevents potential file locking issues (especially on Windows) and aligns with the PR's objective of ensuring proper resource cleanup order.

lib/segment/src/index/field_index/full_text_index/mmap_text_index.rs (1)

57-67: LGTM on the wipe() implementation. Consistent with the pattern applied across other mmap index implementations in this PR. The explicit `drop(self)` before file deletion ensures mmap handles are properly released.

lib/segment/src/index/field_index/map_index/mmap_map_index.rs (1)

153-163: LGTM! Correct mmap handle cleanup before file deletion. The pattern is correctly implemented: `files()` is captured while `self` is still valid, then the path is cloned, `self` is dropped to release mmap handles, and finally the files are deleted. This ensures proper resource cleanup order.

lib/segment/src/index/field_index/bool_index/mutable_bool_index.rs (1)

310-319: LGTM! Correct mmap handle cleanup before directory deletion. The pattern correctly ensures mmap handles are released before filesystem operations. The `Drop` implementation in `BufferedDynamicFlags` (seen in the relevant snippets) will wait for any in-flight flush operations to complete before the directory is removed.

lib/segment/src/index/field_index/geo_index/mmap_geo_index.rs (1)

343-353: LGTM! Correct mmap handle cleanup before file deletion. The explicit `drop(self)` approach is cleaner than deconstruction (addressing the prior review feedback). The pattern is consistent with the other index implementations in this PR: capture files, clone the path, drop `self` to release mmap handles, then perform filesystem operations.
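The cleanup order approved across these files can be sketched as follows, with a hypothetical `MmapIndex` type standing in for the real Qdrant index structs: capture the file list and clone the path while `self` is valid, drop `self` to release mmap handles, and only then touch the filesystem.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

// Hypothetical index type; a sketch of the cleanup order, not the actual
// Qdrant implementation. In the real code this struct would hold mmap
// handles that must be released before the backing files can be deleted
// (especially on Windows, where open mappings lock the files).
struct MmapIndex {
    path: PathBuf,
}

impl MmapIndex {
    fn files(&self) -> Vec<PathBuf> {
        vec![self.path.join("index.bin")]
    }

    fn wipe(self) -> io::Result<()> {
        // 1. Collect files and clone the path while `self` is still valid.
        let files = self.files();
        let path = self.path.clone();
        // 2. Drop `self` to release mmap handles.
        drop(self);
        // 3. Only then delete the files and the directory.
        for file in files {
            if file.exists() {
                fs::remove_file(file)?;
            }
        }
        if path.exists() {
            fs::remove_dir_all(path)?;
        }
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("wipe_sketch");
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("index.bin"), b"data")?;
    MmapIndex { path: dir.clone() }.wipe()?;
    assert!(!dir.exists());
    println!("ok");
    Ok(())
}
```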
This reverts commit 7ebd324.
```rust
let (Some(bitslice), Some(pending_updates_arc)) =
    (bitslice.upgrade(), pending_updates_arc.upgrade())
else {
    log::debug!(
        "Aborted flushing on a dropped MmapBitSliceBufferedUpdateWrapper instance"
    );
    return Ok(());
};
```
I like that we still use weak refs. They allow us to more reliably drop the Arc'ed structures together with the structure itself, rather than having to wait for the flushers to recognize the stop.
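The benefit described here can be sketched in isolation (hypothetical `make_flusher` helper, not the actual Qdrant API): the flusher closure captures only a `Weak`, so once the owner drops its `Arc`, the shared data is freed immediately and any pending flush aborts gracefully instead of keeping the data alive.

```rust
use std::sync::{Arc, Mutex, Weak};

// Sketch of the weak-reference flusher pattern: the flusher captures only
// a Weak handle, so dropping the owner invalidates pending flushes.
fn make_flusher(
    data: &Arc<Mutex<Vec<u64>>>,
) -> impl Fn() -> Result<usize, &'static str> {
    let weak: Weak<Mutex<Vec<u64>>> = Arc::downgrade(data);
    move || {
        // If the owner (the last strong Arc) was dropped, abort gracefully.
        let Some(data) = weak.upgrade() else {
            return Err("aborted flush: owner dropped");
        };
        let pending = data.lock().unwrap();
        Ok(pending.len()) // stand-in for writing pending updates to disk
    }
}

fn main() {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));
    let flush = make_flusher(&data);
    assert_eq!(flush(), Ok(3));

    drop(data); // the Arc'ed structure is freed together with the owner
    assert_eq!(flush(), Err("aborted flush: owner dropped"));
    println!("ok");
}
```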
* Fix release mmap handles before removing files
* Weak references to handle flushes after drop
* Protect from concurrent drop/flush
* Simplify locking
* Revert "simplify locking" (this reverts commit 7ebd324)
Follow up on #7621.

I performed an audit to make sure that the other types of payload index also release their mmap handles properly before removing files:

- The `MutableBoolIndex` uses `BufferedDynamicFlags`, which flushes properly since #7621.
- All the other modified indexes are built on top of `MmapBitSliceBufferedUpdateWrapper`, which this PR fixes.

I also implemented the following previous suggestions:

- Simplify locking for drop with weak references