```rust
let deleted_points_snapshot = proxy_deleted_points
    .read()
    .iter()
    .map(|(point_id, versions)| (*point_id, *versions))
    .collect::<Vec<_>>();
```
Instead of reading from a shared structure, we now read from all proxies and merge the results together.
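As a rough illustration of that merge (hypothetical simplified types, not the actual Qdrant structs), each proxy keeps its own deleted-points map and the aggregator keeps the newest deletion version per point:

```rust
use std::collections::HashMap;

type PointId = u64;
type SeqNumber = u64;

// Hypothetical stand-in for a proxy's local deleted-points entry.
#[derive(Clone, Copy, PartialEq, Debug)]
struct DeletedPoint {
    operation_version: SeqNumber,
}

// Merge deletes from all proxies, keeping the highest version per point.
fn merge_deleted_points(
    proxies: &[HashMap<PointId, DeletedPoint>],
) -> HashMap<PointId, DeletedPoint> {
    let mut merged: HashMap<PointId, DeletedPoint> = HashMap::new();
    for proxy in proxies {
        for (&point_id, &del) in proxy {
            merged
                .entry(point_id)
                .and_modify(|existing| {
                    if del.operation_version > existing.operation_version {
                        *existing = del;
                    }
                })
                .or_insert(del);
        }
    }
    merged
}
```

Because each map is local to its proxy, no cross-proxy locking is needed while collecting; only the merge sees all of them.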
```rust
let appendable_segments_ids = segments_lock.appendable_segments_ids();
let has_appendable_segments_except_optimized =
    appendable_segments_ids.iter().any(|id| !ids.contains(id));
let need_extra_cow_segment = !has_appendable_segments_except_optimized;
```
On optimization, we might not need an extra segment.
Report on my manual black box validations:

Found one issue running the Python client congruence tests. The error is:
```rust
// Find appendable segments other than optimized ones
//
// If there are such segments - we can avoid creating a temp segment
// If there are none, we need to create a new empty segment to allow writes during optimization
let appendable_segments_ids = segments_lock.appendable_segments_ids();
let has_appendable_segments_except_optimized =
    appendable_segments_ids.iter().any(|id| !ids.contains(id));
let need_extra_cow_segment = !has_appendable_segments_except_optimized;
```
Maybe we can skip this entirely and not create any new segment here, even if we don't have an appendable one.
Can't we defer creating the new segment until it is actually needed? If a new update comes in that requires an appendable segment, we create it on the fly.
This way we don't waste resources on creating segments that may not be used.
Imagine a big static collection that is created with a bulk upload. It'll likely end up having no appendable segment at all. That's great because for each search there are fewer segments to search through.
Thoughts?
As a bonus for support sessions, this gives us the flexibility to manually remove the last appendable segment from disk without panicking on restart. I ran into this at least twice already.
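A sketch of what deferred creation could look like, using hypothetical names (`Holder`, `appendable_segment_mut`) rather than the actual SegmentHolder API: writes look up an appendable segment and only materialize one when none exists.

```rust
use std::collections::HashMap;

type SegmentId = usize;

// Hypothetical minimal segment model for illustration.
struct Segment {
    appendable: bool,
    points: Vec<u64>,
}

struct Holder {
    segments: HashMap<SegmentId, Segment>,
    next_id: SegmentId,
}

impl Holder {
    // Return an existing appendable segment, or create one lazily on first write.
    fn appendable_segment_mut(&mut self) -> &mut Segment {
        let existing = self
            .segments
            .iter()
            .find(|(_, s)| s.appendable)
            .map(|(&id, _)| id);
        let id = existing.unwrap_or_else(|| {
            let id = self.next_id;
            self.next_id += 1;
            self.segments
                .insert(id, Segment { appendable: true, points: Vec::new() });
            id
        });
        self.segments.get_mut(&id).unwrap()
    }

    fn upsert(&mut self, point: u64) {
        self.appendable_segment_mut().points.push(point);
    }
}
```

A bulk-loaded static collection that never receives an update would then never pay for an empty appendable segment at search time.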
lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
timvisee left a comment
I really like how this simplifies things. This makes a lot of sense, and we should have made this simplification a lot earlier. I can't even count the number of times we bumped our head against the previous version of the proxy segment. 🪨
There seem to be some minor issues left that we need to iron out, but I'm all for it.
I don't see any obvious problems with the changed tests. But I have to admit it is hard to properly track all changes there.
One thing I noticed is that removing this proxy implementation will not allow us to easily implement multiple levels of proxies in the future, each having their own mutable view. Since we don't have any kind of transactional system planned, I don't see an issue with this.
Force-pushed 46290e7 to 2b55253.
Force-pushed 2b55253 to 842abe0.
📝 Walkthrough

The PR restructures proxy handling and snapshot logic across collection, shard, and segment crates. SegmentOptimizer now accepts aggregated proxy segments and exposes helpers to collect proxy deletions and index changes; build/propagate/cancellation flows were updated accordingly. ProxySegment is simplified (no write segment), uses local DeletedPoints and ProxyIndexChanges, adds versioning, and disables most write operations. Snapshot APIs dropped snapshotted_segments and related short-circuiting; SegmentHolder gained swap_new_locked and remove_segment_if_not_needed, and proxy_all_segments/unproxy_all_segments now use a temp segment id. TinyMap and NamedVectors APIs were broadened; delete_vectors gained an hw_counter.

Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
📜 Recent review details

Files selected for processing (1): tests/consensus_tests/test_partial_snapshot.py

🪛 Ruff (0.13.3): tests/consensus_tests/test_partial_snapshot.py, line 224: F405
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (3)
566-570: Don't unwrap delete_point; propagate errors

unwrap can panic on unexpected versions or transient errors, matching reported panics. Return errors instead.

```diff
-optimized_segment
-    .delete_point(versions.operation_version, point_id, hw_counter)
-    .unwrap();
+optimized_segment
+    .delete_point(versions.operation_version, point_id, hw_counter)
+    .map_err(|e| CollectionError::service_error(format!("delete_point failed: {e}")))?;
```
925-927: Avoid unwrap during finalize as well

Same panic risk; bubble up the error.

```diff
-optimized_segment
-    .delete_point(versions.operation_version, point_id, hw_counter)
-    .unwrap();
+optimized_segment
+    .delete_point(versions.operation_version, point_id, hw_counter)
+    .map_err(|e| CollectionError::service_error(format!("delete_point failed: {e}")))?;
```
534-572: Propagate proxied vector-level deletes before point deletions

Vector deletes from proxies are currently ignored, causing 500s with active proxies. In both build_new_segment (around lines 534–572) and the finalize path (around lines 879–929):

- Import DeletedVectors, ProxyDeletedVector from shard::proxy_segment.
- After applying proxy_index_changes, snapshot `let deleted_vectors = self.proxy_deleted_vectors(proxies);`
- For each `((point_id, vector_name), versions)` call
  `optimized_segment.delete_vector(versions.operation_version, point_id, &vector_name)?;`
  with `check_process_stopped(stopped)?` before point deletions.

This ensures proxied vector deletes are applied and unblocks the sparse vector congruence test.
♻️ Duplicate comments (5)
lib/shard/src/update.rs (1)
568-574: Fix proxy 500 on DeleteVectors: add proxy check to prevent in-place delete.

The apply closure attempts to call delete_vector on proxy segments, causing "Delete vector is disabled for proxy segments" errors and HTTP 500 responses. This issue was flagged in a previous review and is confirmed by the failing Python client congruence test reported in the PR comments (test_simple_opt_sparse_vectors_search failing with "Delete vector is disabled for proxy segments: operation 16 on point 50").

Apply this diff to short-circuit when the segment is a proxy, forcing the move path:

```diff
 |id, write_segment| {
+    // Never try to mutate a proxy in-place; force move
+    if write_segment.is_proxy() {
+        return Ok(false);
+    }
     let mut res = true;
     for name in vector_names {
         res &= write_segment.delete_vector(op_num, id, name)?;
```

This ensures the patch closure (lines 575-579) handles vector removal via the move path, consistent with the non-appendable proxy design.

Based on learnings
lib/shard/src/segment_holder/mod.rs (1)
1002-1027: Do not drop the temporary CoW segment yet.
remove_segment_if_not_needed still drops the just-created appendable segment when it looks empty. Downstream code (e.g. the optimizer, see the panic reported in the coach/bfb runs) still assumes that segment remains present and hits "temporary CoW segment must exist". Until every caller tolerates its absence we must keep the segment around, otherwise production workloads crash mid-optimization. Please restore the old behaviour by re-adding the segment instead of calling drop_data().

```diff
-        } else {
-            log::trace!("Dropping temporary segment with no changes");
-            tmp_segment.drop_data()?;
-            Ok(true)
-        }
+        } else {
+            log::trace!("Keeping temporary segment with no changes to satisfy optimizer invariants");
+            self.add_existing_locked(segment_id, tmp_segment);
+            Ok(false)
+        }
```

lib/shard/src/proxy_segment/segment_entry.rs (1)
227-235: Restore delete-vector support on proxies.

Turning delete_vector into a hard service_error causes every vector delete routed to a proxy to bubble up as HTTP 500 (see the failing congruence test and manual repro: "Delete vector is disabled for proxy segments: operation 16 on point 50"). During optimization, requests still reach proxy segments before the swap completes, so rejecting them breaks client workflows. Please reinstate functional tombstoning/forwarding here rather than returning an error so deletes remain lossless while a shard is proxied.
306-312: Merge should preserve newest change; don't clobber with older entries

Current merge overwrites blindly; a stale change can replace a newer one across proxies.
Based on learnings
```diff
 pub fn merge(&mut self, other: &Self) {
     for (key, change) in &other.changes {
-        self.changes.insert(key.clone(), change.clone());
+        let replace = self
+            .changes
+            .get(key)
+            .map_or(true, |existing| change.version() >= existing.version());
+        if replace {
+            self.changes.insert(key.clone(), change.clone());
+        }
     }
 }
```
773-821: Add aggregator for proxied vector-level deletes

Mirror proxy_deleted_points/proxy_index_changes with a proxy_deleted_vectors aggregator so both build and finalize paths can apply these operations.

```diff
 fn proxy_index_changes(&self, proxies: &[LockedSegment]) -> ProxyIndexChanges {
     let mut index_changes = ProxyIndexChanges::default();
     for proxy_segment in proxies {
         match proxy_segment {
             LockedSegment::Original(_) => {
                 log::error!("Reading raw segment, while proxy expected");
                 debug_assert!(false, "Reading raw segment, while proxy expected");
             }
             LockedSegment::Proxy(proxy) => {
                 let proxy_read = proxy.read();
                 index_changes.merge(proxy_read.get_index_changes())
             }
         }
     }
     index_changes
 }
+
+/// Accumulates vector-level deletes made in a given set of proxies.
+///
+/// Not synchronized, but includes at least all deletes before the call.
+fn proxy_deleted_vectors(&self, proxies: &[LockedSegment]) -> DeletedVectors {
+    let mut deleted_vectors: DeletedVectors = DeletedVectors::new();
+    for proxy_segment in proxies {
+        match proxy_segment {
+            LockedSegment::Original(_) => {
+                log::error!("Reading raw segment, while proxy expected");
+                debug_assert!(false, "Reading raw segment, while proxy expected");
+            }
+            LockedSegment::Proxy(proxy) => {
+                let proxy_read = proxy.read();
+                for ((point_id, vector_name), versions) in proxy_read.get_deleted_vectors() {
+                    let entry = deleted_vectors
+                        .entry((*point_id, vector_name.clone()))
+                        .or_insert(*versions);
+                    entry.operation_version =
+                        entry.operation_version.max(versions.operation_version);
+                    entry.local_version = entry.local_version.max(versions.local_version);
+                }
+            }
+        }
+    }
+    deleted_vectors
+}
```

Note: requires DeletedVectors/ProxyDeletedVector and ProxySegment::get_deleted_vectors (see related suggestions in proxy_segment).
🧹 Nitpick comments (1)
lib/segment/src/data_types/named_vectors.rs (1)
269-271: Consider returning the removed value.

The method discards the Option<CowVector<'a>> returned by self.map.remove(key). Standard Rust collections like HashMap and BTreeMap return Option<V> from remove operations to indicate whether the key existed and provide the removed value. Returning it here would offer more flexibility to callers.

Apply this diff to return the removed value:

```diff
-pub fn remove_ref(&mut self, key: &VectorName) {
-    self.map.remove(key);
+pub fn remove_ref(&mut self, key: &VectorName) -> Option<CowVector<'a>> {
+    self.map.remove(key)
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (20)
- lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (15 hunks)
- lib/collection/src/collection_manager/tests/mod.rs (10 hunks)
- lib/collection/src/shards/local_shard/snapshot.rs (3 hunks)
- lib/segment/src/data_types/named_vectors.rs (1 hunks)
- lib/segment/src/data_types/tiny_map.rs (1 hunks)
- lib/segment/src/entry/entry_point.rs (1 hunks)
- lib/segment/src/entry/snapshot_entry.rs (0 hunks)
- lib/segment/src/segment/entry.rs (1 hunks)
- lib/segment/src/segment/snapshot.rs (0 hunks)
- lib/segment/src/segment/tests.rs (1 hunks)
- lib/segment/tests/integration/segment_on_disk_snapshot.rs (2 hunks)
- lib/shard/src/locked_segment.rs (1 hunks)
- lib/shard/src/proxy_segment/mod.rs (11 hunks)
- lib/shard/src/proxy_segment/segment_entry.rs (26 hunks)
- lib/shard/src/proxy_segment/snapshot_entry.rs (1 hunks)
- lib/shard/src/proxy_segment/tests.rs (13 hunks)
- lib/shard/src/segment_holder/mod.rs (3 hunks)
- lib/shard/src/segment_holder/snapshot.rs (11 hunks)
- lib/shard/src/segment_holder/tests.rs (4 hunks)
- lib/shard/src/update.rs (3 hunks)
💤 Files with no reviewable changes (2)
- lib/segment/src/segment/snapshot.rs
- lib/segment/src/entry/snapshot_entry.rs
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-01T11:42:06.964Z
Learnt from: timvisee
PR: qdrant/qdrant#7157
File: lib/shard/src/segment_holder/mod.rs:808-814
Timestamp: 2025-09-01T11:42:06.964Z
Learning: In Qdrant's segment holder, panicking when no segments exist during flush_all is intentional and preferred over graceful error handling, as having zero segments could permanently corrupt the WAL by acknowledging u64::MAX. The maintainers consider this condition impossible and use the panic as a fail-fast safety mechanism to prevent data corruption.
Applied to files:
lib/shard/src/segment_holder/snapshot.rs, lib/collection/src/shards/local_shard/snapshot.rs, lib/shard/src/segment_holder/mod.rs, lib/shard/src/proxy_segment/segment_entry.rs
📚 Learning: 2025-04-20T19:08:45.034Z
Learnt from: generall
PR: qdrant/qdrant#6403
File: lib/collection/src/collection_manager/holders/proxy_segment.rs:1178-1197
Timestamp: 2025-04-20T19:08:45.034Z
Learning: In the ProxySegment implementation, overwriting previous index changes when inserting a DeleteIfIncompatible operation is intentional. When an incompatible index is detected, the DeleteIfIncompatible operation should take precedence over any previously scheduled operations on that index.
Applied to files:
lib/collection/src/collection_manager/optimizers/segment_optimizer.rs, lib/shard/src/proxy_segment/mod.rs
🔇 Additional comments (15)
lib/segment/src/data_types/tiny_map.rs (1)
105-118: LGTM! API consistency improvement.

The updated remove method signature now matches the flexible borrowed-key pattern used by get, get_mut, and contains_key. This allows removal using borrowed types (e.g., &str for String keys) while maintaining backward compatibility.

lib/shard/src/update.rs (2)
72-78: LGTM!

The hw_counter parameter addition to the delete_vectors call is consistent with the updated function signature and aligns with the pattern used for other operations in this file.

599-600: LGTM!

The hw_counter parameter propagation is correct and maintains consistency with the updated delete_vectors signature.
230-230: LGTM! Snapshot API simplification.

The removal of the snapshotted_segments parameter aligns with the PR's objective to simplify the proxy architecture by removing per-segment snapshot tracking.

275-275: LGTM! Consistent ID-based segment tracking.

The rename from tmp_segment to tmp_segment_id correctly reflects that proxy_all_segments now returns a segment ID rather than the segment itself, and the subsequent unproxy_all_segments call is updated accordingly.

Also applies to: 341-341
lib/shard/src/segment_holder/tests.rs (2)
530-532: LGTM! Proper LockedSegment usage.

The test now correctly creates a LockedSegment from the segment and uses add_new_locked to add it to the holder, which aligns with the new public API.

576-631: LGTM! Test refactor aligns with non-appendable proxy model.

The refactored test correctly validates the new proxy behavior:
- Lines 576-600: Properly tests point accessibility and deletion through double proxy layers
- Lines 620-622: Correctly expects zero new segments after unproxying (non-appendable proxies don't create extra segments)
- Lines 624-630: Validates point deletion propagation through proxy layers
Based on learnings: The maintainers confirmed the shared write segment behavior no longer exists.
lib/segment/src/segment/entry.rs (1)
42-44: LGTM! Correct trait implementation.

The is_proxy() method correctly returns false for regular Segment implementations, properly implementing the new SegmentEntry trait requirement.
60-65: LGTM! Useful convenience method.

The is_original() helper simplifies checking whether a LockedSegment is the Original variant without requiring pattern matching at call sites.
34-34: LGTM! Trait extension for proxy detection.

The is_proxy() method extends the SegmentEntry trait to support proxy segment detection, enabling implementations to report whether they are proxy wrappers. This is properly implemented across both Segment and ProxySegment types.
1-1: LGTM! Test updated for simplified snapshot API.

The removal of the HashSet import and the updated take_snapshot call signature align with the snapshot API simplification that removes the snapshotted_segments tracking parameter.

Also applies to: 168-168
lib/segment/src/segment/tests.rs (1)
211-211: LGTM! Test updated for simplified snapshot API.

The take_snapshot call signature change removes the snapshotted_segments parameter, consistent with the snapshot API simplification across the codebase.
22-25: LGTM! Simplified proxy snapshot delegation.

The snapshot implementation now correctly delegates entirely to the wrapped_segment, consistent with the PR's objective to make proxy segments non-appendable and remove the separate write segment snapshot path.
614-622: Good: defer creating extra appendable segment until needed

This reduces unnecessary segments for static collections and aligns with prior feedback.
lib/shard/src/proxy_segment/mod.rs (1)
43-69: LGTM: simplified ProxySegment and index replication

Constructor, removal of write_segment, and replicate_field_indexes signature look good. propagate_to_wrapped applies index changes before deletes, with correct lock upgrades.
Also applies to: 73-112, 165-241
```rust
pub type DeletedPoints = AHashMap<PointIdType, ProxyDeletedPoint>;
```
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Track and expose vector-level deletes in ProxySegment
To propagate delete_vector ops during optimization, ProxySegment should record them similar to point deletes.
```diff
 pub type DeletedPoints = AHashMap<PointIdType, ProxyDeletedPoint>;
+pub type DeletedVectors = AHashMap<(PointIdType, PayloadKeyType), ProxyDeletedVector>;

     deleted_mask: Option<BitVec>,
     changed_indexes: ProxyIndexChanges,
     /// Points which should no longer used from wrapped_segment
     /// May contain points which are not in wrapped_segment,
     /// because the set is shared among all proxy segments
     deleted_points: DeletedPoints,
+    /// Vector-level deletions (point_id, vector_name) scheduled against the wrapped segment
+    deleted_vectors: DeletedVectors,
     wrapped_config: SegmentConfig,
     /// Version of the last change in this proxy, considering point deletes and payload index
     /// changes. Defaults to the version of the wrapped segment.
     version: SeqNumberType,

     ProxySegment {
         wrapped_segment: segment,
         deleted_mask,
         changed_indexes: ProxyIndexChanges::default(),
         deleted_points: AHashMap::new(),
+        deleted_vectors: AHashMap::new(),
         wrapped_config,
         version,
     }

 pub fn get_deleted_points(&self) -> &DeletedPoints {
     &self.deleted_points
 }

 pub fn get_index_changes(&self) -> &ProxyIndexChanges {
     &self.changed_indexes
 }
+
+pub fn get_deleted_vectors(&self) -> &DeletedVectors {
+    &self.deleted_vectors
+}
```

Additionally, add:

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ProxyDeletedVector {
    pub local_version: SeqNumberType,
    pub operation_version: SeqNumberType,
}
```
}And ensure delete_vector scheduling (wherever updates are handled) writes into deleted_vectors for proxies. This enables the optimizer to propagate these ops.
Also applies to: 30-40, 61-69, 243-250
🏁 Script executed:
#!/bin/bash
set -e
# Show where delete_vector is rejected for proxies (to wire scheduling instead of hard error)
rg -nP --type=rust -C3 'Delete vector is disabled for proxy segments'

Length of output: 623
Track and expose vector-level deletes in ProxySegment
- Define

  ```rust
  pub type DeletedVectors = AHashMap<(PointIdType, PayloadKeyType), ProxyDeletedVector>;

  #[derive(Clone, Copy, Debug, PartialEq, Eq)]
  pub struct ProxyDeletedVector {
      pub local_version: SeqNumberType,
      pub operation_version: SeqNumberType,
  }
  ```

- Add deleted_vectors: DeletedVectors to ProxySegment, initialize in new(), and expose via

  ```rust
  pub fn get_deleted_vectors(&self) -> &DeletedVectors
  ```

- In lib/shard/src/proxy_segment/segment_entry.rs, replace the hard error in the delete_vector handler with scheduling into deleted_vectors:

  ```diff
  -Err(OperationError::service_error(format!(
  -    "Delete vector is disabled for proxy segments: operation {op_num} on point {point_id}"
  -)))
  +self.deleted_vectors.insert(
  +    (point_id, payload_key),
  +    ProxyDeletedVector { local_version, operation_version },
  +);
  +Ok(true)
  ```
* WIP: make proxy segment non-appendable
* make deleted points and index changes private to Proxy Segment
* consistently remove empty unused segment from holder on unproxifying and the end of optimization
* remove/fix obsolete test
* Fix comment, some general adjustments
* Also don't remove non-original segments
* Use ahash for map of proxy index changes
* Inline format arguments
* Simplify has point branching
* Assert that proxy flushes to latest wrapped segment version
* make vector deletion a CoW operation
* relax debug assertions
* Update lib/shard/src/proxy_segment/mod.rs
* fix test_partial_snapshot_empty

---------

Co-authored-by: timvisee <tim@visee.me>
Co-authored-by: Tim Visée <tim+github@visee.me>
The current architecture assumes that proxy segments are appendable by default and do an internal copy-on-write into a write segment whenever something needs to be changed. This architecture has ancient roots and there are no clear reasons why it was designed like this in the first place, but the whole engine was built around it and it kind of worked, until we tried to integrate snapshots, which required double proxies. That's where it started to fall apart with no clear way of repairing it. So why are appendable proxies bad?
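For contrast, the non-appendable model this PR moves to can be sketched with simplified hypothetical types (not the actual Qdrant structs): reads go to the immutable wrapped segment, and deletes become local tombstones instead of copy-on-write into a write segment.

```rust
use std::collections::{HashMap, HashSet};

type PointId = u64;

// Hypothetical wrapped segment: immutable while under optimization.
struct WrappedSegment {
    points: HashMap<PointId, &'static str>,
}

// Non-appendable proxy: reads are forwarded to the wrapped segment,
// deletes are recorded locally and filtered out on read.
struct TombstoneProxy {
    wrapped: WrappedSegment,
    deleted_points: HashSet<PointId>,
}

impl TombstoneProxy {
    fn delete_point(&mut self, id: PointId) {
        // No copy-on-write: just tombstone the point locally.
        self.deleted_points.insert(id);
    }

    fn get(&self, id: PointId) -> Option<&&'static str> {
        if self.deleted_points.contains(&id) {
            return None; // hidden by the tombstone
        }
        self.wrapped.points.get(&id)
    }
}
```

With no shared write segment, each proxy's state is self-contained, which is what makes stacking proxies (e.g. for snapshots) tractable.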
This PR does: