Skip to content

restore snapshot in edge#7852

Merged
generall merged 10 commits intodevfrom
restore-snapshot-in-edge
Jan 5, 2026
Merged

restore snapshot in edge#7852
generall merged 10 commits intodevfrom
restore-snapshot-in-edge

Conversation

@generall
Copy link
Member

@generall generall commented Jan 3, 2026

  • Restore shard snapshot in Edge python bindings

Main challenge of this PR is to make file operations universal in sync and async versions of the code.
To achieve this, I introduce two extra structures, which are intended to force both implementations to use all fields (files) mentioned in those strucutres

@generall generall requested review from Anush008 and ffuugoo January 3, 2026 00:07
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 3, 2026

📝 Walkthrough

Walkthrough

This pull request refactors shard snapshot and file path management across the Qdrant codebase. It introduces a centralized shard data file utilities module (shard::files) that consolidates path constants and operations for WAL, segments, and clock files. It moves snapshot manifest handling from the segment crate to a new shard::snapshots module, introducing a SnapshotManifest struct with RecoveryType variants and utilities for unpacking, restoring, and planning partial snapshot merges. The segment snapshot API is updated to work with per-segment manifests instead of aggregate snapshots. Local shard file operations are refactored to use the new utilities. Edge-Python bindings are extended with snapshot operations including unpacking, manifest retrieval, and partial snapshot updates. Cross-filesystem-safe move utilities are added to the common/io crate. Import paths throughout the codebase are updated to reference types from their new locations in the shard crate.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • ffuugoo
  • timvisee

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 68.83% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'restore snapshot in edge' clearly refers to the main objective of the PR: adding snapshot restoration functionality to the Edge Python bindings.
Description check ✅ Passed The description accurately relates to the changeset, explaining the main goal of restoring shard snapshots in Edge Python bindings and the approach of introducing structures for universal file operations.
✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
lib/edge/src/lib.rs (1)

34-35: Consider using centralized path constants from shard::files.

The PR refactors path constants to be centralized in shard::files. For consistency with changes in lib/collection/src/shards/local_shard/mod.rs (line 50) and lib/collection/src/shards/local_shard/snapshot.rs (line 13), consider importing WAL_PATH and SEGMENTS_PATH from shard::files rather than defining them locally.

🔎 Proposed refactor
+use shard::files::{WAL_PATH, SEGMENTS_PATH};
+
 #[derive(Debug)]
 pub struct Shard {
     _path: PathBuf,
@@ -31,9 +32,6 @@
     segments: LockedSegmentHolder,
 }
 
-const WAL_PATH: &str = "wal";
-const SEGMENTS_PATH: &str = "segments";
-
 impl Shard {
lib/edge/python/examples/restore-snapshot.py (2)

7-7: Consider explicit imports for better code clarity.

The wildcard import makes it less clear which symbols come from qdrant_edge. For better maintainability, consider importing only the required names:

-from qdrant_edge import *
+from qdrant_edge import Shard

32-36: Consider adding path validation before directory removal.

While the current implementation is safe with controlled inputs, adding a basic sanity check before shutil.rmtree would make the script more defensive:

recovered_path = os.path.join(DATA_DIRECTORY, "restored_shard")
print(f"Restoring shard from snapshot to: {recovered_path}")
if os.path.exists(recovered_path):
    # Sanity check: ensure we're removing a subdirectory of DATA_DIRECTORY
    if not recovered_path.startswith(os.path.abspath(DATA_DIRECTORY)):
        raise ValueError(f"Invalid recovery path: {recovered_path}")
    print("Removing existing recovered shard directory...")
    shutil.rmtree(recovered_path)
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c0b70f1 and 11821cb.

📒 Files selected for processing (11)
  • lib/collection/src/shards/local_shard/mod.rs
  • lib/collection/src/shards/local_shard/snapshot.rs
  • lib/edge/python/examples/common.py
  • lib/edge/python/examples/restore-snapshot.py
  • lib/edge/python/src/lib.rs
  • lib/edge/src/lib.rs
  • lib/edge/src/snapshots.rs
  • lib/shard/Cargo.toml
  • lib/shard/src/files/mod.rs
  • lib/shard/src/lib.rs
  • lib/shard/src/segment_holder/snapshot.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust
In new code, don't use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice - use bytemuck or zerocopy crates instead
Prefer explicit exhaustive matches over catch-all _ arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
Prefer explicit field ignoring using : _ over using .. in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Files:

  • lib/edge/python/src/lib.rs
  • lib/shard/src/lib.rs
  • lib/collection/src/shards/local_shard/snapshot.rs
  • lib/edge/src/snapshots.rs
  • lib/edge/src/lib.rs
  • lib/shard/src/segment_holder/snapshot.rs
  • lib/shard/src/files/mod.rs
  • lib/collection/src/shards/local_shard/mod.rs
🧠 Learnings (5)
📚 Learning: 2025-03-03T15:54:45.553Z
Learnt from: generall
Repo: qdrant/qdrant PR: 6088
File: lib/segment/src/index/field_index/mod.rs:18-18
Timestamp: 2025-03-03T15:54:45.553Z
Learning: In the Qdrant codebase, modules can be organized as directories with their own `mod.rs` file, rather than single files, following standard Rust module organization patterns.

Applied to files:

  • lib/shard/src/lib.rs
📚 Learning: 2025-10-13T22:58:03.121Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7400
File: lib/segment/src/id_tracker/simple_id_tracker.rs:234-241
Timestamp: 2025-10-13T22:58:03.121Z
Learning: SimpleIdTracker in lib/segment/src/id_tracker/simple_id_tracker.rs is being deprecated and should not receive fixes related to version tracking or recovery logic, as it has a different version storage structure that is incompatible with newer trackers.

Applied to files:

  • lib/collection/src/shards/local_shard/snapshot.rs
  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-09-01T11:42:06.964Z
Learnt from: timvisee
Repo: qdrant/qdrant PR: 7157
File: lib/shard/src/segment_holder/mod.rs:808-814
Timestamp: 2025-09-01T11:42:06.964Z
Learning: In Qdrant's segment holder, panicking when no segments exist during flush_all is intentional and preferred over graceful error handling, as having zero segments could permanently corrupt the WAL by acknowledging u64::MAX. The maintainers consider this condition impossible and use the panic as a fail-fast safety mechanism to prevent data corruption.

Applied to files:

  • lib/collection/src/shards/local_shard/snapshot.rs
  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-08-15T11:42:00.297Z
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.

Applied to files:

  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

  • lib/collection/src/shards/local_shard/mod.rs
🧬 Code graph analysis (7)
lib/edge/python/src/lib.rs (2)
lib/edge/src/snapshots.rs (1)
  • unpack_snapshot (9-11)
lib/shard/src/segment_holder/snapshot.rs (1)
  • unpack_snapshot (40-49)
lib/collection/src/shards/local_shard/snapshot.rs (2)
lib/segment/src/segment/snapshot.rs (1)
  • files (211-229)
lib/shard/src/segment_holder/snapshot.rs (1)
  • restore_unpacked_snapshot (54-78)
lib/edge/src/snapshots.rs (2)
lib/shard/src/segment_holder/snapshot.rs (1)
  • unpack_snapshot (40-49)
lib/edge/python/src/lib.rs (1)
  • unpack_snapshot (121-124)
lib/shard/src/segment_holder/snapshot.rs (3)
lib/segment/src/common/validate_snapshot_archive.rs (2)
  • validate_snapshot_archive (15-48)
  • open_snapshot_archive (50-65)
lib/segment/src/segment/snapshot.rs (1)
  • files (211-229)
lib/shard/src/files/mod.rs (1)
  • segments_path (9-11)
lib/shard/src/files/mod.rs (1)
lib/collection/src/shards/local_shard/mod.rs (2)
  • segments_path (470-472)
  • shard_path (462-464)
lib/collection/src/shards/local_shard/mod.rs (1)
lib/shard/src/files/mod.rs (1)
  • segments_path (9-11)
lib/edge/python/examples/restore-snapshot.py (3)
lib/edge/src/snapshots.rs (1)
  • unpack_snapshot (9-11)
lib/shard/src/segment_holder/snapshot.rs (1)
  • unpack_snapshot (40-49)
lib/edge/python/src/lib.rs (2)
  • unpack_snapshot (121-124)
  • retrieve (102-116)
🪛 Ruff (0.14.10)
lib/edge/python/examples/restore-snapshot.py

7-7: from qdrant_edge import * used; unable to detect undefined names

(F403)


22-22: Probable use of requests call without timeout

(S113)


38-38: Shard may be undefined, or defined from star imports

(F405)


40-40: Shard may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Test Python bindings
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-consistency
  • GitHub Check: e2e-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: lint
🔇 Additional comments (13)
lib/shard/Cargo.toml (1)

44-44: LGTM!

Adding fs-err to dependencies is appropriate for better filesystem error reporting in the new snapshot restoration functionality.

lib/shard/src/lib.rs (1)

1-1: LGTM!

Exposing the files module publicly enables other crates to use the centralized path constants and helpers, supporting better code organization.

lib/edge/python/examples/common.py (1)

7-7: LGTM!

Defining DATA_DIRECTORY at module level improves reusability across example scripts and supports the new snapshot restoration workflow.

lib/shard/src/files/mod.rs (1)

1-11: LGTM!

This module effectively centralizes shard path constants that were previously duplicated across the codebase. The segments_path helper provides a clean API for path construction, and the #[inline] attribute is appropriate for this trivial function.

lib/shard/src/segment_holder/snapshot.rs (1)

36-49: LGTM!

The unpack_snapshot function correctly implements the two-step restoration flow: unpacking the archive and then restoring segment internals. Error handling is appropriate, and the use of open_snapshot_archive ensures safe handling with validation and overwrite protection.

lib/edge/src/lib.rs (1)

5-5: LGTM!

The private module declaration appropriately integrates the new snapshot functionality into the edge crate.

lib/edge/python/src/lib.rs (1)

117-124: LGTM!

The Python binding correctly exposes the snapshot unpacking functionality. The implementation properly delegates to the underlying Rust API with appropriate type conversions and error handling.

lib/collection/src/shards/local_shard/snapshot.rs (2)

13-13: LGTM!

The import changes align with the centralization of path constants into shard::files, improving maintainability.


38-42: Verify the snapshot path semantics.

The function delegates to SegmentHolder::restore_unpacked_snapshot, which suggests it expects an already-unpacked snapshot directory. However, the function name is restore_snapshot and the parameter is snapshot_path, which could be ambiguous. Ensure the documentation clarifies whether snapshot_path refers to:

  • An unpacked snapshot directory (if so, consider renaming the parameter to unpacked_snapshot_path)
  • An archive file that needs unpacking first

Based on the code in lib/shard/src/segment_holder/snapshot.rs, restore_unpacked_snapshot expects an unpacked directory, so the parameter naming here should reflect that.

🔎 Optional naming improvement
-    pub fn restore_snapshot(snapshot_path: &Path) -> CollectionResult<()> {
+    pub fn restore_snapshot(unpacked_snapshot_path: &Path) -> CollectionResult<()> {
         log::info!("Restoring shard snapshot {}", snapshot_path.display());
-        SegmentHolder::restore_unpacked_snapshot(snapshot_path)?;
+        SegmentHolder::restore_unpacked_snapshot(unpacked_snapshot_path)?;
         Ok(())
     }
lib/edge/src/snapshots.rs (1)

8-12: LGTM!

The implementation provides a clean Shard-level API for snapshot unpacking by delegating to SegmentHolder::unpack_snapshot. The function correctly handles archive extraction and restoration in a single call.

lib/collection/src/shards/local_shard/mod.rs (2)

50-50: LGTM!

The import of centralized path constants from shard::files improves code maintainability and reduces duplication across the codebase.


470-472: LGTM!

The delegation to shard::files::segments_path centralizes path construction logic while preserving the public API. This improves maintainability by ensuring consistent path handling across the codebase.

lib/edge/python/examples/restore-snapshot.py (1)

38-45: LGTM! Snapshot restoration and point retrieval workflow is correct.

The sequence of unpacking the snapshot, instantiating the shard, and retrieving points with specified parameters correctly demonstrates the Python Edge API usage.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
lib/edge/python/examples/restore-snapshot.py (1)

22-22: Add timeout to prevent indefinite hangs.

The requests.get call lacks a timeout parameter, which can cause the script to hang indefinitely if the server is unresponsive.

lib/shard/src/segment_holder/snapshot.rs (1)

52-76: Hidden file filtering is Unix-specific.

The dotfile convention (s.starts_with('.')) only detects hidden files on Unix systems.

🧹 Nitpick comments (5)
lib/edge/python/src/snapshots.rs (1)

53-63: Partial snapshot recovery is incomplete.

The todo!() at line 63 will panic at runtime if a partial snapshot recovery is attempted. The error handling to restore the shard on manifest retrieval failure is well-implemented.

Would you like me to help implement the partial recovery logic, or should a tracking issue be opened to complete this functionality?

lib/edge/python/examples/restore-snapshot.py (1)

7-7: Consider explicit import for clarity.

The star import makes it unclear which symbols come from qdrant_edge. For example scripts that serve as documentation, explicit imports improve readability.

-from qdrant_edge import *
+from qdrant_edge import Shard
lib/common/io/src/move_files.rs (1)

60-68: Doc comment is inconsistent with implementation.

The doc says "Print a warning if the file can't be removed," but the function doesn't print any warning—it returns an error instead.

🔎 Suggested fix
-/// Remove the file if it exists. Print a warning if the file can't be removed.
+/// Remove the file if it exists. Returns error if removal fails (except for NotFound).
 fn cleanup_file(path: &Path) -> io::Result<()> {
lib/shard/src/segment_holder/snapshot_manifest.rs (1)

142-158: Consider logging when a manifest is skipped due to lower version.

When add() returns false because the existing manifest has a higher version, this is silently ignored. For debugging purposes, a log::debug! might be helpful to trace manifest merging behavior.

lib/segment/src/segment/snapshot.rs (1)

116-118: Consider documenting why _get_segment_manifest is separate.

The public get_segment_manifest delegates to the private _get_segment_manifest. If this is to allow future extension or testing, a brief comment would clarify the design intent.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11821cb and f7bce19.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (35)
  • Cargo.toml
  • lib/collection/Cargo.toml
  • lib/collection/src/collection/snapshots.rs
  • lib/collection/src/shards/dummy_shard.rs
  • lib/collection/src/shards/forward_proxy_shard.rs
  • lib/collection/src/shards/local_shard/mod.rs
  • lib/collection/src/shards/local_shard/snapshot.rs
  • lib/collection/src/shards/proxy_shard.rs
  • lib/collection/src/shards/queue_proxy_shard.rs
  • lib/collection/src/shards/replica_set/snapshots.rs
  • lib/collection/src/shards/shard.rs
  • lib/collection/src/shards/shard_holder/mod.rs
  • lib/common/io/Cargo.toml
  • lib/common/io/src/lib.rs
  • lib/common/io/src/move_files.rs
  • lib/edge/python/Cargo.toml
  • lib/edge/python/examples/restore-snapshot.py
  • lib/edge/python/src/lib.rs
  • lib/edge/python/src/snapshots.rs
  • lib/edge/python/src/types/value.rs
  • lib/edge/python/src/utils.rs
  • lib/edge/src/lib.rs
  • lib/edge/src/snapshots.rs
  • lib/segment/Cargo.toml
  • lib/segment/src/data_types/manifest.rs
  • lib/segment/src/entry/snapshot_entry.rs
  • lib/segment/src/segment/snapshot.rs
  • lib/shard/src/files/mod.rs
  • lib/shard/src/proxy_segment/snapshot_entry.rs
  • lib/shard/src/segment_holder/mod.rs
  • lib/shard/src/segment_holder/snapshot.rs
  • lib/shard/src/segment_holder/snapshot_manifest.rs
  • lib/storage/src/content_manager/snapshots/recover.rs
  • src/actix/api/snapshot_api.rs
  • src/common/snapshots.rs
💤 Files with no reviewable changes (1)
  • lib/segment/src/data_types/manifest.rs
✅ Files skipped from review due to trivial changes (1)
  • lib/collection/src/shards/forward_proxy_shard.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/collection/src/shards/local_shard/snapshot.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust
In new code, don't use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice - use bytemuck or zerocopy crates instead
Prefer explicit exhaustive matches over catch-all _ arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
Prefer explicit field ignoring using : _ over using .. in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Files:

  • lib/shard/src/segment_holder/mod.rs
  • lib/segment/src/entry/snapshot_entry.rs
  • lib/edge/python/src/types/value.rs
  • lib/edge/src/snapshots.rs
  • lib/edge/src/lib.rs
  • lib/storage/src/content_manager/snapshots/recover.rs
  • lib/edge/python/src/snapshots.rs
  • lib/shard/src/proxy_segment/snapshot_entry.rs
  • lib/common/io/src/lib.rs
  • lib/common/io/src/move_files.rs
  • lib/edge/python/src/utils.rs
  • lib/shard/src/segment_holder/snapshot.rs
  • src/common/snapshots.rs
  • src/actix/api/snapshot_api.rs
  • lib/collection/src/shards/shard_holder/mod.rs
  • lib/collection/src/collection/snapshots.rs
  • lib/collection/src/shards/proxy_shard.rs
  • lib/collection/src/shards/queue_proxy_shard.rs
  • lib/collection/src/shards/local_shard/mod.rs
  • lib/collection/src/shards/dummy_shard.rs
  • lib/edge/python/src/lib.rs
  • lib/shard/src/segment_holder/snapshot_manifest.rs
  • lib/shard/src/files/mod.rs
  • lib/segment/src/segment/snapshot.rs
  • lib/collection/src/shards/replica_set/snapshots.rs
  • lib/collection/src/shards/shard.rs
🧠 Learnings (8)
📚 Learning: 2025-10-29T11:13:46.074Z
Learnt from: ffuugoo
Repo: qdrant/qdrant PR: 7465
File: lib/edge/python/src/types/value.rs:30-33
Timestamp: 2025-10-29T11:13:46.074Z
Learning: In lib/edge/python/src/types/value.rs, the HashMap transmute pattern in `PyValue::into_rust_map` that converts `HashMap<String, PyValue>` to `HashMap<String, serde_json::Value>` using unsafe transmute is acceptable and should not be flagged, as PyValue has transparent representation over serde_json::Value.

Applied to files:

  • lib/edge/python/src/types/value.rs
📚 Learning: 2025-10-13T22:58:03.121Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7400
File: lib/segment/src/id_tracker/simple_id_tracker.rs:234-241
Timestamp: 2025-10-13T22:58:03.121Z
Learning: SimpleIdTracker in lib/segment/src/id_tracker/simple_id_tracker.rs is being deprecated and should not receive fixes related to version tracking or recovery logic, as it has a different version storage structure that is incompatible with newer trackers.

Applied to files:

  • lib/storage/src/content_manager/snapshots/recover.rs
  • src/common/snapshots.rs
  • lib/collection/src/shards/shard_holder/mod.rs
  • lib/collection/src/collection/snapshots.rs
  • lib/collection/src/shards/dummy_shard.rs
  • lib/segment/src/segment/snapshot.rs
  • lib/collection/src/shards/shard.rs
📚 Learning: 2025-11-02T19:16:42.527Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7474
File: lib/collection/src/shards/transfer/stream_records.rs:46-55
Timestamp: 2025-11-02T19:16:42.527Z
Learning: In the Qdrant repository, do not flag compilation errors such as partial-move errors, type errors, or any syntax issues that would be caught by the Rust compiler during the build process.

Applied to files:

  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-04-22T23:17:41.734Z
Learnt from: xzfc
Repo: qdrant/qdrant PR: 6245
File: lib/segment/src/index/hnsw_index/point_scorer.rs:116-121
Timestamp: 2025-04-22T23:17:41.734Z
Learning: The method `is_none_or` exists in the Qdrant codebase and compiles correctly, despite not being part of standard Rust's Option type. Code reviews should verify compilation issues before reporting them.

Applied to files:

  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-11-25T11:56:33.176Z
Learnt from: CR
Repo: qdrant/qdrant PR: 0
File: .github/review-rules.md:0-0
Timestamp: 2025-11-25T11:56:33.176Z
Learning: Applies to **/*.rs : Prefer explicit field ignoring using `: _` over using `..` in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Applied to files:

  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-07-11T11:35:21.549Z
Learnt from: generall
Repo: qdrant/qdrant PR: 6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using `id.to_string().parse().unwrap()` is acceptable for testing purposes and doesn't need to match production code's `id_tracker.internal_id()` approach. Test code can use mock implementations that serve the testing goals.

Applied to files:

  • lib/collection/src/shards/proxy_shard.rs
  • lib/segment/src/segment/snapshot.rs
  • lib/collection/src/shards/shard.rs
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

  • lib/collection/src/shards/local_shard/mod.rs
📚 Learning: 2025-09-01T11:42:06.964Z
Learnt from: timvisee
Repo: qdrant/qdrant PR: 7157
File: lib/shard/src/segment_holder/mod.rs:808-814
Timestamp: 2025-09-01T11:42:06.964Z
Learning: In Qdrant's segment holder, panicking when no segments exist during flush_all is intentional and preferred over graceful error handling, as having zero segments could permanently corrupt the WAL by acknowledging u64::MAX. The maintainers consider this condition impossible and use the panic as a fail-fast safety mechanism to prevent data corruption.

Applied to files:

  • lib/collection/src/shards/shard.rs
🧬 Code graph analysis (17)
lib/segment/src/entry/snapshot_entry.rs (2)
lib/segment/src/segment/snapshot.rs (2)
  • segment_id (30-49)
  • get_segment_manifest (116-118)
lib/shard/src/proxy_segment/snapshot_entry.rs (2)
  • segment_id (12-14)
  • get_segment_manifest (34-36)
lib/edge/src/snapshots.rs (3)
lib/edge/src/lib.rs (1)
  • path (178-180)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/shard/src/segment_holder/snapshot.rs (2)
  • snapshot_manifest (23-32)
  • unpack_snapshot (38-47)
lib/storage/src/content_manager/snapshots/recover.rs (4)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (29-38)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (23-32)
lib/edge/python/src/snapshots.rs (6)
lib/edge/src/lib.rs (2)
  • path (178-180)
  • load (38-172)
lib/shard/src/files/mod.rs (2)
  • clear_data (57-83)
  • move_data (85-120)
lib/edge/python/src/lib.rs (3)
  • snapshot_manifest (128-131)
  • unpack_snapshot (123-126)
  • load (82-85)
lib/edge/src/snapshots.rs (2)
  • snapshot_manifest (14-16)
  • unpack_snapshot (10-12)
lib/shard/src/segment_holder/snapshot.rs (2)
  • snapshot_manifest (23-32)
  • unpack_snapshot (38-47)
lib/shard/src/segment_holder/snapshot_manifest.rs (1)
  • load_from_snapshot (24-117)
lib/shard/src/proxy_segment/snapshot_entry.rs (2)
lib/segment/src/entry/snapshot_entry.rs (3)
  • segment_id (11-11)
  • take_snapshot (17-23)
  • get_segment_manifest (25-25)
lib/segment/src/segment/snapshot.rs (3)
  • segment_id (30-49)
  • take_snapshot (51-114)
  • get_segment_manifest (116-118)
lib/edge/python/src/utils.rs (1)
lib/edge/python/src/lib.rs (2)
  • from (150-152)
  • from (156-158)
lib/shard/src/segment_holder/snapshot.rs (2)
lib/segment/src/common/validate_snapshot_archive.rs (2)
  • validate_snapshot_archive (15-48)
  • open_snapshot_archive (50-65)
lib/shard/src/files/mod.rs (1)
  • segments_path (21-23)
src/common/snapshots.rs (3)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (14-16)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (23-32)
src/actix/api/snapshot_api.rs (5)
lib/edge/python/src/lib.rs (1)
  • snapshot_manifest (128-131)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (29-38)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (23-32)
lib/collection/src/shards/shard_holder/mod.rs (5)
lib/edge/python/src/lib.rs (1)
  • snapshot_manifest (128-131)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (23-32)
lib/collection/src/shards/proxy_shard.rs (7)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (14-16)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (29-38)
lib/collection/src/shards/queue_proxy_shard.rs (1)
  • snapshot_manifest (157-162)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (23-32)
lib/collection/src/shards/queue_proxy_shard.rs (8)
lib/edge/python/src/lib.rs (1)
  • snapshot_manifest (128-131)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (14-16)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (29-38)
lib/collection/src/shards/proxy_shard.rs (1)
  • snapshot_manifest (90-92)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (23-32)
lib/collection/src/shards/local_shard/mod.rs (1)
lib/shard/src/files/mod.rs (6)
  • get_shard_data_files (41-48)
  • check_data (51-55)
  • wal_path (26-28)
  • segments_path (21-23)
  • newest_clocks_path (31-33)
  • oldest_clocks_path (36-38)
lib/collection/src/shards/dummy_shard.rs (3)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (23-32)
lib/shard/src/files/mod.rs (2)
lib/shard/src/segment_holder/snapshot.rs (1)
  • fs (54-54)
lib/common/io/src/move_files.rs (2)
  • move_dir (10-32)
  • move_file (36-58)
lib/collection/src/shards/replica_set/snapshots.rs (1)
lib/shard/src/segment_holder/snapshot_manifest.rs (1)
  • load_from_snapshot (24-117)
lib/collection/src/shards/shard.rs (4)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (14-16)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (29-38)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (23-32)
🪛 Ruff (0.14.10)
lib/edge/python/examples/restore-snapshot.py

7-7: from qdrant_edge import * used; unable to detect undefined names

(F403)


22-22: Probable use of requests call without timeout

(S113)


38-38: Shard may be undefined, or defined from star imports

(F405)


40-40: Shard may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: integration-tests
  • GitHub Check: e2e-tests
  • GitHub Check: test-consistency
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: lint
  • GitHub Check: Test Python bindings
🔇 Additional comments (50)
lib/edge/python/src/types/value.rs (1)

25-27: LGTM! Clean constructor for wrapping JSON values.

This constructor complements the existing Into<serde_json::Value> trait and provides an explicit way to construct PyValue from deserialized JSON, which aligns well with the snapshot restoration use case described in the PR.

lib/collection/Cargo.toml (1)

85-85: LGTM: Workspace dependency migration.

The migration of fs_extra to workspace-managed dependency promotes version consistency across the workspace.

lib/edge/python/Cargo.toml (1)

35-35: LGTM: Appropriate dependency for snapshot operations.

The tempfile dependency addition aligns with the PR objective of implementing snapshot restoration in Edge Python bindings.

lib/edge/src/lib.rs (2)

5-5: LGTM: Private module for snapshot support.

The addition of a private snapshots module appropriately encapsulates snapshot-related functionality without expanding the public API surface.


28-28: LGTM: Path field now exposed via accessor.

The changes correctly expose the shard's path through a public accessor:

  • Field renamed from _path to path (indicating active use)
  • Initialization updated accordingly
  • Standard Rust accessor pattern implemented

This API addition is useful for snapshot operations and diagnostics.

Also applies to: 165-165, 177-180

lib/segment/Cargo.toml (1)

84-84: LGTM: Workspace dependency migration.

Consistent with the workspace-level dependency management pattern established in this PR.

Cargo.toml (1)

221-221: fs_extra 1.3.0 is the current stable release.

Version 1.3.0 is the latest stable release on crates.io (published 2023-02-03) with no known security advisories.

lib/shard/src/files/mod.rs (2)

1-120: LGTM! Clean centralization of shard data-file path handling.

This new module provides a well-structured approach to managing shard data files by consolidating path resolution, existence checks, and move/clear operations. The defensive existence checks before file operations (lines 66-80, 103-117) are good practice, and error handling is appropriate.


51-55: Note: check_data only verifies WAL and segments, not clock files.

This function checks for the existence of WAL and segments directories but not the clock files (newest_clocks_path and oldest_clocks_path). This is likely intentional since clock files are optional, but ensure this behavior aligns with your data validation requirements.

lib/collection/src/shards/local_shard/mod.rs (5)

141-169: LGTM! Clean refactor to use centralized shard data-file helpers.

The move_data function now properly delegates to shard::files::get_shard_data_files and includes conditional moves for clock files (lines 161-167). The explicit field destructuring follows coding guidelines.


174-174: Good delegation to centralized helper.

The check_data function now delegates to shard::files::check_data, maintaining consistency with the new centralized path handling.


181-207: LGTM! Clean refactor with proper field destructuring.

The clear function now uses shard::files::get_shard_data_files with explicit field destructuring that follows coding guidelines. The conditional removal of clock files (lines 198-204) is defensive and correct.


493-499: Good delegation to shard::files helpers.

Both wal_path and segments_path now properly delegate to the centralized path resolution functions, improving maintainability.


1163-1169: LGTM! Clock path helpers properly delegate to shard::files.

The newest_clocks_path and oldest_clocks_path functions now use the centralized helpers from shard::files, maintaining consistency across the codebase.

lib/common/io/src/lib.rs (1)

2-2: LGTM! Clean addition of the move_files module.

The new move_files module is properly exposed as public, enabling cross-filesystem-safe file and directory moves throughout the codebase.

lib/shard/src/segment_holder/mod.rs (1)

3-3: LGTM! Clean addition of the snapshot_manifest module.

The new snapshot_manifest module is properly exposed as public, supporting the broader refactor to centralize snapshot manifest handling at the shard level.

src/common/snapshots.rs (1)

13-13: LGTM! Import path correctly updated to new module location.

The import of RecoveryType and SnapshotManifest has been properly updated to reflect the new module structure at shard::segment_holder::snapshot_manifest, aligning with the broader refactor to centralize snapshot manifest handling.

lib/common/io/Cargo.toml (1)

19-19: LGTM!

The addition of fs_extra as a workspace dependency is appropriate for supporting cross-filesystem file operations.

lib/collection/src/shards/dummy_shard.rs (1)

17-17: LGTM!

The import path update for SnapshotManifest is consistent with the broader refactor to consolidate snapshot manifest handling in the shard crate. The usage remains unchanged.

lib/collection/src/shards/shard.rs (1)

13-13: LGTM!

The import path update for SnapshotManifest aligns with the refactor to centralize snapshot-related types in the shard::segment_holder::snapshot_manifest module.

lib/storage/src/content_manager/snapshots/recover.rs (1)

14-14: LGTM!

The import path update for RecoveryType is consistent with the refactor to consolidate snapshot manifest functionality in the shard crate.

lib/edge/src/snapshots.rs (1)

1-17: LGTM!

The snapshot functionality for Edge bindings is well-implemented:

  • unpack_snapshot properly delegates to SegmentHolder::unpack_snapshot for extracting snapshot archives
  • snapshot_manifest correctly uses a read lock and delegates to the segments holder
  • Both methods use appropriate error handling via OperationResult
  • The implementation follows clean delegation patterns without duplicating logic

This successfully exposes the necessary snapshot operations to the Edge Python bindings as intended.

lib/collection/src/collection/snapshots.rs (1)

11-11: LGTM! Import path update aligns with manifest refactoring.

The import path change for SnapshotManifest and RecoveryType to shard::segment_holder::snapshot_manifest is consistent with the broader refactoring to centralize manifest handling in the shard module.

lib/collection/src/shards/shard_holder/mod.rs (1)

25-25: LGTM! Import path update is consistent with the refactoring.

The updated import path for SnapshotManifest and RecoveryType aligns with the centralized manifest handling architecture introduced in this PR.

lib/collection/src/shards/proxy_shard.rs (1)

20-20: LGTM! Import path update is correct.

The import path change for SnapshotManifest is consistent with the module reorganization across the codebase.

lib/edge/python/src/utils.rs (1)

1-16: LGTM! Well-designed accessor method with appropriate error handling.

The get_shard() method provides a clean, centralized way to access the inner Shard from Python bindings with proper error handling. The implementation correctly:

  • Returns a reference for temporary borrowing
  • Provides a clear error message when the shard is not initialized
  • Uses appropriate error conversion through PyError

This improves the consistency of shard access across Python binding methods.

lib/collection/src/shards/queue_proxy_shard.rs (1)

20-20: LGTM! Import path update is correct.

The import path change for SnapshotManifest is consistent with the module reorganization to centralize snapshot manifest handling in the shard module.

src/actix/api/snapshot_api.rs (1)

22-22: LGTM!

The import path update correctly consolidates SnapshotManifest and RecoveryType under the new shard::segment_holder::snapshot_manifest module, aligning with the broader refactor in this PR.

lib/edge/python/src/snapshots.rs (2)

18-24: LGTM!

Good fallback chain for determining the temporary directory: user-provided → snapshot parent → system temp. This provides flexibility while ensuring a sensible default.


41-51: LGTM!

The full recovery flow correctly sequences operations: drop the existing shard to release resources, clear old data, move unpacked snapshot data into place, and reload the shard.

lib/segment/src/entry/snapshot_entry.rs (1)

6-6: LGTM!

The trait refactoring appropriately shifts to per-segment manifest handling:

  • SegmentManifest import replaces SnapshotManifest
  • New segment_id() method for segment identification
  • Updated take_snapshot signature with Option<&SegmentManifest>
  • New get_segment_manifest() accessor for retrieving per-segment manifests

This aligns well with the centralized manifest handling in shard::segment_holder::snapshot_manifest.

Also applies to: 10-11, 22-22, 25-25

lib/shard/src/proxy_segment/snapshot_entry.rs (1)

5-5: LGTM!

The ProxySegment implementation correctly delegates all new and updated SnapshotEntry trait methods to the wrapped segment:

  • segment_id() delegation
  • Updated take_snapshot with SegmentManifest parameter
  • get_segment_manifest() delegation

The proxy pattern is maintained consistently.

Also applies to: 12-14, 21-21, 34-36

lib/edge/python/examples/restore-snapshot.py (1)

38-47: LGTM!

The example script clearly demonstrates the snapshot restoration workflow: unpack the snapshot, instantiate a shard from the recovered path, and retrieve points. This serves as good documentation for the new Python bindings.

lib/collection/src/shards/replica_set/snapshots.rs (2)

5-7: LGTM!

Import updates correctly reference the new module structure for RecoveryType and SnapshotManifest.


104-105: LGTM!

The manifest loading logic is now cleanly delegated to SnapshotManifest::load_from_snapshot, which centralizes the segment iteration, manifest validation, and aggregation logic. This significantly reduces code duplication and improves maintainability.

lib/shard/src/segment_holder/snapshot.rs (3)

6-6: LGTM!

Import updates correctly bring in the required modules for the new snapshot unpacking and restoration functionality.

Also applies to: 10-10, 12-12, 15-15, 19-19


23-32: LGTM!

The snapshot_manifest method now correctly uses the per-segment manifest API: calling get_segment_manifest() on each segment and accumulating via manifest.add(). This aligns with the new manifest handling architecture.


34-47: LGTM!

The unpack_snapshot method correctly implements the unpacking workflow:

  1. Creates target directory if needed
  2. Opens the snapshot archive with proper error handling
  3. Unpacks the archive
  4. Delegates to restore_unpacked_snapshot for segment restoration

The use of open_snapshot_archive from the segment crate ensures consistent archive handling.

lib/common/io/src/move_files.rs (2)

10-32: LGTM! Cross-filesystem directory move implementation looks correct.

The implementation correctly tries a fast rename first and falls back to fs_extra::dir::move_dir with content_only and overwrite options, which handles cross-filesystem moves properly.


36-58: Consider whether partial copy should be cleaned up on remove_file failure.

On line 52-55, when fs::remove_file(from) fails after a successful copy, the code cleans up the destination file. This means a successful copy is deleted if the source can't be removed. Depending on the use case, it might be safer to leave the destination intact and return the error, since the data has been successfully copied.

However, if the intent is to ensure atomic-like behavior (either the move fully succeeds or nothing changes), the current approach is reasonable.

lib/shard/src/segment_holder/snapshot_manifest.rs (3)

58-84: Recovery type validation logic is well-structured.

The match arms correctly enforce:

  • None: Tolerates missing manifests (backward compatibility)
  • Full: Rejects manifests (shard snapshots shouldn't have segment manifests)
  • Partial: Requires manifests (partial snapshots must have them)

Clear error messages guide users to the correct endpoint.


119-134: Validation logic looks correct.

The check ensures no file version exceeds its segment version, which would indicate data corruption or an invalid manifest.


177-191: LGTM! Clean enum with helper methods.

lib/segment/src/segment/snapshot.rs (5)

26-28: LGTM! Clear constant for manifest filename.


30-49: Good use of debug_assert for UUID validation.

The assertion validates the segment ID format in debug builds without impacting release performance. The error handling for missing/invalid path components is appropriate.


62-91: Clean refactor of include logic with Option handling.

The include_files_opt pattern cleanly separates the manifest-present case (include only updated files) from the no-manifest case (include everything). The closure correctly defaults to true when no manifest is provided.


390-425: File comparison logic for partial snapshots is well-reasoned.

The three inclusion criteria are clearly documented:

  1. New files not in old manifest
  2. Files with newer versions
  3. Both at version 0 (can't distinguish new empty vs. operation 0)

The edge case at version 0 is a pragmatic workaround for the ambiguity.


202-206: Direct assignment of segment_id is cleaner.

Per the coding guidelines preferring explicit conversions, the direct segment_id assignment (without .into()) aligns with the project conventions.

lib/edge/python/src/lib.rs (2)

133-141: Implementation verified and error handling is correct.

The _update_from_snapshot implementation properly handles the case when the shard is already taken. It uses mem::take to take ownership (line 35), checks if the shard is initialized, and returns an appropriate error ("Shard is not initialized") if the shard is None (lines 35-37). Error paths also correctly restore the shard before returning errors (lines 56-58).


77-84: The get_shard() helper properly returns an error when shard is None.

The implementation in utils.rs correctly checks if the wrapped shard exists and returns Err(PyError::from(OperationError::service_error("Shard is not initialized"))) if None. All calling methods properly use the ? operator to propagate this error. No action needed here.

However, line 130 has a minor issue: serde_json::to_value(&manifest).unwrap() can panic if serialization fails. Use ? operator instead:

let manifest = self.get_shard()?.snapshot_manifest()?;
Ok(PyValue::new(serde_json::to_value(&manifest)?))
⛔ Skipped due to learnings
Learnt from: ffuugoo
Repo: qdrant/qdrant PR: 7465
File: lib/edge/python/src/types/value.rs:30-33
Timestamp: 2025-10-29T11:13:46.074Z
Learning: In lib/edge/python/src/types/value.rs, the HashMap transmute pattern in `PyValue::into_rust_map` that converts `HashMap<String, PyValue>` to `HashMap<String, serde_json::Value>` using unsafe transmute is acceptable and should not be flagged, as PyValue has transparent representation over serde_json::Value.
Learnt from: xzfc
Repo: qdrant/qdrant PR: 6245
File: lib/segment/src/index/hnsw_index/point_scorer.rs:116-121
Timestamp: 2025-04-22T23:17:41.734Z
Learning: The method `is_none_or` exists in the Qdrant codebase and compiles correctly, despite not being part of standard Rust's Option type. Code reviews should verify compilation issues before reporting them.

Comment on lines +128 to +131
pub fn snapshot_manifest(&self) -> Result<PyValue> {
let manifest = self.get_shard()?.snapshot_manifest()?;
Ok(PyValue::new(serde_json::to_value(&manifest).unwrap()))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Avoid .unwrap() on serialization—return a proper error instead.

serde_json::to_value(&manifest).unwrap() can panic if serialization fails. Although SegmentManifest should always be serializable, using unwrap() in Python-facing code risks crashing the Python runtime.

🔎 Suggested fix
     pub fn snapshot_manifest(&self) -> Result<PyValue> {
         let manifest = self.get_shard()?.snapshot_manifest()?;
-        Ok(PyValue::new(serde_json::to_value(&manifest).unwrap()))
+        let value = serde_json::to_value(&manifest).map_err(|err| {
+            OperationError::service_error(format!("failed to serialize manifest: {err}"))
+        })?;
+        Ok(PyValue::new(value))
     }
🤖 Prompt for AI Agents
In lib/edge/python/src/lib.rs around lines 128–131, the code calls
serde_json::to_value(&manifest).unwrap(), which can panic; replace the unwrap
with proper error propagation by mapping serde_json::to_value's Result into the
function's Result type (or converting the serde_json::Error into an appropriate
Python/Rust error) and return Err on failure instead of panicking, then wrap the
successful value in PyValue and return Ok.

Comment on lines +46 to +49
let segment_id = segment_path
.file_name()
.and_then(|segment_id| segment_id.to_str())
.expect("segment path ends with a valid segment id");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Replace .expect() with proper error handling.

If segment_path.file_name() returns None or the filename isn't valid UTF-8, this will panic. Given this processes external snapshot data, defensive error handling is warranted.

🔎 Suggested fix
             let segment_id = segment_path
                 .file_name()
                 .and_then(|segment_id| segment_id.to_str())
-                .expect("segment path ends with a valid segment id");
+                .ok_or_else(|| {
+                    OperationError::validation_error(format!(
+                        "invalid segment path: {}",
+                        segment_path.display(),
+                    ))
+                })?;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let segment_id = segment_path
.file_name()
.and_then(|segment_id| segment_id.to_str())
.expect("segment path ends with a valid segment id");
let segment_id = segment_path
.file_name()
.and_then(|segment_id| segment_id.to_str())
.ok_or_else(|| {
OperationError::validation_error(format!(
"invalid segment path: {}",
segment_path.display(),
))
})?;

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
lib/shard/src/snapshots/snapshot_manifest.rs (1)

46-49: Replace .expect() with proper error handling.

This was flagged in a previous review. If segment_path.file_name() returns None or the filename isn't valid UTF-8, this will panic. Given this processes external snapshot data, defensive error handling is warranted.

🔎 Suggested fix
             let segment_id = segment_path
                 .file_name()
                 .and_then(|segment_id| segment_id.to_str())
-                .expect("segment path ends with a valid segment id");
+                .ok_or_else(|| {
+                    OperationError::validation_error(format!(
+                        "invalid segment path: {}",
+                        segment_path.display(),
+                    ))
+                })?;
🧹 Nitpick comments (5)
lib/common/io/src/move_files.rs (2)

12-34: Consider consolidating with existing implementation.

This implementation closely mirrors the async version in lib/collection/src/common/file_utils.rs. While the synchronous vs. async distinction justifies separate implementations, consider whether shared logic (error messages, options configuration) could be extracted to reduce duplication.


72-112: Consider adding test coverage for move_file.

The test for move_dir is comprehensive, but move_file lacks test coverage. Consider adding a test to verify the copy-and-remove fallback behavior and cleanup on failures.

lib/edge/python/src/snapshots.rs (1)

13-71: Consider documenting the error state contract.

The _update_from_snapshot method uses mem::take to extract the shard at line 35, which means any error after this point can potentially leave self.0 as None. While this is addressed for current_manifest errors (lines 55-58), other error paths do not restore the shard.

Consider adding documentation (doc comment) that clearly states:

  • The method's cancel safety properties (appears to be NOT cancel safe)
  • Under what error conditions self.0 may remain None
  • Whether callers should treat the PyShard instance as invalid after certain errors

This would help Python binding users understand the error semantics and when they need to reinitialize the shard.

lib/edge/src/snapshots.rs (1)

55-61: Consider handling non-existent files gracefully during deletion.

If a file or directory in the delete lists doesn't exist (e.g., already removed or never created), fs::remove_file and fs::remove_dir_all will return an error. Depending on the expected invariants, you may want to ignore NotFound errors.

🔎 Proposed fix to ignore NotFound errors
         for delete_file in delete_files {
-            fs::remove_file(&delete_file)?;
+            if let Err(e) = fs::remove_file(&delete_file) {
+                if e.kind() != std::io::ErrorKind::NotFound {
+                    return Err(e.into());
+                }
+            }
         }

         for delete_dir in delete_directories {
-            fs::remove_dir_all(&delete_dir)?;
+            if let Err(e) = fs::remove_dir_all(&delete_dir) {
+                if e.kind() != std::io::ErrorKind::NotFound {
+                    return Err(e.into());
+                }
+            }
         }
lib/shard/src/snapshots/snapshot_manifest.rs (1)

30-31: snapshot_segments HashSet appears unused beyond debug_assert.

The snapshot_segments HashSet is populated but only used in a debug_assert!. If it's not needed for production logic, consider removing it to avoid unnecessary allocations, or document its purpose if it serves as a debugging aid.

🔎 Proposed simplification
-        let mut snapshot_segments = HashSet::new();
         let mut snapshot_manifest = SnapshotManifest::default();

         for segment_entry in fs::read_dir(segments_path)? {
             // ... existing code ...

-            let added = snapshot_segments.insert(segment_id.to_string());
-            debug_assert!(added);
+            // segment_id uniqueness is guaranteed by filesystem directory structure

Alternatively, if this is intentional for debugging, add a brief comment explaining its purpose.

Also applies to: 51-52

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f7bce19 and 61d8738.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (23)
  • lib/collection/src/collection/snapshots.rs
  • lib/collection/src/common/file_utils.rs
  • lib/collection/src/shards/dummy_shard.rs
  • lib/collection/src/shards/forward_proxy_shard.rs
  • lib/collection/src/shards/local_shard/snapshot.rs
  • lib/collection/src/shards/proxy_shard.rs
  • lib/collection/src/shards/queue_proxy_shard.rs
  • lib/collection/src/shards/replica_set/snapshots.rs
  • lib/collection/src/shards/shard.rs
  • lib/collection/src/shards/shard_holder/mod.rs
  • lib/common/io/Cargo.toml
  • lib/common/io/src/move_files.rs
  • lib/edge/Cargo.toml
  • lib/edge/python/src/snapshots.rs
  • lib/edge/src/snapshots.rs
  • lib/shard/src/lib.rs
  • lib/shard/src/segment_holder/snapshot.rs
  • lib/shard/src/snapshots/mod.rs
  • lib/shard/src/snapshots/snapshot_manifest.rs
  • lib/shard/src/snapshots/snapshot_utils.rs
  • lib/storage/src/content_manager/snapshots/recover.rs
  • src/actix/api/snapshot_api.rs
  • src/common/snapshots.rs
✅ Files skipped from review due to trivial changes (1)
  • lib/collection/src/common/file_utils.rs
🚧 Files skipped from review as they are similar to previous changes (10)
  • lib/shard/src/segment_holder/snapshot.rs
  • lib/shard/src/lib.rs
  • src/actix/api/snapshot_api.rs
  • lib/storage/src/content_manager/snapshots/recover.rs
  • lib/collection/src/shards/queue_proxy_shard.rs
  • lib/common/io/Cargo.toml
  • lib/collection/src/shards/forward_proxy_shard.rs
  • lib/collection/src/shards/dummy_shard.rs
  • lib/collection/src/shards/shard_holder/mod.rs
  • lib/collection/src/collection/snapshots.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust
In new code, don't use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice - use bytemuck or zerocopy crates instead
Prefer explicit exhaustive matches over catch-all _ arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
Prefer explicit field ignoring using : _ over using .. in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Files:

  • lib/collection/src/shards/shard.rs
  • lib/shard/src/snapshots/mod.rs
  • lib/common/io/src/move_files.rs
  • lib/shard/src/snapshots/snapshot_manifest.rs
  • lib/collection/src/shards/replica_set/snapshots.rs
  • lib/edge/src/snapshots.rs
  • src/common/snapshots.rs
  • lib/shard/src/snapshots/snapshot_utils.rs
  • lib/edge/python/src/snapshots.rs
  • lib/collection/src/shards/local_shard/snapshot.rs
  • lib/collection/src/shards/proxy_shard.rs
🧠 Learnings (6)
📚 Learning: 2025-10-13T22:58:03.121Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7400
File: lib/segment/src/id_tracker/simple_id_tracker.rs:234-241
Timestamp: 2025-10-13T22:58:03.121Z
Learning: SimpleIdTracker in lib/segment/src/id_tracker/simple_id_tracker.rs is being deprecated and should not receive fixes related to version tracking or recovery logic, as it has a different version storage structure that is incompatible with newer trackers.

Applied to files:

  • lib/collection/src/shards/shard.rs
  • lib/shard/src/snapshots/snapshot_manifest.rs
  • lib/collection/src/shards/replica_set/snapshots.rs
📚 Learning: 2025-07-11T11:35:21.549Z
Learnt from: generall
Repo: qdrant/qdrant PR: 6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using `id.to_string().parse().unwrap()` is acceptable for testing purposes and doesn't need to match production code's `id_tracker.internal_id()` approach. Test code can use mock implementations that serve the testing goals.

Applied to files:

  • lib/collection/src/shards/shard.rs
  • lib/shard/src/snapshots/snapshot_manifest.rs
📚 Learning: 2025-08-18T10:56:43.707Z
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs:340-347
Timestamp: 2025-08-18T10:56:43.707Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs, the create_offsets_file_from_iter function intentionally requires paths to have a parent directory and returns an error if path.parent() is None. This was a conscious design decision to ensure proper path validation.

Applied to files:

  • lib/shard/src/snapshots/snapshot_manifest.rs
📚 Learning: 2025-09-01T11:42:06.964Z
Learnt from: timvisee
Repo: qdrant/qdrant PR: 7157
File: lib/shard/src/segment_holder/mod.rs:808-814
Timestamp: 2025-09-01T11:42:06.964Z
Learning: In Qdrant's segment holder, panicking when no segments exist during flush_all is intentional and preferred over graceful error handling, as having zero segments could permanently corrupt the WAL by acknowledging u64::MAX. The maintainers consider this condition impossible and use the panic as a fail-fast safety mechanism to prevent data corruption.

Applied to files:

  • lib/shard/src/snapshots/snapshot_manifest.rs
  • lib/collection/src/shards/local_shard/snapshot.rs
📚 Learning: 2025-05-19T09:25:54.285Z
Learnt from: timvisee
Repo: qdrant/qdrant PR: 6546
File: lib/segment/src/index/field_index/full_text_index/text_index.rs:497-518
Timestamp: 2025-05-19T09:25:54.285Z
Learning: In the Qdrant project, using `.expect()` and `.unwrap()` is considered acceptable in test code since test failures should be loud and obvious, and proper error propagation adds unnecessary complexity to test code.

Applied to files:

  • lib/shard/src/snapshots/snapshot_manifest.rs
📚 Learning: 2025-06-02T18:10:47.203Z
Learnt from: coszio
Repo: qdrant/qdrant PR: 6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Applied to files:

  • lib/collection/src/shards/local_shard/snapshot.rs
🧬 Code graph analysis (8)
lib/collection/src/shards/shard.rs (8)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (15-17)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (30-39)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/proxy_shard.rs (1)
  • snapshot_manifest (90-92)
lib/collection/src/shards/queue_proxy_shard.rs (1)
  • snapshot_manifest (157-162)
lib/edge/python/src/lib.rs (1)
  • snapshot_manifest (128-131)
lib/shard/src/snapshots/mod.rs (8)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (15-17)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (30-39)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/proxy_shard.rs (1)
  • snapshot_manifest (90-92)
lib/collection/src/shards/queue_proxy_shard.rs (1)
  • snapshot_manifest (157-162)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/common/io/src/move_files.rs (1)
lib/collection/src/common/file_utils.rs (3)
  • move_dir (20-79)
  • move_file (83-114)
  • cleanup_file (117-123)
lib/shard/src/snapshots/snapshot_manifest.rs (3)
lib/shard/src/snapshots/snapshot_utils.rs (1)
  • fs (34-34)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (15-17)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/collection/src/shards/replica_set/snapshots.rs (1)
lib/shard/src/snapshots/snapshot_manifest.rs (1)
  • load_from_snapshot (24-117)
lib/edge/src/snapshots.rs (2)
lib/shard/src/snapshots/snapshot_utils.rs (3)
  • fs (34-34)
  • unpack_snapshot (18-27)
  • partial_snapshot_merge_plan (62-125)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/shard/src/snapshots/snapshot_utils.rs (2)
lib/segment/src/common/validate_snapshot_archive.rs (2)
  • validate_snapshot_archive (15-48)
  • open_snapshot_archive (50-65)
lib/shard/src/files/mod.rs (1)
  • get_shard_data_files (41-48)
lib/collection/src/shards/proxy_shard.rs (8)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (15-17)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (30-39)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/queue_proxy_shard.rs (1)
  • snapshot_manifest (157-162)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/edge/python/src/lib.rs (1)
  • snapshot_manifest (128-131)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Test Python bindings
  • GitHub Check: lint
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: e2e-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests
  • GitHub Check: test-consistency
🔇 Additional comments (23)
lib/edge/Cargo.toml (1)

13-13: LGTM!

Clean addition of the io dependency to support cross-filesystem move operations for snapshot handling.

lib/common/io/src/move_files.rs (2)

38-60: LGTM! Cleanup handling is appropriate.

The best-effort cleanup on lines 50 and 55 correctly prioritizes propagating the original error over reporting cleanup failures. This matches the pattern from lib/collection/src/common/file_utils.rs.


63-70: LGTM!

The cleanup_file helper correctly tolerates NotFound errors while propagating actual failures.

lib/shard/src/snapshots/mod.rs (1)

1-2: LGTM! Clean module organization.

The module declarations properly expose the snapshot infrastructure. This aligns with the broader refactoring to consolidate snapshot handling under the shard crate.

lib/collection/src/shards/shard.rs (1)

13-13: LGTM! Import path correctly updated.

The import path has been updated to reflect the consolidation of SnapshotManifest under the shard crate. This change is consistent with the broader refactoring effort and doesn't affect functionality.

src/common/snapshots.rs (1)

13-13: LGTM! Import paths aligned with refactoring.

Both RecoveryType and SnapshotManifest have been correctly relocated to the shard snapshot module, maintaining consistency with the broader codebase reorganization.

lib/collection/src/shards/proxy_shard.rs (1)

20-20: LGTM! Import path updated consistently.

The SnapshotManifest import has been updated to match the new module structure, consistent with other proxy shard implementations in the codebase.

lib/edge/src/snapshots.rs (2)

10-17: LGTM!

Clean delegation methods that forward to the appropriate utilities. The API is well-structured.


32-38: Explicit field destructuring is good practice.

The explicit destructuring of SnapshotMergePlan aligns with the coding guidelines to prefer explicit field ignoring using : _ over ... This ensures new fields added to SnapshotMergePlan will require explicit handling here.

lib/collection/src/shards/replica_set/snapshots.rs (2)

5-7: LGTM!

Import updates correctly reference the new centralized location for RecoveryType and SnapshotManifest in the shard crate.


104-105: Good refactoring to centralize manifest loading.

Replacing the inline manifest loading logic with SnapshotManifest::load_from_snapshot simplifies this method and ensures consistent validation behavior across the codebase.

lib/shard/src/snapshots/snapshot_manifest.rs (3)

13-17: LGTM!

The SnapshotManifest struct with #[serde(transparent)] provides a clean wrapper that serializes directly as the inner HashMap, which is appropriate for manifest files.


119-134: LGTM!

The validation logic correctly ensures file versions don't exceed their segment version, providing a sensible consistency check.


177-191: LGTM!

The RecoveryType enum with is_full() and is_partial() helpers provides a clean API. Using matches! is idiomatic.

lib/shard/src/snapshots/snapshot_utils.rs (5)

18-27: LGTM!

The unpack_snapshot method correctly creates the target directory if needed, unpacks the archive, and restores the snapshot. The flow is clear and error handling is appropriate.


32-56: LGTM!

Good defensive programming to filter out hidden entries (e.g., .DS_Store) before restoring segments. The logging for ignored hidden segments aids debugging.


98-110: Explicit struct destructuring follows coding guidelines.

Good use of explicit field destructuring for ShardDataFiles, ensuring new fields will require explicit handling. As per coding guidelines, this pattern is preferred over ...


128-135: LGTM!

The SnapshotMergePlan struct clearly documents the categories of file operations needed for a merge, making the plan execution straightforward.


76-96: Handling of new segments is correctly managed through the directory merge operation.

The code properly handles segments that exist only in the snapshot. While the deletion loop (lines 76–96) iterates only over shard_manifest (existing local segments) to remove files/segments absent from the snapshot, new segments are handled by the merge_directories operation at line 112. This operation merges the entire snapshot segments directory into the shard segments directory using fs_extra::dir::move_dir with content_only(true).overwrite(true) (lib/common/io/src/move_files.rs), which recursively copies all files and subdirectories, including segments not present locally. The design cleanly separates concerns: deletions for segments in both manifests, and additions/updates via directory-level merge.

lib/collection/src/shards/local_shard/snapshot.rs (4)

1-1: LGTM!

Import updates correctly reference the new module locations for SegmentManifest, SnapshotManifest, and SnapshotUtils, as well as path constants from shard::files.

Also applies to: 11-11, 14-14, 18-19, 25-25


30-39: LGTM!

The snapshot_manifest method appropriately uses the search runtime to spawn a blocking task for reading segment manifests, avoiding blocking the async runtime. The ?? pattern correctly unwraps both the task join result and the inner operation result.


41-44: Good refactoring to use centralized utility.

Replacing the manual restore logic with SnapshotUtils::restore_unpacked_snapshot simplifies this method and ensures consistent behavior with other snapshot restoration code paths.


216-228: Good use of Cow for efficient manifest handling.

The Cow::Borrowed / Cow::Owned pattern efficiently avoids cloning when the segment manifest exists in the provided manifest, while creating an empty manifest for missing segments. This is a clean approach for optional borrowing.

Comment on lines +62 to +70
let shard = Shard::recover_partial_snapshot(
&shard_path,
&current_manifest,
unpack_dir.path(),
&snapshot_manifest,
)?;

self.0 = Some(shard);
Ok(())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Restore shard on partial recovery failure.

If Shard::recover_partial_snapshot fails at lines 62-67, the old shard (extracted at line 35) is dropped and self.0 remains None. This leaves the PyShard in an unusable state even though the underlying shard data should remain intact.

The error handling for current_manifest (lines 55-58) demonstrates the correct pattern by restoring the shard before returning the error. The same pattern should be applied here.

🔎 Proposed fix to restore shard on error
-        let shard = Shard::recover_partial_snapshot(
+        let new_shard = match Shard::recover_partial_snapshot(
             &shard_path,
             &current_manifest,
             unpack_dir.path(),
             &snapshot_manifest,
-        )?;
+        ) {
+            Ok(new_shard) => new_shard,
+            Err(err) => {
+                // Restore the original shard before returning
+                self.0 = Some(shard);
+                return Err(err);
+            }
+        };
 
-        self.0 = Some(shard);
+        self.0 = Some(new_shard);
         Ok(())
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let shard = Shard::recover_partial_snapshot(
&shard_path,
&current_manifest,
unpack_dir.path(),
&snapshot_manifest,
)?;
self.0 = Some(shard);
Ok(())
let new_shard = match Shard::recover_partial_snapshot(
&shard_path,
&current_manifest,
unpack_dir.path(),
&snapshot_manifest,
) {
Ok(new_shard) => new_shard,
Err(err) => {
// Restore the original shard before returning
self.0 = Some(shard);
return Err(err);
}
};
self.0 = Some(new_shard);
Ok(())
🤖 Prompt for AI Agents
In lib/edge/python/src/snapshots.rs around lines 62 to 70, the call to
Shard::recover_partial_snapshot can fail and currently drops the previously
extracted shard (self.0) leaving PyShard in a None/invalid state; change the
control flow to capture the error, restore the original shard into self.0 before
returning the Err (mirror the pattern used for current_manifest error handling
at lines 55-58), e.g. perform the recover_partial_snapshot call in a match or
map_err block that first assigns back the previously stored shard to self.0 and
then returns the error so the PyShard remains usable when partial recovery
fails.

@generall
Copy link
Member Author

generall commented Jan 4, 2026

ToDo in follow-up PRs:

  • remove support for rocksdb in partial snapshots
  • Unify snapshot recovery in edge and server versions (use SnapshotMergePlan)
  • make partial snapshots useful when modifying immutable segments (remove payload update in immutable segments)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
lib/edge/python/src/snapshots.rs (1)

62-72: Restore original shard when partial recovery fails.

The original shard is dropped at line 62, but if recover_partial_snapshot fails at lines 64-69, self.0 remains None, leaving PyShard in an unusable state. The error handling pattern at lines 55-58 demonstrates the correct approach: restore the shard before returning the error.

🔎 Proposed fix
-        drop(shard);
-
-        let new_shard = Shard::recover_partial_snapshot(
+        let new_shard = match Shard::recover_partial_snapshot(
             &shard_path,
             &current_manifest,
             unpack_dir.path(),
             &snapshot_manifest,
-        )?;
+        ) {
+            Ok(new_shard) => new_shard,
+            Err(err) => {
+                // Restore the original shard before returning
+                self.0 = Some(shard);
+                return Err(err);
+            }
+        };
 
         self.0 = Some(new_shard);
         Ok(())
🧹 Nitpick comments (2)
lib/edge/python/examples/download_partial.py (1)

55-60: Consider adding timeout to prevent indefinite hangs.

The requests.post call lacks a timeout parameter, which could cause the script to hang if the server is unresponsive. While this is an example script, adding a timeout would demonstrate better practices.

🔎 Suggested improvement
-response = requests.post(url, json=manifest, stream=True)
+response = requests.post(url, json=manifest, stream=True, timeout=30)
lib/collection/src/shards/replica_set/snapshots.rs (1)

191-191: Consider tracking this TODO as an issue.

The TODO comment indicates future work to replace the current implementation with partial_snapshot_merge_plan when RocksDB is removed. This is valuable context for future refactoring.

Would you like me to create an issue to track this future refactoring work?

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 61d8738 and 81388d1.

📒 Files selected for processing (5)
  • lib/collection/src/shards/replica_set/snapshots.rs
  • lib/edge/python/examples/download_partial.py
  • lib/edge/python/examples/restore-snapshot.py
  • lib/edge/python/src/snapshots.rs
  • lib/shard/src/snapshots/snapshot_utils.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/shard/src/snapshots/snapshot_utils.rs
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust
In new code, don't use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice - use bytemuck or zerocopy crates instead
Prefer explicit exhaustive matches over catch-all _ arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
Prefer explicit field ignoring using : _ over using .. in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Files:

  • lib/edge/python/src/snapshots.rs
  • lib/collection/src/shards/replica_set/snapshots.rs
🧠 Learnings (1)
📚 Learning: 2025-10-13T22:58:03.121Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7400
File: lib/segment/src/id_tracker/simple_id_tracker.rs:234-241
Timestamp: 2025-10-13T22:58:03.121Z
Learning: SimpleIdTracker in lib/segment/src/id_tracker/simple_id_tracker.rs is being deprecated and should not receive fixes related to version tracking or recovery logic, as it has a different version storage structure that is incompatible with newer trackers.

Applied to files:

  • lib/collection/src/shards/replica_set/snapshots.rs
🧬 Code graph analysis (3)
lib/edge/python/src/snapshots.rs (3)
lib/shard/src/files/mod.rs (2)
  • clear_data (57-83)
  • move_data (85-120)
lib/edge/src/snapshots.rs (3)
  • snapshot_manifest (15-17)
  • unpack_snapshot (11-13)
  • recover_partial_snapshot (19-64)
lib/shard/src/snapshots/snapshot_manifest.rs (1)
  • load_from_snapshot (24-117)
lib/edge/python/examples/restore-snapshot.py (3)
lib/shard/src/snapshots/snapshot_utils.rs (1)
  • unpack_snapshot (18-27)
lib/edge/src/snapshots.rs (2)
  • unpack_snapshot (11-13)
  • snapshot_manifest (15-17)
lib/edge/python/src/lib.rs (4)
  • unpack_snapshot (123-126)
  • retrieve (104-118)
  • snapshot_manifest (128-131)
  • update_from_snapshot (134-141)
lib/collection/src/shards/replica_set/snapshots.rs (1)
lib/shard/src/snapshots/snapshot_manifest.rs (1)
  • load_from_snapshot (24-117)
🪛 Ruff (0.14.10)
lib/edge/python/examples/download_partial.py

55-55: Probable use of requests call without timeout

(S113)

lib/edge/python/examples/restore-snapshot.py

8-8: from qdrant_edge import * used; unable to detect undefined names

(F403)


42-42: Shard may be undefined, or defined from star imports

(F405)


44-44: Shard may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: test-consistency
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: e2e-tests
  • GitHub Check: lint
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: Test Python bindings
🔇 Additional comments (4)
lib/edge/python/examples/restore-snapshot.py (1)

26-26: Good addition of timeout parameter.

The timeout parameter has been added, addressing the previous review concern about potential indefinite hangs.

lib/collection/src/shards/replica_set/snapshots.rs (3)

5-7: LGTM! Import changes support the refactoring.

The new imports appropriately support the centralization of snapshot manifest handling. Moving RecoveryType from a local enum to an imported type from the shard module improves code organization and reusability.


104-105: Excellent refactoring to centralize manifest loading.

Replacing manual manifest construction with SnapshotManifest::load_from_snapshot is a well-executed improvement. According to the documentation, this function handles recovery type validation, segment iteration, and manifest validation internally, which reduces code duplication and improves maintainability.


111-116: Good adherence to coding guidelines.

The exhaustive match on recovery_type follows the project's coding guidelines by using explicit arms for each variant (Full and Partial) rather than a catch-all _ pattern. This ensures the compiler will flag any missing cases if new variants are added to RecoveryType in the future.

@generall generall force-pushed the restore-snapshot-in-edge branch from 81388d1 to 651a37b Compare January 5, 2026 13:29
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Fix all issues with AI Agents 🤖
In @lib/edge/python/examples/download_partial.py:
- Line 55: Add a timeout to the requests.post call to avoid indefinite hangs:
update the line that creates response (response = requests.post(url,
json=manifest, stream=True)) to include a timeout (e.g., timeout=10 or
timeout=(5, 30)). Wrap the request in a try/except for
requests.exceptions.Timeout/requests.exceptions.RequestException to handle and
surface timeout errors gracefully.
♻️ Duplicate comments (3)
lib/shard/src/snapshots/snapshot_manifest.rs (1)

46-49: Replace .expect() with proper error handling.

Line 49 uses .expect() to extract the segment ID from the path. When processing external snapshot data, this can panic if the path ends with .. or if the filename isn't valid UTF-8. Defensive error handling is warranted here.

🔎 Proposed fix
             let segment_id = segment_path
                 .file_name()
                 .and_then(|segment_id| segment_id.to_str())
-                .expect("segment path ends with a valid segment id");
+                .ok_or_else(|| {
+                    OperationError::validation_error(format!(
+                        "invalid segment path: {}",
+                        segment_path.display(),
+                    ))
+                })?;
lib/edge/python/src/snapshots.rs (1)

62-72: Restore shard on partial recovery failure.

If Shard::recover_partial_snapshot fails at lines 64-69, the shard has already been dropped at line 62, leaving self.0 as None even though the underlying shard data remains intact. This makes the PyShard instance unusable after a failed partial recovery.

The error handling pattern at lines 55-58 demonstrates the correct approach. Apply the same pattern here to preserve usability on failure.

🔎 Proposed fix
-        drop(shard);
-
-        let new_shard = Shard::recover_partial_snapshot(
+        let new_shard = match Shard::recover_partial_snapshot(
             &shard_path,
             &current_manifest,
             unpack_dir.path(),
             &snapshot_manifest,
-        )?;
+        ) {
+            Ok(new_shard) => new_shard,
+            Err(err) => {
+                // Restore the original shard before returning
+                self.0 = Some(shard);
+                return Err(err);
+            }
+        };
+
+        drop(shard);
 
         self.0 = Some(new_shard);
         Ok(())
lib/edge/python/src/lib.rs (1)

128-131: Avoid .unwrap() on serialization—return a proper error instead.

As noted in the previous review, serde_json::to_value(&manifest).unwrap() can panic if serialization fails. Although SegmentManifest should always be serializable, using unwrap() in Python-facing code risks crashing the Python runtime.

🔎 Proposed fix
 pub fn snapshot_manifest(&self) -> Result<PyValue> {
     let manifest = self.get_shard()?.snapshot_manifest()?;
-    Ok(PyValue::new(serde_json::to_value(&manifest).unwrap()))
+    let value = serde_json::to_value(&manifest).map_err(|err| {
+        OperationError::service_error(format!("failed to serialize manifest: {err}"))
+    })?;
+    Ok(PyValue::new(value))
 }
🧹 Nitpick comments (2)
lib/common/io/src/move_files.rs (1)

20-22: Verify destination is a directory, not a file.

The code checks !to.exists() but doesn't verify that if to exists, it's actually a directory. If to is an existing file, fs::create_dir(&to) will be skipped, and fs_extra::dir::move_dir may fail with a confusing error.

🔎 Consider adding a directory check
 if !to.exists() {
     fs::create_dir(&to)?;
+} else if !to.is_dir() {
+    return Err(io::Error::new(
+        io::ErrorKind::InvalidInput,
+        format!("destination {} exists but is not a directory", to.display()),
+    ));
 }
lib/edge/python/examples/restore-snapshot.py (1)

8-8: Consider explicit imports for clarity.

The star import from qdrant_edge import * makes it unclear which symbols are being used and prevents static analysis from verifying correctness. While acceptable in example code, explicit imports improve readability:

from qdrant_edge import Shard
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81388d1 and 651a37b.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (41)
  • Cargo.toml
  • lib/collection/Cargo.toml
  • lib/collection/src/collection/snapshots.rs
  • lib/collection/src/common/file_utils.rs
  • lib/collection/src/shards/dummy_shard.rs
  • lib/collection/src/shards/forward_proxy_shard.rs
  • lib/collection/src/shards/local_shard/mod.rs
  • lib/collection/src/shards/local_shard/snapshot.rs
  • lib/collection/src/shards/proxy_shard.rs
  • lib/collection/src/shards/queue_proxy_shard.rs
  • lib/collection/src/shards/replica_set/snapshots.rs
  • lib/collection/src/shards/shard.rs
  • lib/collection/src/shards/shard_holder/mod.rs
  • lib/common/io/Cargo.toml
  • lib/common/io/src/lib.rs
  • lib/common/io/src/move_files.rs
  • lib/edge/python/Cargo.toml
  • lib/edge/python/examples/common.py
  • lib/edge/python/examples/download_partial.py
  • lib/edge/python/examples/restore-snapshot.py
  • lib/edge/python/src/lib.rs
  • lib/edge/python/src/snapshots.rs
  • lib/edge/python/src/types/value.rs
  • lib/edge/python/src/utils.rs
  • lib/edge/src/lib.rs
  • lib/edge/src/snapshots.rs
  • lib/segment/Cargo.toml
  • lib/segment/src/data_types/manifest.rs
  • lib/segment/src/entry/snapshot_entry.rs
  • lib/segment/src/segment/snapshot.rs
  • lib/shard/Cargo.toml
  • lib/shard/src/files/mod.rs
  • lib/shard/src/lib.rs
  • lib/shard/src/proxy_segment/snapshot_entry.rs
  • lib/shard/src/segment_holder/snapshot.rs
  • lib/shard/src/snapshots/mod.rs
  • lib/shard/src/snapshots/snapshot_manifest.rs
  • lib/shard/src/snapshots/snapshot_utils.rs
  • lib/storage/src/content_manager/snapshots/recover.rs
  • src/actix/api/snapshot_api.rs
  • src/common/snapshots.rs
💤 Files with no reviewable changes (1)
  • lib/segment/src/data_types/manifest.rs
🚧 Files skipped from review as they are similar to previous changes (16)
  • lib/edge/python/src/types/value.rs
  • lib/collection/src/shards/dummy_shard.rs
  • lib/shard/src/lib.rs
  • src/actix/api/snapshot_api.rs
  • src/common/snapshots.rs
  • lib/collection/src/shards/queue_proxy_shard.rs
  • lib/common/io/Cargo.toml
  • lib/shard/src/proxy_segment/snapshot_entry.rs
  • lib/segment/Cargo.toml
  • lib/shard/src/snapshots/mod.rs
  • lib/collection/src/shards/forward_proxy_shard.rs
  • lib/collection/src/shards/shard.rs
  • lib/collection/src/collection/snapshots.rs
  • lib/shard/Cargo.toml
  • lib/collection/src/common/file_utils.rs
  • Cargo.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust
In new code, don't use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice - use bytemuck or zerocopy crates instead
Prefer explicit exhaustive matches over catch-all _ arm in Rust match expressions, so we don't miss handling new enum variants when they are added later (exceptions: tests and benchmarks, or when sure new variants won't affect logic)
Prefer explicit field ignoring using : _ over using .. in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Files:

  • lib/storage/src/content_manager/snapshots/recover.rs
  • lib/collection/src/shards/proxy_shard.rs
  • lib/edge/python/src/utils.rs
  • lib/shard/src/segment_holder/snapshot.rs
  • lib/segment/src/entry/snapshot_entry.rs
  • lib/edge/src/lib.rs
  • lib/shard/src/snapshots/snapshot_manifest.rs
  • lib/shard/src/snapshots/snapshot_utils.rs
  • lib/common/io/src/move_files.rs
  • lib/collection/src/shards/local_shard/mod.rs
  • lib/collection/src/shards/replica_set/snapshots.rs
  • lib/collection/src/shards/shard_holder/mod.rs
  • lib/collection/src/shards/local_shard/snapshot.rs
  • lib/common/io/src/lib.rs
  • lib/segment/src/segment/snapshot.rs
  • lib/shard/src/files/mod.rs
  • lib/edge/python/src/lib.rs
  • lib/edge/src/snapshots.rs
  • lib/edge/python/src/snapshots.rs
🧠 Learnings (17)
📚 Learning: 2025-11-02T19:16:42.527Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7474
File: lib/collection/src/shards/transfer/stream_records.rs:46-55
Timestamp: 2025-11-02T19:16:42.527Z
Learning: In the Qdrant repository, do not flag compilation errors such as partial-move errors, type errors, or any syntax issues that would be caught by the Rust compiler during the build process.

Applied to files:

  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-04-22T23:17:41.734Z
Learnt from: xzfc
Repo: qdrant/qdrant PR: 6245
File: lib/segment/src/index/hnsw_index/point_scorer.rs:116-121
Timestamp: 2025-04-22T23:17:41.734Z
Learning: The method `is_none_or` exists in the Qdrant codebase and compiles correctly, despite not being part of standard Rust's Option type. Code reviews should verify compilation issues before reporting them.

Applied to files:

  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-11-25T11:56:33.176Z
Learnt from: CR
Repo: qdrant/qdrant PR: 0
File: .github/review-rules.md:0-0
Timestamp: 2025-11-25T11:56:33.176Z
Learning: Applies to **/*.rs : Prefer explicit field ignoring using `: _` over using `..` in Rust struct destructuring (exceptions: tests and benchmarks, or when sure new fields won't affect logic)

Applied to files:

  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-10-13T22:58:03.121Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7400
File: lib/segment/src/id_tracker/simple_id_tracker.rs:234-241
Timestamp: 2025-10-13T22:58:03.121Z
Learning: SimpleIdTracker in lib/segment/src/id_tracker/simple_id_tracker.rs is being deprecated and should not receive fixes related to version tracking or recovery logic, as it has a different version storage structure that is incompatible with newer trackers.

Applied to files:

  • lib/shard/src/segment_holder/snapshot.rs
  • lib/shard/src/snapshots/snapshot_manifest.rs
  • lib/collection/src/shards/shard_holder/mod.rs
  • lib/segment/src/segment/snapshot.rs
📚 Learning: 2025-08-15T11:42:00.297Z
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.

Applied to files:

  • lib/shard/src/segment_holder/snapshot.rs
📚 Learning: 2025-07-11T11:35:21.549Z
Learnt from: generall
Repo: qdrant/qdrant PR: 6854
File: lib/segment/src/index/query_estimator.rs:320-327
Timestamp: 2025-07-11T11:35:21.549Z
Learning: In test code for Qdrant's query estimator (lib/segment/src/index/query_estimator.rs), simplified ID resolution logic using `id.to_string().parse().unwrap()` is acceptable for testing purposes and doesn't need to match production code's `id_tracker.internal_id()` approach. Test code can use mock implementations that serve the testing goals.

Applied to files:

  • lib/shard/src/snapshots/snapshot_manifest.rs
  • lib/segment/src/segment/snapshot.rs
📚 Learning: 2025-08-18T10:56:43.707Z
Learnt from: IvanPleshkov
Repo: qdrant/qdrant PR: 7043
File: lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs:340-347
Timestamp: 2025-08-18T10:56:43.707Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_multivector_storage.rs, the create_offsets_file_from_iter function intentionally requires paths to have a parent directory and returns an error if path.parent() is None. This was a conscious design decision to ensure proper path validation.

Applied to files:

  • lib/shard/src/snapshots/snapshot_manifest.rs
📚 Learning: 2025-09-01T11:42:06.964Z
Learnt from: timvisee
Repo: qdrant/qdrant PR: 7157
File: lib/shard/src/segment_holder/mod.rs:808-814
Timestamp: 2025-09-01T11:42:06.964Z
Learning: In Qdrant's segment holder, panicking when no segments exist during flush_all is intentional and preferred over graceful error handling, as having zero segments could permanently corrupt the WAL by acknowledging u64::MAX. The maintainers consider this condition impossible and use the panic as a fail-fast safety mechanism to prevent data corruption.

Applied to files:

  • lib/shard/src/snapshots/snapshot_manifest.rs
  • lib/collection/src/shards/local_shard/snapshot.rs
📚 Learning: 2025-05-19T09:25:54.285Z
Learnt from: timvisee
Repo: qdrant/qdrant PR: 6546
File: lib/segment/src/index/field_index/full_text_index/text_index.rs:497-518
Timestamp: 2025-05-19T09:25:54.285Z
Learning: In the Qdrant project, using `.expect()` and `.unwrap()` is considered acceptable in test code since test failures should be loud and obvious, and proper error propagation adds unnecessary complexity to test code.

Applied to files:

  • lib/shard/src/snapshots/snapshot_manifest.rs
📚 Learning: 2025-11-25T11:56:33.176Z
Learnt from: CR
Repo: qdrant/qdrant PR: 0
File: .github/review-rules.md:0-0
Timestamp: 2025-11-25T11:56:33.176Z
Learning: Applies to **/*.rs : In new code, don't use `transmute_from_u8`, `transmute_to_u8`, `transmute_from_u8_to_slice`, `transmute_from_u8_to_mut_slice`, `transmute_to_u8_slice` - use `bytemuck` or `zerocopy` crates instead

Applied to files:

  • lib/common/io/src/move_files.rs
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
Repo: qdrant/qdrant PR: 7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

  • lib/collection/src/shards/local_shard/mod.rs
📚 Learning: 2025-06-02T18:10:47.203Z
Learnt from: coszio
Repo: qdrant/qdrant PR: 6609
File: lib/gridstore/src/blob.rs:46-59
Timestamp: 2025-06-02T18:10:47.203Z
Learning: In the Qdrant codebase, zerocopy crate is extensively used for safe byte-level operations across GPU operations, HNSW indices, memory-mapped structures, and serialization. When implementing Blob trait for generic Vec<T>, using zerocopy's FromBytes and IntoBytes traits is preferred over size_of::<T>() because it guarantees memory layout equals byte representation, making serialization safe and correct.

Applied to files:

  • lib/collection/src/shards/local_shard/snapshot.rs
📚 Learning: 2025-10-29T11:13:46.074Z
Learnt from: ffuugoo
Repo: qdrant/qdrant PR: 7465
File: lib/edge/python/src/types/value.rs:30-33
Timestamp: 2025-10-29T11:13:46.074Z
Learning: In lib/edge/python/src/types/value.rs, the HashMap transmute pattern in `PyValue::into_rust_map` that converts `HashMap<String, PyValue>` to `HashMap<String, serde_json::Value>` using unsafe transmute is acceptable and should not be flagged, as PyValue has transparent representation over serde_json::Value.

Applied to files:

  • lib/edge/python/src/lib.rs
📚 Learning: 2025-08-06T09:56:59.311Z
Learnt from: xzfc
Repo: qdrant/qdrant PR: 6982
File: lib/segment/src/index/hnsw_index/graph_links/view.rs:217-220
Timestamp: 2025-08-06T09:56:59.311Z
Learning: In Qdrant's GraphLinksView implementation (lib/segment/src/index/hnsw_index/graph_links/view.rs), methods like links() and links_with_vectors() intentionally use unwrap() and can panic for performance reasons, maintaining consistency across similar methods. Error handling improvements are considered secondary to feature completion and require benchmarking before implementation.

Applied to files:

  • lib/edge/python/src/lib.rs
📚 Learning: 2025-11-25T11:56:33.176Z
Learnt from: CR
Repo: qdrant/qdrant PR: 0
File: .github/review-rules.md:0-0
Timestamp: 2025-11-25T11:56:33.176Z
Learning: Don't report issues related to `.unwrap()` and `panic!()` usage in Rust tests and benchmarks

Applied to files:

  • lib/edge/python/src/lib.rs
📚 Learning: 2025-05-08T16:39:07.893Z
Learnt from: xzfc
Repo: qdrant/qdrant PR: 6501
File: lib/segment/src/index/hnsw_index/links_container.rs:158-158
Timestamp: 2025-05-08T16:39:07.893Z
Learning: In the Qdrant codebase, allocation failures are generally not handled (allowed to panic) except in very few specific cases. Most `unwrap()` calls on allocation operations are intentional.

Applied to files:

  • lib/edge/python/src/lib.rs
📚 Learning: 2025-08-22T13:49:39.261Z
Learnt from: xzfc
Repo: qdrant/qdrant PR: 7118
File: lib/segment/src/index/hnsw_index/graph_links/serializer.rs:50-54
Timestamp: 2025-08-22T13:49:39.261Z
Learning: In Qdrant's graph serialization code (lib/segment/src/index/hnsw_index/graph_links/serializer.rs), the team prefers to keep the code simple rather than adding defensive input validation for edge cases that don't occur in practice, relying on unit tests to catch any unexpected issues early.

Applied to files:

  • lib/edge/python/src/lib.rs
🧬 Code graph analysis (17)
lib/storage/src/content_manager/snapshots/recover.rs (1)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/collection/src/shards/proxy_shard.rs (8)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (15-17)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (30-39)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/queue_proxy_shard.rs (1)
  • snapshot_manifest (157-162)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/edge/python/src/lib.rs (1)
  • snapshot_manifest (128-131)
lib/edge/python/src/utils.rs (2)
lib/collection/src/shards/shard_holder/mod.rs (1)
  • get_shard (358-360)
lib/edge/python/src/lib.rs (2)
  • from (150-152)
  • from (156-158)
lib/shard/src/segment_holder/snapshot.rs (8)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (15-17)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (30-39)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/proxy_shard.rs (1)
  • snapshot_manifest (90-92)
lib/collection/src/shards/queue_proxy_shard.rs (1)
  • snapshot_manifest (157-162)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/edge/python/src/lib.rs (1)
  • snapshot_manifest (128-131)
lib/segment/src/entry/snapshot_entry.rs (2)
lib/segment/src/segment/snapshot.rs (2)
  • segment_id (30-49)
  • get_segment_manifest (116-118)
lib/shard/src/proxy_segment/snapshot_entry.rs (2)
  • segment_id (12-14)
  • get_segment_manifest (34-36)
lib/edge/src/lib.rs (1)
lib/shard/src/wal.rs (1)
  • path (187-189)
lib/shard/src/snapshots/snapshot_manifest.rs (3)
lib/shard/src/snapshots/snapshot_utils.rs (1)
  • fs (34-34)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (15-17)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/shard/src/snapshots/snapshot_utils.rs (2)
lib/segment/src/common/validate_snapshot_archive.rs (2)
  • validate_snapshot_archive (15-48)
  • open_snapshot_archive (50-65)
lib/shard/src/files/mod.rs (1)
  • get_shard_data_files (41-48)
lib/common/io/src/move_files.rs (1)
lib/collection/src/common/file_utils.rs (3)
  • move_dir (20-79)
  • move_file (83-114)
  • cleanup_file (117-123)
lib/collection/src/shards/local_shard/mod.rs (1)
lib/shard/src/files/mod.rs (6)
  • get_shard_data_files (41-48)
  • check_data (51-55)
  • wal_path (26-28)
  • segments_path (21-23)
  • newest_clocks_path (31-33)
  • oldest_clocks_path (36-38)
lib/collection/src/shards/replica_set/snapshots.rs (1)
lib/shard/src/snapshots/snapshot_manifest.rs (1)
  • load_from_snapshot (24-117)
lib/collection/src/shards/shard_holder/mod.rs (9)
lib/edge/src/snapshots.rs (1)
  • snapshot_manifest (15-17)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (30-39)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/collection/src/shards/dummy_shard.rs (1)
  • snapshot_manifest (53-55)
lib/collection/src/shards/forward_proxy_shard.rs (1)
  • snapshot_manifest (346-348)
lib/collection/src/shards/proxy_shard.rs (1)
  • snapshot_manifest (90-92)
lib/collection/src/shards/queue_proxy_shard.rs (1)
  • snapshot_manifest (157-162)
lib/collection/src/shards/shard.rs (1)
  • snapshot_manifest (166-174)
lib/edge/python/src/lib.rs (1)
  • snapshot_manifest (128-131)
lib/collection/src/shards/local_shard/snapshot.rs (4)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/shard/src/snapshots/snapshot_utils.rs (1)
  • restore_unpacked_snapshot (32-56)
lib/segment/src/entry/snapshot_entry.rs (1)
  • segment_id (11-11)
lib/segment/src/data_types/manifest.rs (1)
  • empty (14-20)
lib/segment/src/segment/snapshot.rs (1)
lib/segment/src/entry/snapshot_entry.rs (3)
  • segment_id (11-11)
  • take_snapshot (17-23)
  • get_segment_manifest (25-25)
lib/edge/src/snapshots.rs (2)
lib/shard/src/snapshots/snapshot_utils.rs (3)
  • fs (34-34)
  • unpack_snapshot (18-27)
  • partial_snapshot_merge_plan (62-130)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
lib/edge/python/src/snapshots.rs (3)
lib/shard/src/files/mod.rs (2)
  • clear_data (57-83)
  • move_data (85-120)
lib/edge/src/snapshots.rs (3)
  • snapshot_manifest (15-17)
  • unpack_snapshot (11-13)
  • recover_partial_snapshot (19-64)
lib/shard/src/snapshots/snapshot_manifest.rs (1)
  • load_from_snapshot (24-117)
lib/edge/python/examples/restore-snapshot.py (5)
lib/shard/src/snapshots/snapshot_utils.rs (1)
  • unpack_snapshot (18-27)
lib/edge/src/snapshots.rs (2)
  • unpack_snapshot (11-13)
  • snapshot_manifest (15-17)
lib/edge/python/src/lib.rs (3)
  • unpack_snapshot (123-126)
  • retrieve (104-118)
  • snapshot_manifest (128-131)
lib/collection/src/shards/local_shard/snapshot.rs (1)
  • snapshot_manifest (30-39)
lib/shard/src/segment_holder/snapshot.rs (1)
  • snapshot_manifest (20-29)
🪛 Ruff (0.14.10)
lib/edge/python/examples/download_partial.py

55-55: Probable use of requests call without timeout

(S113)

lib/edge/python/examples/restore-snapshot.py

8-8: from qdrant_edge import * used; unable to detect undefined names

(F403)


42-42: Shard may be undefined, or defined from star imports

(F405)


44-44: Shard may be undefined, or defined from star imports

(F405)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-consensus-compose
  • GitHub Check: e2e-tests
  • GitHub Check: test-consistency
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: lint
  • GitHub Check: integration-tests
  • GitHub Check: Test Python bindings
🔇 Additional comments (57)
lib/collection/Cargo.toml (1)

85-85: The workspace dependency refactor is correctly configured. The workspace root Cargo.toml includes fs_extra = "1.3.0" in [workspace.dependencies], and usage in lib/collection/src/common/file_utils.rs (fs_extra::dir::move_dir) is compatible with this version. No breaking changes are introduced.

lib/edge/python/Cargo.toml (1)

35-35: LGTM! Appropriate dependency for snapshot operations.

The addition of tempfile aligns well with the snapshot restoration functionality being introduced in this PR.

lib/common/io/src/lib.rs (1)

2-2: LGTM! Clean module export for file move utilities.

The new move_files module provides cross-filesystem-safe operations that support the snapshot restoration features in this PR.

lib/edge/python/examples/common.py (1)

7-7: LGTM! Good centralization of the data directory path.

Moving DATA_DIRECTORY to module level eliminates shadowing and makes the path consistently accessible throughout the module.

lib/collection/src/shards/local_shard/mod.rs (6)

51-51: LGTM! Centralized path utilities improve maintainability.

The import of shard::files utilities consolidates path management logic, reducing duplication across the codebase.


142-168: LGTM! Excellent refactoring with exhaustive destructuring.

The refactored move_data method properly:

  • Uses centralized get_shard_data_files for consistent path resolution
  • Employs exhaustive destructuring of ShardDataFiles (aligning with coding guidelines)
  • Conditionally moves clocks files only when they exist

This pattern ensures compile-time safety when new fields are added to ShardDataFiles.

Based on coding guidelines, exhaustive destructuring without .. ensures visibility when the struct evolves.


175-175: LGTM! Clean delegation to centralized check.


182-205: LGTM! Consistent use of centralized utilities.

The clear method follows the same pattern as move_data, using exhaustive destructuring and conditional file removal. This consistency improves maintainability.


495-500: LGTM! Path helpers correctly delegate to centralized utilities.


1165-1170: LGTM! Clocks path helpers updated consistently.

lib/collection/src/shards/local_shard/snapshot.rs (4)

1-1: LGTM: Import updates align with the refactoring.

The import path changes correctly reflect the new centralized snapshot manifest structure, and the addition of Cow supports the per-segment manifest optimization introduced later in the file.

Also applies to: 11-11, 14-14, 18-19, 25-25


30-39: LGTM: Well-structured async manifest retrieval.

The method correctly uses the search runtime context and proper cancellation handling for the blocking operation.


41-45: LGTM: Clean refactoring using centralized utilities.

Delegating to SnapshotUtils::restore_unpacked_snapshot simplifies the implementation and centralizes snapshot restoration logic.


216-228: Per-segment manifest extraction logic is correct.

The implementation properly handles both cases: segments present in the manifest (borrowed) and missing segments (owned empty manifest). The Cow optimization avoids unnecessary clones when the manifest contains the segment.

lib/common/io/src/move_files.rs (3)

38-60: LGTM: Robust cross-filesystem file move implementation.

The function correctly handles the fallback from rename to copy+remove, with proper cleanup on errors. The TOCTOU comment accurately describes the motivation for trying rename first.


63-70: LGTM: Appropriate cleanup logic.

The function correctly handles the common case where the file may not exist (e.g., if creation failed), while still propagating other errors.


78-111: LGTM: Good test coverage.

The test validates the key behavior of move_dir: preserving existing destination files while moving source contents, and properly removing the source directory.

lib/shard/src/files/mod.rs (4)

5-48: LGTM: Well-structured centralized file path utilities.

The module provides a clear, type-safe API for managing shard data file paths. The explicit field initialization in get_shard_data_files follows the coding guidelines.


50-55: LGTM: Appropriate data existence check.

The function correctly checks for the essential shard data (WAL and segments). Clock files are likely optional metadata and don't need to be checked for basic data presence validation.


57-83: Explicit field destructuring follows coding guidelines.

The function properly uses explicit field binding without .., as recommended in the coding guidelines. The early-return error handling is acceptable, though accumulating errors could improve cleanup completeness in case of partial failures.


85-120: LGTM: Explicit destructuring with proper move semantics.

The function correctly uses explicit field binding for both source and destination, following the coding guidelines. The move operations properly handle cross-filesystem moves via the new io::move_files utilities.

lib/shard/src/snapshots/snapshot_utils.rs (4)

18-27: LGTM: Clear unpack and restore flow.

The function properly sequences archive unpacking with in-place restoration. On error, the partially unpacked state remains, which allows for manual inspection/recovery and is acceptable for this utility function.


32-56: LGTM: Proper segment restoration with hidden file filtering.

The function correctly filters out hidden segments and restores each segment in place. The fail-fast error handling on restoration failures is appropriate, as attempting to continue with partial corruption could lead to worse issues.


62-131: LGTM: Comprehensive merge plan generation.

The function correctly identifies segments and files to delete, and plans directory merges/replacements appropriately. The decision to merge segments directories but replace the WAL directory aligns with partial snapshot semantics (segments can be partially updated, but WAL represents a new state).


133-140: LGTM: Clear separation of merge operation types.

The struct design appropriately categorizes different merge operations, allowing callers to execute them in the correct order (e.g., deletions before moves, replacements vs merges).

lib/storage/src/content_manager/snapshots/recover.rs (1)

14-14: LGTM: Import path updated for refactored snapshot manifest module.

The import path correctly reflects the relocation of RecoveryType to the centralized shard::snapshots::snapshot_manifest module.

lib/collection/src/shards/proxy_shard.rs (1)

20-20: LGTM! Import path updated to reflect refactored location.

The import path change aligns with the PR's broader refactor moving SnapshotManifest to the shard::snapshots::snapshot_manifest module.

lib/edge/src/lib.rs (3)

5-5: LGTM! Internal module addition.

The snapshots module is internal (not public), so it doesn't affect the public API surface.


29-29: LGTM! Field activation.

The field rename from _path to path (removing the underscore prefix) correctly signals that this field is now used, complementing the new public path() accessor.

Also applies to: 169-169


182-184: LGTM! Standard accessor pattern.

The public path() accessor follows Rust conventions and provides read-only access to the shard's path, consistent with similar accessors elsewhere in the codebase (e.g., lib/shard/src/wal.rs:186-188).

lib/collection/src/shards/shard_holder/mod.rs (1)

26-26: LGTM! Import paths updated.

The import path changes for SnapshotManifest and RecoveryType align with the PR's refactor to centralize these types under shard::snapshots::snapshot_manifest.

lib/edge/python/src/utils.rs (1)

1-16: LGTM! Clean utility helper.

This helper method provides centralized, safe access to the inner Shard from PyShard, with a clear error message when the shard is uninitialized. The pattern follows standard Rust Option unwrapping conventions.

lib/shard/src/segment_holder/snapshot.rs (2)

17-17: LGTM! Import path updated.

The import path change aligns with the PR's refactor to centralize SnapshotManifest under crate::snapshots::snapshot_manifest.


23-26: LGTM! Updated to new segment manifest API.

The change from collecting full snapshot manifests to retrieving individual segment manifests via get_segment_manifest() and aggregating them with manifest.add() follows the refactored API design. Error handling is correctly propagated.

lib/edge/python/src/snapshots.rs (3)

1-11: LGTM!

The imports are well-organized and all necessary for the snapshot restoration functionality.


13-24: LGTM!

The temporary directory determination logic provides sensible fallbacks, preferring locality to the snapshot file when possible.


41-51: Full recovery path implements expected behavior.

The full recovery correctly drops the old shard, clears existing data, moves new data into place, and reloads. Note that if Shard::load fails at line 46, self.0 will remain None even though the new data is on disk—this is a known trade-off from previous review discussions.

lib/edge/python/examples/restore-snapshot.py (2)

17-31: LGTM!

The download function now includes a timeout parameter, addressing the previous review concern. For example code, letting exceptions propagate is acceptable as it keeps the code simple and failures are obvious.


33-60: LGTM!

The example effectively demonstrates the snapshot restoration workflow, including both full recovery (unpack → load → retrieve) and partial updates (update_from_snapshot). The code is clear and follows best practices like cleaning up existing directories before unpacking.

lib/segment/src/entry/snapshot_entry.rs (1)

10-25: LGTM!

The trait changes align with the architectural shift to use SegmentManifest for per-segment snapshot operations and SnapshotManifest for shard-level aggregation. The new segment_id() and get_segment_manifest() methods provide necessary functionality for the segment-oriented snapshot API.

lib/edge/src/snapshots.rs (2)

11-17: LGTM!

The delegation methods provide a clean public API for snapshot operations on the Edge Shard type.


40-61: LGTM!

The filesystem operations execute the merge plan in a sensible order (moves before deletes) with appropriate existence checks for directory replacement.

lib/shard/src/snapshots/snapshot_manifest.rs (4)

1-17: LGTM!

The struct definition is clean with appropriate derives and uses serde(transparent) for efficient serialization of the segment manifest map.


58-84: LGTM!

The recovery type validation logic correctly distinguishes between full snapshots (no manifests) and partial snapshots (manifests required), with clear error messages guiding users to the correct endpoint.


86-110: LGTM!

The manifest loading logic includes proper error handling with contextual error messages for both I/O failures and deserialization errors, plus validation that the segment ID in the manifest matches the directory name.


119-191: LGTM!

The utility methods are well-implemented with appropriate validation logic. The validate() method ensures version consistency, add() properly handles version-based updates, and the RecoveryType enum uses explicit matches! patterns following the coding guidelines.

lib/collection/src/shards/replica_set/snapshots.rs (2)

105-106: LGTM: Manifest loading now uses centralized logic.

The change from manual manifest assembly to using SnapshotManifest::load_from_snapshot is a good refactoring that centralizes the manifest validation and loading logic.


193-193: Acknowledged: TODO aligns with PR objectives.

This TODO is consistent with the PR objectives, which note that unifying snapshot recovery using SnapshotMergePlan is planned for a follow-up PR after RocksDB removal.

lib/edge/python/src/lib.rs (4)

77-77: LGTM: Option wrapper pattern enables snapshot operations.

Wrapping the shard in Option<edge::Shard> is a reasonable pattern that allows temporarily taking ownership during operations like update_from_snapshot, where the shard may need to be replaced.

Also applies to: 84-84


88-88: LGTM: Consistent access pattern via get_shard().

Using get_shard()? provides uniform error handling across all operations and ensures the shard is initialized before use.

Also applies to: 93-93, 99-99, 111-111


122-126: LGTM: Clean static method for unpacking snapshots.

The unpack_snapshot static method provides a clean API for extracting snapshot data before loading a shard.


133-141: LGTM: Well-designed API for snapshot updates.

The update_from_snapshot method provides a clean Python-facing API with optional parameters, properly delegating to the internal implementation.

lib/segment/src/segment/snapshot.rs (5)

26-27: LGTM: Good use of named constant.

Introducing SEGMENT_MANIFEST_FILE_NAME as a constant improves maintainability and makes the manifest storage location explicit.


30-49: LGTM: Robust segment ID extraction with proper error handling.

The implementation correctly extracts the segment ID from the path and includes appropriate error handling and debug assertions for validation.


56-83: LGTM: Well-structured manifest handling in snapshot creation.

The refactored logic cleanly handles both full snapshots (no manifest) and partial snapshots (with manifest). The manifest serialization and file selection logic is correct and includes proper error handling.


85-91: LGTM: Clean file inclusion logic.

The include_if closure correctly implements the logic for selective file inclusion based on the manifest, with appropriate fallback to including all files when no manifest is provided.


122-207: LGTM: Comprehensive segment manifest collection.

The _get_segment_manifest implementation thoroughly collects all segment files with proper version tracking. The removal of .into() on Line 203 aligns with the coding guideline to prefer explicit types when they already match.


url = f"http://localhost:6333/collections/{collection_name}/shards/{shard_id}/snapshot/partial/create"

response = requests.post(url, json=manifest, stream=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add timeout parameter to prevent indefinite hangs.

The requests.post call lacks a timeout parameter, which could cause the script to hang indefinitely if the server is unresponsive. Even for example scripts, demonstrating best practices is important.

🔎 Proposed fix
-response = requests.post(url, json=manifest, stream=True)
+response = requests.post(url, json=manifest, stream=True, timeout=30)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
response = requests.post(url, json=manifest, stream=True)
response = requests.post(url, json=manifest, stream=True, timeout=30)
🧰 Tools
🪛 Ruff (0.14.10)

55-55: Probable use of requests call without timeout

(S113)

🤖 Prompt for AI Agents
In @lib/edge/python/examples/download_partial.py around line 55, Add a timeout
to the requests.post call to avoid indefinite hangs: update the line that
creates response (response = requests.post(url, json=manifest, stream=True)) to
include a timeout (e.g., timeout=10 or timeout=(5, 30)). Wrap the request in a
try/except for requests.exceptions.Timeout/requests.exceptions.RequestException
to handle and surface timeout errors gracefully.

@generall generall merged commit 9c5cec1 into dev Jan 5, 2026
15 checks passed
@generall generall deleted the restore-snapshot-in-edge branch January 5, 2026 18:33
coszio pushed a commit that referenced this pull request Jan 8, 2026
* Restore shard snapshot in Edge python bindings

* method to request snapshot manifest

* move snapshot manifest into Shard crate

* move shapshot manifest reading

* implement inplace update of the shard from snapshot

* fmt

* move shapshot-related functions again, into a dedicated struct

* fmt

* implement partial snapshot recovery for edge

* test for partial recoverying snapshot on edge
generall added a commit that referenced this pull request Feb 9, 2026
* Restore shard snapshot in Edge python bindings

* method to request snapshot manifest

* move snapshot manifest into Shard crate

* move shapshot manifest reading

* implement inplace update of the shard from snapshot

* fmt

* move shapshot-related functions again, into a dedicated struct

* fmt

* implement partial snapshot recovery for edge

* test for partial recoverying snapshot on edge
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants