Skip to content

Add CollectionInfo::warnings field#7293

Merged
xzfc merged 4 commits intodevfrom
configuration_status
Sep 29, 2025
Merged

Add CollectionInfo::warnings field#7293
xzfc merged 4 commits intodevfrom
configuration_status

Conversation

@xzfc
Copy link
Member

@xzfc xzfc commented Sep 23, 2025

This PR adds CollectionInfo::configuration_status that contains a warning when collection configuration is not consistent. Particularly it checks requirements for the hnsw_config.copy_vectors option (#7232): quantization is enabled; multivectors are disabled.

Also, this PR performs a refactoring change in trait DiffConfig: it swaps DiffConfig::update arguments (as I find config.update(diff) to be more natural than diff.update(&config)) and adds DiffConfig::update_opt helper. More changes in DiffConfig in a subsequent PR.

coderabbitai[bot]

This comment was marked as resolved.

Comment on lines +165 to +168
message ConfigurationStatus {
bool ok = 1;
string warning = 2;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: how about we rename warning to message? I'd argue ok already defines whether we are ok/warning. And, make the message optional?

}

if !messages.is_empty() {
messages.pop(); // trim the last newline
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not push each individual message in a Vec, and use messages.join("\n"). Popping the new line coming from the last write is a bit of a hack to me.

@generall
Copy link
Member

generall commented Sep 23, 2025

What about making configuration_status more general, so we could also reuse it for other warnings, if needed?

For example, calling it warnings and making it type Option<Vec<String>> to not limit ourselves with API definitions we won't be able to change later

@xzfc
Copy link
Member Author

xzfc commented Sep 24, 2025

Updated. Now CollectionInfo includes warnings: Vec<CollectionWarning> field.
For now, CollectionWarning has only one field: message: String.

@xzfc xzfc requested review from generall and timvisee September 24, 2025 13:38
@xzfc xzfc changed the title Add CollectionInfo::configuration_status field Add CollectionInfo::warnings field Sep 24, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 24, 2025

📝 Walkthrough

Walkthrough

Adds a new CollectionWarning type (message: string) and surfaces it across docs (Markdown/OpenAPI), Protobuf, gRPC Rust types, internal operations/types, and API conversions so CollectionInfo includes repeated warnings. Implements CollectionConfigInternal::get_warnings to produce warnings from config validations, adds Collection::print_warnings and invokes it after create/update/apply_config and metadata updates. Refactors DiffConfig trait and related impls to apply diffs via update/update_opt/to_diff and updates call sites. Centralizes HNSW merge behavior in optimizers/segment logic. Adjusts OptimizersConfig update call patterns. Adds tests asserting configuration warnings.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • generall
  • timvisee
  • agourlay

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title "Add CollectionInfo::warnings field" accurately describes the main change in the changeset. The changeset introduces a new CollectionWarning type and adds a warnings field to CollectionInfo across multiple files including proto definitions, API types, and documentation. This is the central and most visible change from a developer's perspective, as it affects the public API surface of CollectionInfo. The title is concise, clear, and directly captures the primary modification without being overly technical or vague.
Description Check ✅ Passed The pull request description is clearly related to the changeset. It explains that the PR adds CollectionInfo::configuration_status (which became warnings field based on the PR objectives) to contain warnings about collection configuration inconsistencies, specifically checking requirements for the hnsw_config.copy_vectors option. The description also mentions the DiffConfig trait refactoring, which is evident throughout the changeset in files like config_diff.rs where the update method signature is changed and update_opt is added. While the description mentions "configuration_status" rather than "warnings," the PR objectives clarify this was renamed during review, and the core intent is accurately described.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch configuration_status

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Tip

🧪 Early access (models): enabled

We are currently testing Sonnet 4.5 code review models, which should lead to better review quality. However, this model may result in higher noise levels in the review comments. Please disable the early access features if the noise level causes any inconvenience.

Note:

  • Public repositories are always opted into early access features.
  • You can enable or disable early access features from the CodeRabbit UI or by updating the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/collection/src/operations/config_diff.rs (1)

241-267: Bug: invalid From conversion used for max_optimization_threads (will not compile)

From::from cannot convert MaxOptimizationThreads into Option<usize> due to orphan rules. Use an explicit match.

Apply this diff:

 impl DiffConfig<OptimizersConfigDiff> for OptimizersConfig {
     fn update(&self, diff: OptimizersConfigDiff) -> CollectionResult<OptimizersConfig> {
         let OptimizersConfigDiff {
             deleted_threshold,
             vacuum_min_vector_number,
             default_segment_number,
             max_segment_size,
             #[expect(deprecated)]
             memmap_threshold,
             indexing_threshold,
             flush_interval_sec,
             max_optimization_threads,
         } = diff;

         Ok(OptimizersConfig {
             deleted_threshold: deleted_threshold.unwrap_or(self.deleted_threshold),
             vacuum_min_vector_number: vacuum_min_vector_number
                 .unwrap_or(self.vacuum_min_vector_number),
             default_segment_number: default_segment_number.unwrap_or(self.default_segment_number),
             max_segment_size: max_segment_size.or(self.max_segment_size),
             #[expect(deprecated)]
             memmap_threshold: memmap_threshold.or(self.memmap_threshold),
             indexing_threshold: indexing_threshold.or(self.indexing_threshold),
             flush_interval_sec: flush_interval_sec.unwrap_or(self.flush_interval_sec),
-            max_optimization_threads: max_optimization_threads
-                .map_or(self.max_optimization_threads, From::from),
+            max_optimization_threads: match max_optimization_threads {
+                Some(MaxOptimizationThreads::Threads(n)) => Some(n),
+                Some(MaxOptimizationThreads::Auto) => None,
+                None => self.max_optimization_threads,
+            },
         })
     }
 }
🧹 Nitpick comments (9)
docs/redoc/master/openapi.json (2)

6610-6617: Mark warnings as readOnly and add a minimal example

This is a response-only field; marking it readOnly clarifies usage and avoids accidental client sends. An example improves docs UX.

           "warnings": {
             "description": "Warnings related to the collection",
             "default": [],
+            "readOnly": true,
             "type": "array",
+            "minItems": 0,
+            "example": [
+              { "message": "HNSW copy_vectors requires quantization and no multivectors" }
+            ],
             "items": {
               "$ref": "#/components/schemas/CollectionWarning"
             }
           },

6685-6696: Tighten CollectionWarning schema; consider forward‑compat keys

Add additionalProperties: false for strictness and a schema example. Optionally consider a stable code (enum/string) in future for programmatic handling.

       "CollectionWarning": {
         "type": "object",
         "required": [
           "message"
         ],
         "properties": {
           "message": {
             "description": "Warning message",
             "type": "string"
           }
-        }
+        },
+        "additionalProperties": false,
+        "example": { "message": "HNSW copy_vectors requires quantization and no multivectors" }
       },
docs/grpc/docs.md (2)

691-700: Consider future-proofing CollectionWarning

If you anticipate programmatic handling by clients, consider adding a stable code enum (and optional details) alongside message to avoid parsing strings later.


27-27: Generated gRPC docs — exclude from markdownlint or fix the generator

tools/generate_grpc_docs.dockerfile (ENTRYPOINT: protoc --doc_opt=markdown,docs.md) and tools/generate_grpc_docs.sh / tools/generate_grpc_docs.dockerfile.sh (docker run mounting docs/grpc:/out) produce docs/grpc/docs.md; exclude docs/grpc from markdownlint or update the generator to emit MD007-compliant list indentation.

lib/collection/src/operations/types.rs (1)

231-235: Avoid emitting empty warnings arrays in JSON

Add skip_serializing_if = "Vec::is_empty" to reduce response noise when no warnings.

Apply this diff:

     /// Warnings related to the collection
-    #[serde(default)]
+    #[serde(default, skip_serializing_if = "Vec::is_empty")]
     pub warnings: Vec<CollectionWarning>,
tests/openapi/test_vector_config.py (1)

247-279: Ensure cleanup on failures and consider asserting explicit status

  • Wrap create/GET/assertions in try/finally to always drop the collection if an assertion fails.
  • If the API guarantees "warning" as the non-ok value, assert equality for stronger checks; otherwise, keep the current "not ok" check.

Apply this diff to ensure cleanup and, if desired, make the assertion stricter:

 def test_configuration_status(config_name, vector_config, expected_status):
     test_collection = f"test_{config_name}"
-    drop_collection(collection_name=test_collection)
-
-    # Create collection with specified configuration
-    base_vector_config = {"size": 4, "distance": "Dot"}
-    base_vector_config.update(vector_config)
-
-    response = request_with_validation(
-        api="/collections/{collection_name}",
-        method="PUT",
-        path_params={"collection_name": test_collection},
-        body={"vectors": {"test_vector": base_vector_config}},
-    )
-    assert response.ok
-
-    # Check configuration_status
-    response = request_with_validation(
-        api="/collections/{collection_name}",
-        method="GET",
-        path_params={"collection_name": test_collection},
-    )
-    assert response.ok
-
-    configuration_status = response.json()["result"]["configuration_status"]
-
-    if expected_status == "ok":
-        assert configuration_status == "ok"
-    else:
-        assert configuration_status != "ok"
-
-    drop_collection(collection_name=test_collection)
+    drop_collection(collection_name=test_collection)
+    try:
+        # Create collection with specified configuration
+        base_vector_config = {"size": 4, "distance": "Dot"}
+        base_vector_config.update(vector_config)
+
+        response = request_with_validation(
+            api="/collections/{collection_name}",
+            method="PUT",
+            path_params={"collection_name": test_collection},
+            body={"vectors": {"test_vector": base_vector_config}},
+        )
+        assert response.ok
+
+        # Check configuration_status
+        response = request_with_validation(
+            api="/collections/{collection_name}",
+            method="GET",
+            path_params={"collection_name": test_collection},
+        )
+        assert response.ok
+
+        configuration_status = response.json()["result"]["configuration_status"]
+
+        if expected_status == "ok":
+            assert configuration_status == "ok"
+        else:
+            # If API guarantees "warning", prefer strict equality:
+            # assert configuration_status == "warning"
+            assert configuration_status != "ok"
+    finally:
+        drop_collection(collection_name=test_collection)
lib/collection/src/config.rs (1)

279-311: Configuration warnings logic looks correct and targeted

  • Ensures copy_vectors implies quantization enabled.
  • Flags incompatibility with multivectors.
  • Per-vector evaluation via params_iter is appropriate.

Optionally, consider adding codes/enums to warnings to ease client-side handling, but strings are fine if API contracts prefer simplicity.

lib/collection/src/operations/config_diff.rs (2)

276-298: Avoid unwraps in From conversions (optional)

Using to_diff().unwrap() panics on serialization errors. Consider TryFrom for fallible conversions, keeping From only for infallible ones.


236-239: Unclear value of implementing DiffConfig for HnswConfigDiff

Implementing DiffConfig<HnswConfigDiff> for HnswConfigDiff seems unnecessary; consider removing unless used by generics.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ae0e4e4 and 6ca493f.

📒 Files selected for processing (16)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/proto/collections.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (2 hunks)
  • lib/collection/src/collection/collection_ops.rs (5 hunks)
  • lib/collection/src/collection/mod.rs (3 hunks)
  • lib/collection/src/collection/state_management.rs (1 hunks)
  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs (2 hunks)
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (1 hunks)
  • lib/collection/src/config.rs (3 hunks)
  • lib/collection/src/operations/config_diff.rs (8 hunks)
  • lib/collection/src/operations/conversions.rs (5 hunks)
  • lib/collection/src/operations/types.rs (4 hunks)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs (1 hunks)
  • lib/storage/src/content_manager/toc/create_collection.rs (4 hunks)
  • tests/openapi/test_vector_config.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • lib/collection/src/operations/types.rs
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/collection/state_management.rs
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
  • lib/api/src/grpc/qdrant.rs
  • lib/collection/src/config.rs
  • lib/collection/src/collection/collection_ops.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/collection/mod.rs
  • lib/collection/src/operations/config_diff.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • lib/collection/src/operations/types.rs
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/collection/state_management.rs
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
  • lib/api/src/grpc/qdrant.rs
  • lib/collection/src/config.rs
  • lib/collection/src/collection/collection_ops.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/collection/mod.rs
  • lib/collection/src/operations/config_diff.rs
🧠 Learnings (2)
📚 Learning: 2025-09-16T21:39:09.353Z
Learnt from: xzfc
PR: qdrant/qdrant#7232
File: lib/shard/src/segment_holder/mod.rs:1088-1094
Timestamp: 2025-09-16T21:39:09.353Z
Learning: The hnsw_format_force and hnsw_with_vectors flags in HnswGlobalConfig primarily affect existing segment conversion during load/startup, not the initial format selection for newly created empty segments. For new segments, format decisions are made by GraphLinksFormatParam::recommended() based on runtime conditions. Therefore, using HnswGlobalConfig::default() for temporary/proxy segments in SegmentHolder::build_tmp_segment is acceptable.

Applied to files:

  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
📚 Learning: 2025-09-16T19:14:17.614Z
Learnt from: generall
PR: qdrant/qdrant#7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.

Applied to files:

  • lib/api/src/grpc/qdrant.rs
🧬 Code graph analysis (3)
tests/openapi/test_vector_config.py (2)
tests/openapi/helpers/collection_setup.py (1)
  • drop_collection (8-14)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
lib/collection/src/collection/collection_ops.rs (1)
lib/collection/src/collection/mod.rs (1)
  • name (341-343)
lib/collection/src/operations/conversions.rs (1)
lib/collection/src/operations/types.rs (16)
  • from (98-105)
  • from (199-221)
  • from (266-286)
  • from (651-653)
  • from (671-673)
  • from (1165-1170)
  • from (1174-1179)
  • from (1183-1233)
  • from (1237-1239)
  • from (1243-1248)
  • from (1252-1257)
  • from (1261-1266)
  • from (1270-1275)
  • from (1279-1284)
  • from (1288-1293)
  • from (1297-1302)
🪛 markdownlint-cli2 (0.18.1)
docs/grpc/docs.md

27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: e2e-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: lint
🔇 Additional comments (29)
docs/grpc/docs.md (1)

612-612: CollectionInfo.warnings addition looks good

New repeated field is backward compatible and aligns with the new CollectionWarning type.

lib/api/src/grpc/proto/collections.proto (2)

165-168: CollectionWarning message: LGTM

Simple, clear, and matches the public surface in docs.


579-579: CollectionInfo.warnings: LGTM

Field number choice is safe; addition is backward compatible.

lib/collection/src/operations/types.rs (3)

129-137: CollectionWarning struct: LGTM

Derives and anonymization settings are appropriate.


255-255: Populate warnings from config: LGTM

Using collection_config.get_warnings() at construction time makes sense.


279-279: Carry warnings through ShardInfoInternal conversion: LGTM

Keeps API responses consistent across endpoints.

lib/api/src/grpc/qdrant.rs (2)

378-384: Don't hand-edit generated code — proto already defines CollectionWarning

lib/api/src/grpc/proto/collections.proto declares message CollectionWarning (lines 165–167) and CollectionInfo.warnings = 11 (line 579); remove the manual struct in lib/api/src/grpc/qdrant.rs and regenerate prost outputs. Add a proto doc comment in collections.proto if you want a human-facing description.


1243-1244: CollectionInfo.warnings — proto doc & API surface present, OK

  • Proto comment exists: lib/api/src/grpc/proto/collections.proto — repeated CollectionWarning warnings = 11; // Warnings related to the collection
  • OpenAPI includes the field and description: docs/redoc/master/openapi.json ("warnings": { "description": "Warnings related to the collection").
  • Wire-compatibility unchanged (new field, unique tag 11).
tests/openapi/test_vector_config.py (1)

224-246: Good parametrized coverage for copy_vectors constraints

Covers the key scenarios (no copy, valid, missing quantization, multivector). Looks solid.

lib/storage/src/content_manager/toc/create_collection.rs (4)

167-175: Switch to update_opt for base configs is correct

Using update_opt for WAL, optimizers, and HNSW merges diffs with storage defaults cleanly and avoids repetitive matches.


193-194: Correct DiffConfig direction for strict mode

Using default_config.update(diff)? matches the new API and avoids previously inverted semantics.


211-211: Propagate updated optimizer_config

Using the merged optimizer_config variable in CollectionConfigInternal is correct and keeps the effective config consistent.


252-253: Log warnings after creation

Calling collection.print_warnings().await? surfaces configuration issues early. Confirm we don’t log the same warnings again immediately in later steps to avoid duplicates.

lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (1)

251-253: Use collection_hnsw.update_opt(param_hnsw) for effective per-vector config

This centralizes HNSW merging and keeps optimizer behavior consistent with the new DiffConfig semantics.

lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs (2)

116-121: Align mismatch target with new HNSW merge direction

Computing target_hnsw via self.hnsw_config.update_opt(vector-level override) matches the refactor and should yield correct mismatch detection.


580-581: Tests updated to reflect new merge order

Asserting HNSW configs using hnsw_config_collection.update(hnsw_config_vectorX)? matches the new direction. Good.

Also applies to: 585-586

lib/collection/src/collection/collection_ops.rs (2)

37-37: Consistent use of config.update(...) across params, HNSW, optimizers, and strict mode

Matches the new DiffConfig API and keeps update flows uniform.

Also applies to: 54-54, 107-107, 189-189


409-415: New print_warnings API

Simple and effective logging of config warnings by collection. Consider calling it after config updates where relevant (it appears wired in storage flows).

lib/collection/src/collection/mod.rs (1)

135-136: Apply optimizer overwrites via update(...) consistently

Using update(...) for effective_optimizers_config at creation, load, and query time keeps behavior consistent and future-proof with DiffConfig changes.

Also applies to: 256-258, 851-851

lib/collection/src/config.rs (1)

450-456: HNSW update types are correct — no change required

VectorParamsDiff.hnsw_config is Option, and HnswConfigDiff implements DiffConfig, so existing_hnsw.update(hnsw_diff)? returns an HnswConfigDiff and the assignment is type-correct.

lib/collection/src/operations/conversions.rs (6)

53-56: Publicly exposing CollectionWarning: LGTM

Import and surface area update looks correct.


383-384: End-to-end warnings propagation: LGTM

Destructuring internal warnings and mapping to gRPC with explicit From conversions is correct and follows the style guide.

Also applies to: 539-543


547-552: CollectionWarning -> gRPC conversion: LGTM

Simple and explicit mapping; no concerns.


554-559: gRPC -> CollectionWarning conversion: LGTM

Bidirectional mapping is symmetric and correct.


868-901: Parsing warnings in GetCollectionInfoResponse: LGTM

Gracefully handles missing config/optimizer status and preserves warnings round-trip.


510-514: Confirmed: impl From<Option> for api::grpc::qdrant::MaxOptimizationThreads exists
Implemented in lib/api/src/grpc/conversions.rs (impl From<Option> for MaxOptimizationThreads at ~line 2105).

lib/collection/src/operations/config_diff.rs (3)

23-49: DiffConfig API refactor: LGTM

New update/update_opt/to_diff surface is clear and consistent with call sites.


270-273: WalConfig/CollectionParams DiffConfig impls: LGTM

Default update/to_diff behavior is appropriate here.


424-424: Tests updated to new update signature: LGTM

Tests now call config.update(diff) consistently and cover edge cases.

Also applies to: 436-436, 453-453, 477-477, 486-486

Comment on lines +176 to +177
self.print_warnings().await?;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Don’t fail config application on warning logging

Warnings are non-critical; a failure in print_warnings() shouldn’t abort apply_config.

Apply this diff to make logging best-effort:

-        self.print_warnings().await?;
+        if let Err(err) = self.print_warnings().await {
+            log::warn!("Failed to print collection warnings for {}: {}", self.id, err);
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
self.print_warnings().await?;
if let Err(err) = self.print_warnings().await {
log::warn!("Failed to print collection warnings for {}: {}", self.id, err);
}
🤖 Prompt for AI Agents
In lib/collection/src/collection/state_management.rs around lines 176-177, the
call self.print_warnings().await? currently propagates errors and can abort
apply_config on non-critical warning logging; change it to make warning printing
best-effort by awaiting and handling errors locally (e.g., if the future returns
Err, log or warn about the logging failure but do not return the error), so
apply_config continues even if print_warnings() fails.

Comment on lines +182 to +183
collection.print_warnings().await?;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Make printing warnings non-fatal during updates

Don’t fail UpdateCollection if printing warnings errors out.

Apply this diff:

-        collection.print_warnings().await?;
+        if let Err(err) = collection.print_warnings().await {
+            log::warn!(
+                "Failed to print collection warnings for {}: {}",
+                operation.collection_name,
+                err
+            );
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
collection.print_warnings().await?;
if let Err(err) = collection.print_warnings().await {
log::warn!(
"Failed to print collection warnings for {}: {}",
operation.collection_name,
err
);
}
🤖 Prompt for AI Agents
In lib/storage/src/content_manager/toc/collection_meta_ops.rs around lines 182
to 183, calling collection.print_warnings().await currently propagates any error
and fails UpdateCollection; change this so printing warnings is non-fatal by
awaiting the call and swallowing or logging the error instead of returning
it—e.g., await print_warnings, match the Result and log a warning on Err (or
assign to let _ = ...), but do not propagate the error so UpdateCollection
continues.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
docs/grpc/docs.md (2)

27-27: Fix markdown list indentation (MD007).

Indentation should be 2 spaces for nested list items.

-    - [CollectionWarning](#qdrant-CollectionWarning)
+  - [CollectionWarning](#qdrant-CollectionWarning)

691-700: Document the message field.

The description cell for message is empty. Consider adding a short description in the proto (it will propagate here), e.g., “Human‑readable warning message about collection configuration.”

lib/collection/src/operations/types.rs (1)

231-234: Optionally omit empty arrays in REST JSON.

To reduce noise, skip serializing empty warnings in REST responses.

     /// Warnings related to the collection
-    #[serde(default)]
+    #[serde(default, skip_serializing_if = "Vec::is_empty")]
     pub warnings: Vec<CollectionWarning>,
lib/api/src/grpc/qdrant.rs (1)

378-384: Add machine-readable structure to warnings (proto-level, not here).

This generated block looks fine. For future-proof clients, consider adding optional, stable fields (e.g., code, level, context) alongside message in the proto, so consumers don’t have to parse free-form text.

Apply in lib/api/src/grpc/proto/collections.proto (then re-generate):

 message CollectionWarning {
-  string message = 1;
+  // Human-readable message
+  string message = 1;
+  // Stable machine-readable identifier, e.g. "HNSW_COPY_VECTORS_REQUIRES_QUANT"
+  string code = 2;
+  // Optional severity indicator (INFO by default)
+  enum Level { INFO = 0; WARNING = 1; ERROR = 2; }
+  Level level = 3;
+  // Optional structured details (free-form key/value)
+  map<string, string> context = 4;
 }

Note: don’t modify this generated file directly; update the .proto and re-generate.

tests/openapi/test_vector_config.py (1)

247-274: Consider adding a case for collection-level quantization

Add a variant where quantization is set at collection level (not per-vector) with copy_vectors=true to assert zero warnings in that scenario.

lib/collection/src/config.rs (1)

279-311: Future-proof warnings with typed codes (optional)

Current free-form messages work, but adding a code field (e.g., enum) would enable stable client handling, localization, and dedup without parsing strings.

lib/collection/src/collection/collection_ops.rs (1)

409-415: Avoid logging loop when there are no warnings (tiny nit)

Fast-return if the list is empty to skip the loop.

 pub async fn print_warnings(&self) -> CollectionResult<()> {
-        let warnings = self.collection_config.read().await.get_warnings();
-        for warning in warnings {
+        let warnings = self.collection_config.read().await.get_warnings();
+        if warnings.is_empty() {
+            return Ok(());
+        }
+        for warning in warnings {
             log::warn!("Collection {}: {}", self.name(), warning.message);
         }
         Ok(())
     }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ca493f and 879b85f.

📒 Files selected for processing (12)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/proto/collections.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (2 hunks)
  • lib/collection/src/collection/collection_ops.rs (5 hunks)
  • lib/collection/src/collection/state_management.rs (1 hunks)
  • lib/collection/src/config.rs (3 hunks)
  • lib/collection/src/operations/conversions.rs (5 hunks)
  • lib/collection/src/operations/types.rs (4 hunks)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs (1 hunks)
  • lib/storage/src/content_manager/toc/create_collection.rs (4 hunks)
  • tests/openapi/test_vector_config.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • lib/api/src/grpc/proto/collections.proto
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/collection/state_management.rs
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • docs/redoc/master/openapi.json
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/collection/src/operations/types.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/config.rs
  • lib/collection/src/collection/collection_ops.rs
  • lib/api/src/grpc/qdrant.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/collection/src/operations/types.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/config.rs
  • lib/collection/src/collection/collection_ops.rs
  • lib/api/src/grpc/qdrant.rs
🧠 Learnings (1)
📚 Learning: 2025-09-16T19:14:17.614Z
Learnt from: generall
PR: qdrant/qdrant#7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.

Applied to files:

  • lib/api/src/grpc/qdrant.rs
🧬 Code graph analysis (3)
lib/collection/src/operations/types.rs (1)
lib/segment/src/common/anonymize.rs (13)
  • anonymize (45-45)
  • anonymize (49-51)
  • anonymize (55-57)
  • anonymize (61-63)
  • anonymize (67-71)
  • anonymize (75-79)
  • anonymize (112-116)
  • anonymize (120-124)
  • anonymize (128-132)
  • anonymize (136-145)
  • anonymize (149-151)
  • anonymize (155-159)
  • anonymize (163-178)
lib/collection/src/operations/conversions.rs (1)
lib/collection/src/operations/types.rs (16)
  • from (98-105)
  • from (199-221)
  • from (266-286)
  • from (651-653)
  • from (671-673)
  • from (1165-1170)
  • from (1174-1179)
  • from (1183-1233)
  • from (1237-1239)
  • from (1243-1248)
  • from (1252-1257)
  • from (1261-1266)
  • from (1270-1275)
  • from (1279-1284)
  • from (1288-1293)
  • from (1297-1302)
tests/openapi/test_vector_config.py (2)
tests/openapi/helpers/collection_setup.py (1)
  • drop_collection (8-14)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
🪛 markdownlint-cli2 (0.18.1)
docs/grpc/docs.md

27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: test-consistency
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: e2e-tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests
  • GitHub Check: lint
  • GitHub Check: storage-compat-test
🔇 Additional comments (12)
docs/grpc/docs.md (1)

612-612: CollectionInfo.warnings addition looks good.

Field type and wording are clear and consistent with the new message type.

lib/collection/src/operations/types.rs (3)

129-137: New CollectionWarning type: LGTM.

Derives and anonymization semantics are appropriate.


251-256: Initialization: LGTM.

Populating warnings from collection_config.get_warnings() on empty info is correct.


265-281: Shard -> CollectionInfo conversion: LGTM.

Propagating warnings via config.get_warnings() is consistent with the new API.

lib/api/src/grpc/qdrant.rs (1)

1243-1246: Approve — warnings field added and fully wired (conversions, OpenAPI, tests verified)

Proto defines CollectionWarning and repeated warnings = 11; Rust conversions map warnings (From impls and TryFrom in lib/collection/src/operations/conversions.rs); OpenAPI/docs include the field (docs/redoc/master/openapi.json) and tests reference it (tests/openapi/test_vector_config.py).

tests/openapi/test_vector_config.py (1)

224-246: Good parameterized coverage for copy_vectors constraints

The matrix covers the key cases (no copy, valid with quantization, missing quantization, and multivector conflict) and asserts response schema compliance.

lib/collection/src/config.rs (2)

279-311: get_warnings logic is clear and correct

Merges per-vector HNSW over collection defaults, checks both collection- and vector-level quantization, and flags multivector incompatibility. Nicely scoped.


452-452: Correct update direction for HNSW diffs

Switching to existing_hnsw.update(hnsw_diff)? aligns with the refactored DiffConfig API and prevents accidental overwrite semantics.

lib/collection/src/collection/collection_ops.rs (1)

37-38: Consistent use of new DiffConfig::update API

Applying diffs onto the existing config (rather than the reverse) is consistent across params, HNSW, optimizers, and strict mode updates.

Also applies to: 54-55, 107-108, 189-190

lib/collection/src/operations/conversions.rs (3)

539-543: Wire warnings through CollectionInfo → gRPC conversion

Carries internal warnings to the API contract using explicit From, matching the new schema.


547-559: Bidirectional conversions for CollectionWarning

Straightforward and explicit conversions; keeps the type surface consistent between internal and API.


869-904: Parse warnings from GetCollectionInfoResponse

Correctly reconstructs internal warnings from the gRPC response.

@xzfc xzfc force-pushed the configuration_status branch from 879b85f to 47cf1fc Compare September 25, 2025 01:11
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
lib/collection/src/collection/collection_ops.rs (1)

409-415: LGTM: centralized warning emission

Simple, side‑effect‑only logging. Consider keeping this infallible end‑to‑end (see call sites).

tests/openapi/test_vector_config.py (1)

224-274: Approve configuration warning tests

Parametrization covers expected warning counts and confirmed “max_sim” matches the OpenAPI schema. Optional: add assertions on warning message contents to catch regressions.

lib/collection/src/collection/mod.rs (1)

849-851: Accessor updated cleanly

Propagating errors from update here is appropriate; no need to switch to update_opt unless you want a best-effort fallback.

If you prefer a best-effort fallback without bubbling errors in this accessor, you could use update_opt on the read guard:

-            Ok(config.optimizer_config.update(optimizers_overwrite)?)
+            Ok(config.optimizer_config.update_opt(Some(optimizers_overwrite)))

Note this changes error handling semantics.

lib/collection/src/operations/config_diff.rs (2)

23-48: Trait refactor looks good; consider marking results as must-use

The API shape is clear and consistent. To prevent accidental dropping of returned updated configs, consider annotating update and update_opt with #[must_use].

-    fn update(&self, diff: Diff) -> CollectionResult<Self> {
+    #[must_use]
+    fn update(&self, diff: Diff) -> CollectionResult<Self> {
         update_config(self, diff)
     }
 
-    fn update_opt(&self, diff: Option<Diff>) -> Self {
+    #[must_use]
+    fn update_opt(&self, diff: Option<Diff>) -> Self {
         match diff {
             Some(diff) => self.update(diff).unwrap_or_else(|e| {
                 log::warn!("Failed to apply config diff: {e}");
                 self.clone()
             }),
             None => self.clone(),
         }
     }

276-298: Using to_diff().unwrap() inside From impls

Acceptable given serialization should not fail, but consider expect(...) for a clearer panic message if it ever does.

-        config.to_diff().unwrap()
+        config.to_diff().expect("converting full config to its diff form should be infallible")
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 879b85f and 47cf1fc.

📒 Files selected for processing (16)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/proto/collections.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (2 hunks)
  • lib/collection/src/collection/collection_ops.rs (5 hunks)
  • lib/collection/src/collection/mod.rs (3 hunks)
  • lib/collection/src/collection/state_management.rs (1 hunks)
  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs (2 hunks)
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (1 hunks)
  • lib/collection/src/config.rs (3 hunks)
  • lib/collection/src/operations/config_diff.rs (8 hunks)
  • lib/collection/src/operations/conversions.rs (5 hunks)
  • lib/collection/src/operations/types.rs (4 hunks)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs (1 hunks)
  • lib/storage/src/content_manager/toc/create_collection.rs (4 hunks)
  • tests/openapi/test_vector_config.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • lib/collection/src/operations/types.rs
  • lib/api/src/grpc/proto/collections.proto
  • lib/collection/src/operations/conversions.rs
  • lib/api/src/grpc/qdrant.rs
  • docs/redoc/master/openapi.json
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/collection/src/collection/state_management.rs
  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
  • lib/collection/src/collection/mod.rs
  • lib/collection/src/config.rs
  • lib/collection/src/operations/config_diff.rs
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
  • lib/collection/src/collection/collection_ops.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/collection/src/collection/state_management.rs
  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
  • lib/collection/src/collection/mod.rs
  • lib/collection/src/config.rs
  • lib/collection/src/operations/config_diff.rs
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
  • lib/collection/src/collection/collection_ops.rs
🧠 Learnings (1)
📚 Learning: 2025-09-16T21:39:09.353Z
Learnt from: xzfc
PR: qdrant/qdrant#7232
File: lib/shard/src/segment_holder/mod.rs:1088-1094
Timestamp: 2025-09-16T21:39:09.353Z
Learning: The hnsw_format_force and hnsw_with_vectors flags in HnswGlobalConfig primarily affect existing segment conversion during load/startup, not the initial format selection for newly created empty segments. For new segments, format decisions are made by GraphLinksFormatParam::recommended() based on runtime conditions. Therefore, using HnswGlobalConfig::default() for temporary/proxy segments in SegmentHolder::build_tmp_segment is acceptable.

Applied to files:

  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
🧬 Code graph analysis (2)
tests/openapi/test_vector_config.py (2)
tests/openapi/helpers/collection_setup.py (1)
  • drop_collection (8-14)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
lib/collection/src/collection/collection_ops.rs (2)
src/settings.rs (1)
  • log (347-352)
lib/collection/src/collection/mod.rs (1)
  • name (341-343)
🪛 markdownlint-cli2 (0.18.1)
docs/grpc/docs.md

27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-consensus-compose
  • GitHub Check: e2e-tests
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: lint
🔇 Additional comments (23)
lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (1)

256-258: LGTM: consistent HNSW merge via update_opt

Using collection_hnsw.update_opt(param_hnsw) correctly applies vector-level diffs over collection defaults and aligns with other call sites.

lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs (2)

115-121: LGTM: target HNSW computed with collection defaults + per‑vector diff

self.hnsw_config.update_opt(...) ensures mismatch checks compare against the effective config used during (re)build.


580-586: LGTM: tests reflect new update order

Assertions against hnsw_config_collection.update(diff).unwrap() match the new composition semantics.

lib/collection/src/collection/collection_ops.rs (4)

37-38: LGTM: apply params diffs on target object

Assigning config.params = config.params.update(params_diff)?; follows the new DiffConfig API cleanly.


54-55: LGTM: in-place HNSW update

Consistent use of update(...) on the current config.


107-108: LGTM: optimizer config update

Matches the diff-application pattern across the codebase.


189-193: LGTM: strict mode update

Updating the existing config instance avoids clobbering unrelated fields.

docs/grpc/docs.md (2)

612-613: LGTM: warnings field added to CollectionInfo

New warnings: [CollectionWarning] is documented and clearly described.


691-700: LGTM: CollectionWarning type documented

Minimal shape (message: string) keeps API future‑proof.

lib/collection/src/collection/state_management.rs (1)

176-177: Don’t fail apply_config on warning logging

Printing warnings is non‑critical. Make it best‑effort to avoid aborting config application.

-        self.print_warnings().await?;
+        if let Err(err) = self.print_warnings().await {
+            log::warn!("Failed to print collection warnings for {}: {}", self.id, err);
+        }
lib/storage/src/content_manager/toc/create_collection.rs (2)

167-175: LGTM: switch to update_opt for WAL/HNSW/optimizers

Cleaner merge of diffs with defaults. Matches the DiffConfig refactor.


193-194: LGTM: strict mode composition order

default_config.update(diff)? applies overrides over defaults as intended.

lib/collection/src/config.rs (2)

279-311: Warnings logic for copy_vectors is sound

Merges per-vector HNSW diffs over collection defaults correctly and flags both quantization absence and multivector incompatibility. Nice use of update_opt here.


452-455: Correct migration to new DiffConfig semantics

Switching to existing_hnsw.update(hnsw_diff)? is the right direction with the refactored API.

lib/collection/src/collection/mod.rs (2)

133-136: Optimizers overwrite application updated correctly

Using update(optimizers_overwrite)? on the base config aligns with the new DiffConfig API.


255-258: Load path uses new update API correctly

Maintains previous behavior while adopting the refactor. Good.

lib/collection/src/operations/config_diff.rs (7)

236-239: HnswConfig implements DiffConfig against HnswConfigDiff correctly

This enables both full and diff objects to support the same update flow.


240-268: OptimizersConfig update handles nuanced merge semantics

Field-wise merges preserve prior values when diff is None and respect deprecations. Good use of From::from for MaxOptimizationThreads.

Please confirm From for Option (or equivalent) exists; otherwise this line will fail to compile:

max_optimization_threads.map_or(self.max_optimization_threads, From::from)

424-429: Tests updated to new update() flow

Assertions still verify intended semantics.


435-437: HNSW update test migrated correctly

Covers the basic field override path.


451-455: Optimizer update test migrated correctly

Validates indexing_threshold merge behavior.


476-480: Optimizer threads cases cover number/auto/null/nothing

Good coverage of deserialization and merge semantics.


485-488: WAL config test migrated correctly

Verifies single-field merge.

- [CollectionOperationResponse](#qdrant-CollectionOperationResponse)
- [CollectionParams](#qdrant-CollectionParams)
- [CollectionParamsDiff](#qdrant-CollectionParamsDiff)
- [CollectionWarning](#qdrant-CollectionWarning)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

🧩 Analysis chain

Fix markdown list indentation (MD007)

Indentation in the ToC item for CollectionWarning violates MD007 (expected 2, got 4). Either adjust indentation or disable MD007 for generated docs.

If this file is generated, prefer disabling the rule for this path in your lint config to avoid churn.


🏁 Script executed:

#!/bin/bash
# Show if this file is generated
sed -n '1,5p' docs/grpc/docs.md
# Show the problematic line
sed -n '27p' docs/grpc/docs.md

Length of output: 119


Fix markdown list indentation (MD007)
docs/grpc/docs.md line 27: indentation of the ToC entry for CollectionWarning is 4 spaces; reduce to 2.

🧰 Tools
🪛 markdownlint-cli2 (0.18.1)

27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

🤖 Prompt for AI Agents
In docs/grpc/docs.md around line 27, the ToC entry for CollectionWarning is
indented with 4 spaces causing MD007 lint failure; change the indentation to 2
spaces for that single list item so the markdown list aligns with the
surrounding entries (replace the 4-space indent with a 2-space indent on line
27).

Comment on lines +252 to +253
collection.print_warnings().await?;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Don’t fail collection creation on warning logging

Warning emission should not block collection creation.

-        collection.print_warnings().await?;
+        if let Err(err) = collection.print_warnings().await {
+            log::warn!(
+                "Failed to print collection warnings for {}: {}",
+                collection_name,
+                err
+            );
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
collection.print_warnings().await?;
if let Err(err) = collection.print_warnings().await {
log::warn!(
"Failed to print collection warnings for {}: {}",
collection_name,
err
);
}
🤖 Prompt for AI Agents
In lib/storage/src/content_manager/toc/create_collection.rs around lines
252-253, the call to collection.print_warnings().await currently uses the ?
operator which causes collection creation to fail when only warning logging
fails; change this so warning emission is non-fatal by catching any error from
print_warnings().await and logging it (e.g., with warn! or processLogger
equivalent) instead of returning Err — allow the function to continue and
succeed even if printing warnings fails.

@xzfc xzfc force-pushed the configuration_status branch from 47cf1fc to d8d2038 Compare September 25, 2025 09:05
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (6)
lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (1)

252-268: Avoid double get_params lookup per vector.

Minor micro‑opt: fetch params once and reuse for both HNSW and quantization.

Apply this diff within the closure:

-                let param_hnsw = collection_params
-                    .vectors
-                    .get_params(vector_name)
-                    .and_then(|params| params.hnsw_config);
-                let vector_hnsw = collection_hnsw.update_opt(param_hnsw);
+                let params = collection_params.vectors.get_params(vector_name);
+                let vector_hnsw = collection_hnsw.update_opt(params.and_then(|p| p.hnsw_config));
                 config.index = Indexes::Hnsw(vector_hnsw);

                 // Assign quantization config
-                let param_quantization = collection_params
-                    .vectors
-                    .get_params(vector_name)
-                    .and_then(|params| params.quantization_config.as_ref());
-                let vector_quantization = param_quantization
+                let vector_quantization = params
+                    .and_then(|p| p.quantization_config.as_ref())
                     .or(collection_quantization.as_ref())
                     .cloned();
                 config.quantization_config = vector_quantization;
lib/collection/src/operations/types.rs (1)

232-235: Optionally omit empty warnings in REST responses

To reduce payload noise, consider skipping serialization when there are no warnings.

-    /// Warnings related to the collection
-    #[serde(default)]
-    pub warnings: Vec<CollectionWarning>,
+    /// Warnings related to the collection
+    #[serde(default, skip_serializing_if = "Vec::is_empty")]
+    pub warnings: Vec<CollectionWarning>,
lib/collection/src/collection/mod.rs (1)

256-258: Align error handling style (optional)

Load path uses expect(...) while new() propagates errors. Consider propagating here too for consistency, unless panicking on invalid overwrite during load is intentional.

lib/collection/src/operations/config_diff.rs (3)

35-43: update_opt silently swallows errors; consider surfacing or metering failures

Returning self on error is convenient but can hide misconfigurations. At minimum, emit a counter/metric for failed applications; alternatively, add an update_opt_with(handler) variant to let callers react.


276-298: Avoid unwrap() in From<...> → ...Diff conversions

unwrap() in From impls can panic; prefer expect() with context or provide TryFrom impls and migrate call sites.

Apply this diff for clearer failure context:

 impl From<HnswConfig> for HnswConfigDiff {
     fn from(config: HnswConfig) -> Self {
-        config.to_diff().unwrap()
+        config
+            .to_diff()
+            .expect("HnswConfig::to_diff() should be infallible for compatible schema")
     }
 }
 
 impl From<OptimizersConfig> for OptimizersConfigDiff {
     fn from(config: OptimizersConfig) -> Self {
-        config.to_diff().unwrap()
+        config
+            .to_diff()
+            .expect("OptimizersConfig::to_diff() should be infallible for compatible schema")
     }
 }
 
 impl From<WalConfig> for WalConfigDiff {
     fn from(config: WalConfig) -> Self {
-        config.to_diff().unwrap()
+        config
+            .to_diff()
+            .expect("WalConfig::to_diff() should be infallible for compatible schema")
     }
 }
 
 impl From<CollectionParams> for CollectionParamsDiff {
     fn from(config: CollectionParams) -> Self {
-        config.to_diff().unwrap()
+        config
+            .to_diff()
+            .expect("CollectionParams::to_diff() should be infallible for compatible schema")
     }
 }

433-436: Tests: HNSW update uses new API correctly

Covers the basic path; consider adding a negative test (invalid ef_construct) to assert error propagation.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 47cf1fc and d8d2038.

📒 Files selected for processing (16)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/proto/collections.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (2 hunks)
  • lib/collection/src/collection/collection_ops.rs (5 hunks)
  • lib/collection/src/collection/mod.rs (3 hunks)
  • lib/collection/src/collection/state_management.rs (1 hunks)
  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs (2 hunks)
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (1 hunks)
  • lib/collection/src/config.rs (3 hunks)
  • lib/collection/src/operations/config_diff.rs (8 hunks)
  • lib/collection/src/operations/conversions.rs (5 hunks)
  • lib/collection/src/operations/types.rs (4 hunks)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs (1 hunks)
  • lib/storage/src/content_manager/toc/create_collection.rs (4 hunks)
  • tests/openapi/test_vector_config.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • lib/api/src/grpc/proto/collections.proto
  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
  • lib/collection/src/collection/collection_ops.rs
  • tests/openapi/test_vector_config.py
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/collection/state_management.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/collection/src/operations/types.rs
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
  • lib/collection/src/operations/config_diff.rs
  • lib/collection/src/collection/mod.rs
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • lib/collection/src/config.rs
  • lib/collection/src/operations/conversions.rs
  • lib/api/src/grpc/qdrant.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/collection/src/operations/types.rs
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
  • lib/collection/src/operations/config_diff.rs
  • lib/collection/src/collection/mod.rs
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • lib/collection/src/config.rs
  • lib/collection/src/operations/conversions.rs
  • lib/api/src/grpc/qdrant.rs
🧠 Learnings (1)
📚 Learning: 2025-09-16T19:14:17.614Z
Learnt from: generall
PR: qdrant/qdrant#7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.

Applied to files:

  • lib/api/src/grpc/qdrant.rs
🧬 Code graph analysis (2)
lib/collection/src/operations/types.rs (1)
lib/segment/src/common/anonymize.rs (13)
  • anonymize (45-45)
  • anonymize (49-51)
  • anonymize (55-57)
  • anonymize (61-63)
  • anonymize (67-71)
  • anonymize (75-79)
  • anonymize (112-116)
  • anonymize (120-124)
  • anonymize (128-132)
  • anonymize (136-145)
  • anonymize (149-151)
  • anonymize (155-159)
  • anonymize (163-178)
lib/collection/src/operations/conversions.rs (1)
lib/collection/src/operations/types.rs (16)
  • from (99-106)
  • from (200-222)
  • from (267-287)
  • from (652-654)
  • from (672-674)
  • from (1166-1171)
  • from (1175-1180)
  • from (1184-1234)
  • from (1238-1240)
  • from (1244-1249)
  • from (1253-1258)
  • from (1262-1267)
  • from (1271-1276)
  • from (1280-1285)
  • from (1289-1294)
  • from (1298-1303)
🪛 markdownlint-cli2 (0.18.1)
docs/grpc/docs.md

27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: lint
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: e2e-tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-consistency
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
🔇 Additional comments (29)
lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (1)

252-258: Good centralization via update_opt for HNSW merge.

This cleanly applies per‑vector HNSW diffs over collection defaults. Please confirm HnswConfig::update_opt(None) returns an unchanged clone of the base config to preserve previous fallback behavior.

lib/api/src/grpc/qdrant.rs (2)

378-384: New CollectionWarning message looks good

Definition and derives align with the rest of the generated API surface.


1243-1245: Adding warnings to CollectionInfo is consistent

Repeated field at tag 11 matches the proto change and will default to an empty list when absent.

lib/storage/src/content_manager/toc/collection_meta_ops.rs (1)

182-183: Make warning emission best-effort

Failing the whole update because print_warnings() can’t flush logs/IO is too aggressive; the update should still succeed even if we can’t emit warnings.

-        collection.print_warnings().await?;
+        if let Err(err) = collection.print_warnings().await {
+            log::warn!(
+                "Failed to print collection warnings for {}: {}",
+                operation.collection_name,
+                err
+            );
+        }
docs/redoc/master/openapi.json (2)

6610-6617: Docs update matches API change

OpenAPI now exposes warnings exactly as implemented in the backend—thanks for keeping the contract and docs in sync.


6686-6695: New schema definition looks correct

CollectionWarning mirrors the API payload shape and keeps room for future fields. No issues here.

lib/collection/src/operations/types.rs (3)

130-138: New CollectionWarning type looks good

Derives and anonymization settings are appropriate for user-visible messages.


256-257: Initialize warnings from config

Good: ensures CollectionInfo always reflects current config issues out of the box.


280-281: Populate warnings in shard-to-collection conversion

Correctly sources warnings from the effective config.

lib/collection/src/collection/mod.rs (2)

135-136: Adopt new DiffConfig::update usage correctly

Applying optimizers_overwrite via effective_optimizers_config.update(...) is correct with the refactored API.


850-851: Effective config accessor updated correctly

Returns the overwrite-applied config via the new update API.

lib/collection/src/config.rs (3)

25-27: Importing CollectionWarning for warnings plumbing

Looks correct and localizes the warning type to ops types.


279-311: Config warnings logic is solid

  • Merges per-vector HNSW with collection HNSW via update_opt before checks.
  • Correctly requires quantization and rejects multivectors when copy_vectors is enabled.
  • Message formatting avoids newlines with \ continuation, nice.

No changes requested.


450-456: Apply HNSW diffs onto existing config (direction fix)

Switching to existing_hnsw.update(hnsw_diff)? is the right semantics with the new DiffConfig API.

docs/grpc/docs.md (3)

27-27: Fix markdown list indentation (MD007) or disable for generated docs

Indentation of the ToC entry for CollectionWarning violates MD007 (expected 2, got 4). If this file is generated, prefer disabling the rule for this path to avoid churn.

-    - [CollectionWarning](#qdrant-CollectionWarning)
+  - [CollectionWarning](#qdrant-CollectionWarning)

612-613: CollectionInfo.warnings doc entry added correctly

Repeated field and description align with API changes.


691-700: CollectionWarning section looks correct

Type and field docs match implementation.

lib/collection/src/operations/conversions.rs (5)

53-56: LGTM: new CollectionWarning import is correct

Brings the internal type into scope for bi‑directional conversions.


378-384: Propagate warnings in CollectionInfo → gRPC: mapping looks correct

Destructuring and mapping each internal warning to its gRPC counterpart is sound.

Please add/confirm a round‑trip test asserting that:

  • internal CollectionInfo with warnings serializes to gRPC and back without loss or reordering.

Also applies to: 539-543


547-552: LGTM: From (internal → gRPC)

Explicit field mapping matches the proto.


554-559: LGTM: FromgRPC::CollectionWarning (gRPC → internal)

Symmetric conversion; future fields will surface as compile‑time breaks, which is desirable.


860-870: LGTM: warnings extracted from GetCollectionInfoResponse

Correctly maps repeated warnings from API to internal model.

Consider adding a test to cover:

  • Empty warnings array deserializes to an empty Vec (not None).
  • Multiple warnings preserve order.

Also applies to: 899-900

lib/collection/src/operations/config_diff.rs (7)

23-48: New DiffConfig API (update/update_opt/to_diff) is coherent

Switching to config.update(diff) improves ergonomics; bounds look appropriate.


241-267: LGTM: OptimizersConfig.update merges fields correctly

  • Scalar fields use unwrap_or(self.*)
  • Option fields use or(self.*)
  • max_optimization_threads handles “auto” via From mapping

270-273: LGTM: DiffConfig impls for WalConfig and CollectionParams

Keeps behavior consistent via the default JSON-merge path.


424-429: Tests: updated API usage is correct

Switch to params.update(diff) matches the new trait and keeps assertions intact.


451-455: Tests: Optimizers update path is valid

Good coverage for single-field diff.


476-480: Tests: max_optimization_threads cases are clear

Nice coverage for number/auto/null/invalid variants.


484-488: Tests: WAL update path is valid

Confirms default impl works with JSON merge.

#[serde(rename_all = "snake_case")]
pub struct CollectionWarning {
/// Warning message
#[anonymize(false)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Message can contain names of the vectors, I would rather anonymise it.

/// Status of optimizers
pub optimizer_status: OptimizersStatus,
/// Warnings related to the collection
#[serde(default)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's skip serialization if there are no warning. Otherwise it might be unnecessary breaking change for old clients

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Do clients break on unknown/new fields?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do clients break on unknown/new fields?

maybe. Some old versions

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added skip, tho I believe this approach is a fragile workaround.


  • It seems that the python client ≤v1.7.3 will crash producing this error message:
      File "pydantic/main.py", line 347, in pydantic.main.BaseModel.__init__
    pydantic.error_wrappers.ValidationError: 4 validation errors for ParsingModel[InlineResponse2005] (for parse_as_type)
    obj -> result -> config -> optimizer_config -> max_optimization_threads
      none is not an allowed value (type=type_error.none.not_allowed)
    obj -> result -> config -> wal_config -> wal_retain_closed
      extra fields not permitted (type=value_error.extra)
    obj -> result -> config -> strict_mode_config
      extra fields not permitted (type=value_error.extra)
    obj -> result -> warnings
      extra fields not permitted (type=value_error.extra)
    
  • Although it will still crash regardless of this particular new filed.
  • Doesn't crash starting from ≥v1.8.0. (properly ignores unknown fields)
  • I didn't check other language clients.

@xzfc xzfc force-pushed the configuration_status branch from d8d2038 to 6d2b4ac Compare September 29, 2025 12:10
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
lib/collection/src/collection/collection_ops.rs (1)

409-415: Make warning emission infallible (or document non-fatal semantics)

print_warnings only logs; consider returning () and guaranteeing no Err, or keep Result but ensure callers never fail on its error.

lib/collection/src/operations/types.rs (1)

233-234: Consider skipping serialization when warnings are empty.

Based on past review feedback, consider adding #[serde(skip_serializing_if = "Vec::is_empty")] to avoid potential breaking changes for older clients that might not expect this field.

     /// Warnings related to the collection
-    #[serde(default)]
+    #[serde(default, skip_serializing_if = "Vec::is_empty")]
     pub warnings: Vec<CollectionWarning>,
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8d2038 and 6d2b4ac.

📒 Files selected for processing (12)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/proto/collections.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (2 hunks)
  • lib/collection/src/collection/collection_ops.rs (5 hunks)
  • lib/collection/src/collection/state_management.rs (1 hunks)
  • lib/collection/src/config.rs (3 hunks)
  • lib/collection/src/operations/conversions.rs (5 hunks)
  • lib/collection/src/operations/types.rs (4 hunks)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs (1 hunks)
  • lib/storage/src/content_manager/toc/create_collection.rs (4 hunks)
  • tests/openapi/test_vector_config.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • lib/collection/src/collection/state_management.rs
  • lib/api/src/grpc/proto/collections.proto
  • tests/openapi/test_vector_config.py
  • lib/api/src/grpc/qdrant.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/config.rs
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • lib/collection/src/operations/types.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/collection/collection_ops.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/config.rs
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • lib/collection/src/operations/types.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/collection/collection_ops.rs
🧬 Code graph analysis (3)
lib/collection/src/operations/types.rs (1)
lib/segment/src/common/anonymize.rs (13)
  • anonymize (45-45)
  • anonymize (49-51)
  • anonymize (55-57)
  • anonymize (61-63)
  • anonymize (67-71)
  • anonymize (75-79)
  • anonymize (112-116)
  • anonymize (120-124)
  • anonymize (128-132)
  • anonymize (136-145)
  • anonymize (149-151)
  • anonymize (155-159)
  • anonymize (163-178)
lib/collection/src/operations/conversions.rs (1)
lib/collection/src/operations/types.rs (16)
  • from (99-106)
  • from (200-222)
  • from (267-287)
  • from (652-654)
  • from (672-674)
  • from (1166-1171)
  • from (1175-1180)
  • from (1184-1234)
  • from (1238-1240)
  • from (1244-1249)
  • from (1253-1258)
  • from (1262-1267)
  • from (1271-1276)
  • from (1280-1285)
  • from (1289-1294)
  • from (1298-1303)
lib/collection/src/collection/collection_ops.rs (2)
src/settings.rs (1)
  • log (347-352)
lib/collection/src/collection/mod.rs (1)
  • name (341-343)
🪛 markdownlint-cli2 (0.18.1)
docs/grpc/docs.md

27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests
  • GitHub Check: test-consistency
  • GitHub Check: e2e-tests
  • GitHub Check: storage-compat-test
  • GitHub Check: lint
🔇 Additional comments (12)
lib/storage/src/content_manager/toc/create_collection.rs (4)

167-175: LGTM on unified update_opt usage

Consistent use of update_opt for wal/optimizers/HNSW simplifies defaults application and aligns with the new DiffConfig API.

Please confirm update_opt preserves node-level defaults when diffs are None and properly merges when diffs are Some(..).


193-193: Strict mode: correct direction of update

Switch to default_config.update(diff)? is correct with the new trait semantics.


211-211: Field rename alignment

optimizer_config field usage matches the refactor; looks good.


252-253: Don’t fail collection creation on warning logging

Make warning emission non-fatal.

-        collection.print_warnings().await?;
+        if let Err(err) = collection.print_warnings().await {
+            log::warn!(
+                "Failed to print collection warnings for {}: {}",
+                collection_name,
+                err
+            );
+        }
lib/collection/src/collection/collection_ops.rs (1)

37-37: Config diff application direction is correct

Assignments using current.update(diff)? across params/HNSW/optimizers/strict mode align with the refactor.

Please verify all call sites transitioned off the old diff.update(&config) to avoid divergence across modules.

Also applies to: 54-54, 107-107, 190-190

lib/storage/src/content_manager/toc/collection_meta_ops.rs (1)

182-183: Make printing warnings non-fatal during updates

Don’t fail UpdateCollection if printing warnings errors out.

-        collection.print_warnings().await?;
+        if let Err(err) = collection.print_warnings().await {
+            log::warn!(
+                "Failed to print collection warnings for {}: {}",
+                operation.collection_name,
+                err
+            );
+        }
docs/redoc/master/openapi.json (1)

6610-6617: LGTM on the new warnings schema.

CollectionInfo.warnings and the CollectionWarning type look well-specified and align with the documented behavior. ✔️

Also applies to: 6685-6696

lib/collection/src/config.rs (2)

279-310: LGTM! Well-structured configuration validation.

The get_warnings method effectively validates cross-field requirements for the copy_vectors HNSW option. The checks correctly identify when quantization is missing or when multivectors are incompatible.


452-452: LGTM! Consistent with the new DiffConfig API pattern.

The change from hnsw_diff.update(existing_hnsw) to existing_hnsw.update(hnsw_diff) correctly aligns with the refactored DiffConfig trait where diffs are applied onto configs.

lib/collection/src/operations/types.rs (1)

130-138: LGTM! Properly configured warning type.

The CollectionWarning struct is well-designed with appropriate anonymization for the message field that may contain vector names. Based on learnings.

lib/collection/src/operations/conversions.rs (1)

539-542: LGTM! Correct conversion to gRPC warnings.

The conversion properly maps internal CollectionWarning instances to their gRPC equivalents.

docs/grpc/docs.md (1)

27-27: Fix markdown list indentation to satisfy markdownlint (MD007).

The newly added Table of Contents item still uses a 4-space indent, so markdownlint (MD007) fails. Please reduce the indent for this bullet to 2 spaces to match the linter’s expectation and unblock CI (Line 27).

@xzfc xzfc force-pushed the configuration_status branch 2 times, most recently from f3414c4 to 47da93e Compare September 29, 2025 15:10
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
tests/openapi/test_vector_config.py (3)

247-254: Guarantee cleanup on failures with try/finally.

If assertions fail before the last drop_collection, collections remain and may affect later tests. Wrap body in try/finally.

 def test_configuration_warnings(config_name, vector_config, expected_warnings):
     test_collection = f"test_{config_name}"
-    drop_collection(collection_name=test_collection)
-
-    # Create collection with specified configuration
-    base_vector_config = {"size": 4, "distance": "Dot"}
-    base_vector_config.update(vector_config)
+    drop_collection(collection_name=test_collection)
+    try:
+        # Create collection with specified configuration
+        base_vector_config = {"size": 4, "distance": "Dot"}
+        base_vector_config.update(vector_config)
@@
-    warnings = response.json()["result"]["warnings"]
-    assert len(warnings) == expected_warnings
-
-    drop_collection(collection_name=test_collection)
+        warnings = response.json()["result"]["warnings"]
+        assert len(warnings) == expected_warnings
+    finally:
+        drop_collection(collection_name=test_collection)

Also applies to: 273-273


255-261: Avoid flakiness: create collection with wait=true.

Some deployments apply config asynchronously. Add wait=true to the create call to ensure warnings are available immediately.

 response = request_with_validation(
     api="/collections/{collection_name}",
     method="PUT",
     path_params={"collection_name": test_collection},
+    query_params={"wait": "true"},
     body={"vectors": {"test_vector": base_vector_config}},
 )

Please confirm whether your CI cluster runs synchronous collection creation by default; if not, this change prevents intermittent failures.


270-271: Strengthen assertions by checking warning content (keep count).

Counting only length might miss regressions in semantics. Optionally assert that messages mention the violated constraints.

-    warnings = response.json()["result"]["warnings"]
-    assert len(warnings) == expected_warnings
+    warnings = response.json()["result"]["warnings"]
+    assert isinstance(warnings, list)
+    assert len(warnings) == expected_warnings
+    # Optional: semantic checks without hard-coding full messages
+    messages = [w.get("message", "") for w in warnings]
+    if config_name == "copy_vectors_no_quant":
+        assert any("quantization" in m.lower() for m in messages)
+    if config_name == "copy_vectors_multivec":
+        assert any("quantization" in m.lower() for m in messages)
+        assert any("multivec" in m.lower() or "multi" in m.lower() for m in messages)
docs/redoc/master/openapi.json (1)

6610-6616: Document the empty-array default for warnings.

The PR summary (and implementation) advertise that CollectionInfo.warnings is emitted as an empty list when there are no warnings, but the OpenAPI schema doesn’t reflect that. Please add default: [] to the property so generated clients and docs continue to show the guaranteed empty array instead of null/omitted.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6d2b4ac and f3414c4.

📒 Files selected for processing (12)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/proto/collections.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (2 hunks)
  • lib/collection/src/collection/collection_ops.rs (5 hunks)
  • lib/collection/src/collection/state_management.rs (1 hunks)
  • lib/collection/src/config.rs (3 hunks)
  • lib/collection/src/operations/conversions.rs (5 hunks)
  • lib/collection/src/operations/types.rs (4 hunks)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs (1 hunks)
  • lib/storage/src/content_manager/toc/create_collection.rs (4 hunks)
  • tests/openapi/test_vector_config.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • lib/collection/src/collection/state_management.rs
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • lib/api/src/grpc/proto/collections.proto
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/collection/src/operations/conversions.rs
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/collection/collection_ops.rs
  • lib/collection/src/config.rs
  • lib/collection/src/operations/types.rs
  • lib/api/src/grpc/qdrant.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/collection/src/operations/conversions.rs
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/collection/collection_ops.rs
  • lib/collection/src/config.rs
  • lib/collection/src/operations/types.rs
  • lib/api/src/grpc/qdrant.rs
🧠 Learnings (1)
📚 Learning: 2025-09-16T19:14:17.614Z
Learnt from: generall
PR: qdrant/qdrant#7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.

Applied to files:

  • lib/api/src/grpc/qdrant.rs
🧬 Code graph analysis (4)
tests/openapi/test_vector_config.py (2)
tests/openapi/helpers/collection_setup.py (1)
  • drop_collection (8-14)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
lib/collection/src/operations/conversions.rs (1)
lib/collection/src/operations/types.rs (16)
  • from (99-106)
  • from (200-222)
  • from (267-287)
  • from (652-654)
  • from (672-674)
  • from (1166-1171)
  • from (1175-1180)
  • from (1184-1234)
  • from (1238-1240)
  • from (1244-1249)
  • from (1253-1258)
  • from (1262-1267)
  • from (1271-1276)
  • from (1280-1285)
  • from (1289-1294)
  • from (1298-1303)
lib/collection/src/collection/collection_ops.rs (1)
lib/collection/src/collection/mod.rs (1)
  • name (341-343)
lib/collection/src/operations/types.rs (1)
lib/segment/src/common/anonymize.rs (13)
  • anonymize (45-45)
  • anonymize (49-51)
  • anonymize (55-57)
  • anonymize (61-63)
  • anonymize (67-71)
  • anonymize (75-79)
  • anonymize (112-116)
  • anonymize (120-124)
  • anonymize (128-132)
  • anonymize (136-145)
  • anonymize (149-151)
  • anonymize (155-159)
  • anonymize (163-178)
🪛 markdownlint-cli2 (0.18.1)
docs/grpc/docs.md

27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: lint
  • GitHub Check: test-consistency
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: e2e-tests
  • GitHub Check: integration-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests-consensus
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
🔇 Additional comments (19)
docs/redoc/master/openapi.json (1)

6685-6694: New CollectionWarning schema looks good.

The object definition is minimal and matches the API surface.

lib/storage/src/content_manager/toc/create_collection.rs (4)

167-175: Config diffs applied with update_opt — LGTM

Switch to update_opt for WAL/optimizers/HNSW aligns with the new DiffConfig semantics.


193-193: Strict mode: corrected update(arg) order — LGTM

Using default_config.update(diff)? matches the refactored API.


211-211: Field bind consistency — LGTM

optimizer_config field now bound to the computed optimizer_config variable.


252-253: Non-fatal warning emission on create — LGTM

Calling print_warnings() without propagating errors prevents creation from failing on logging.

lib/collection/src/config.rs (2)

279-311: Per-vector warnings generator — LGTM

Merges HNSW defaults with per-vector diff, checks quantization and multivector incompatibility, and returns structured warnings.


452-452: HNSW diff merge direction fixed — LGTM

existing_hnsw.update(hnsw_diff)? matches the new DiffConfig contract.

lib/collection/src/operations/types.rs (3)

130-139: Introduce CollectionWarning with anonymized message — LGTM

message is marked #[anonymize(true)], addressing concerns about vector names in logs.


232-234: Add warnings to CollectionInfo with skip when empty — LGTM

#[serde(default, skip_serializing_if = "Vec::is_empty")] minimizes client breakage.


256-257: Populate warnings from config — LGTM

get_warnings() used in both empty() and From.

Also applies to: 280-281

lib/collection/src/collection/collection_ops.rs (5)

37-38: Params diff application — LGTM

config.params = config.params.update(diff)? matches new DiffConfig API.


54-55: HNSW diff application — LGTM


107-108: Optimizers diff application — LGTM


189-190: Strict mode update via update(diff) — LGTM


409-414: print_warnings helper — LGTM

Logs each warning with collection context; non-fatal by design.

docs/grpc/docs.md (2)

27-27: Fix MD007: list indentation in ToC

Reduce indent from 4 to 2 spaces for the CollectionWarning entry.

-    - [CollectionWarning](#qdrant-CollectionWarning)
+  - [CollectionWarning](#qdrant-CollectionWarning)

612-612: API docs for warnings and CollectionWarning — LGTM

Field documented in CollectionInfo and dedicated section added.

Also applies to: 691-700

lib/collection/src/operations/conversions.rs (2)

383-543: Great end-to-end propagation of collection warnings.

The CollectionInfo → gRPC conversion now forwards warnings alongside payload schema, keeping the API responses consistent with the new diagnostics surface. Nicely done.


547-559: Thanks for the symmetric CollectionWarning conversions.

The dedicated From implementations make the round-trip between internal and gRPC warnings trivial and keep call sites clean.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (9)
lib/api/src/grpc/qdrant.rs (1)

378-384: New CollectionWarning message: OK; consider future-proofing.

  • Derives/encoding look consistent with the rest of the API; tag = 1 is fine.
  • Optional: if you anticipate programmatic handling, consider adding a lightweight code/category or level in the proto now to avoid string parsing later; or leave as-is if you want maximal flexibility first.

Based on learnings

docs/redoc/master/openapi.json (2)

6610-6616: Make client ergonomics explicit for CollectionInfo.warnings.

If the backend always returns an array (even when empty), consider either:

  • adding "default": [] to the schema, or
  • marking it as required (breaking), or
  • clarifying in the description that it’s always present and empty when no warnings.

Non‑breaking tweak (preferred):

           "warnings": {
             "description": "Warnings related to the collection",
             "type": "array",
+            "default": [],
             "items": {
               "$ref": "#/components/schemas/CollectionWarning"
             }
           },

6684-6695: Future‑proof CollectionWarning with machine‑readable fields.

To avoid hard‑parsing free‑text later, add optional fields now (keeps BC) such as a stable code and context:

       "CollectionWarning": {
         "type": "object",
         "required": [
           "message"
         ],
         "properties": {
           "message": {
             "description": "Warning message",
             "type": "string"
-          }
+          },
+          "code": {
+            "description": "Stable machine-readable identifier (e.g. `hnsw_copy_vectors_requires_quantization`).",
+            "type": "string"
+          },
+          "field_path": {
+            "description": "JSON pointer or dot path to the related config field (e.g. `config.hnsw_config.copy_vectors`).",
+            "type": "string"
+          },
+          "details": {
+            "description": "Optional structured details for programmatic handling.",
+            "type": "object",
+            "additionalProperties": true
+          }
         }
       },
tests/openapi/test_vector_config.py (2)

224-277: Make cleanup resilient to failures

Wrap create/assertions in try/finally so the collection is dropped even if an assertion fails, preventing test pollution across parametrized cases.

 def test_configuration_warnings(config_name, vector_config, expected_warnings):
     test_collection = f"test_{config_name}"
-    drop_collection(collection_name=test_collection)
-
-    # Create collection with specified configuration
-    base_vector_config = {"size": 4, "distance": "Dot"}
-    base_vector_config.update(vector_config)
-
-    response = request_with_validation(
-        api="/collections/{collection_name}",
-        method="PUT",
-        path_params={"collection_name": test_collection},
-        body={"vectors": {"test_vector": base_vector_config}},
-    )
-    assert response.ok
-
-    response = request_with_validation(
-        api="/collections/{collection_name}",
-        method="GET",
-        path_params={"collection_name": test_collection},
-    )
-    assert response.ok
-
-    if expected_warnings > 0:
-        warnings = response.json()["result"]["warnings"]
-        assert len(warnings) == expected_warnings
-    else:
-        assert "warnings" not in response.json()["result"]
-
-    drop_collection(collection_name=test_collection)
+    drop_collection(collection_name=test_collection)
+    try:
+        # Create collection with specified configuration
+        base_vector_config = {"size": 4, "distance": "Dot"}
+        base_vector_config.update(vector_config)
+
+        response = request_with_validation(
+            api="/collections/{collection_name}",
+            method="PUT",
+            path_params={"collection_name": test_collection},
+            body={"vectors": {"test_vector": base_vector_config}},
+        )
+        assert response.ok
+
+        response = request_with_validation(
+            api="/collections/{collection_name}",
+            method="GET",
+            path_params={"collection_name": test_collection},
+        )
+        assert response.ok
+
+        if expected_warnings > 0:
+            warnings = response.json()["result"]["warnings"]
+            assert len(warnings) == expected_warnings
+        else:
+            assert "warnings" not in response.json()["result"]
+    finally:
+        drop_collection(collection_name=test_collection)

224-277: Add one case for collection-level quantization

Include a scenario where quantization is set at collection level (not per‑vector) with copy_vectors=true to ensure warnings are suppressed in that path as well.

lib/collection/src/config.rs (1)

279-311: get_warnings: logic is correct; consider adding a code field for stability

Validation correctly:

  • computes effective HNSW per vector via update_opt,
  • checks quantization at either vector or collection level,
  • blocks copy_vectors with multivectors.

Two small improvements:

  • Add a machine-stable code: &'static str to CollectionWarning (e.g., copy_vectors_requires_quantization, copy_vectors_incompatible_with_multivector) alongside the message for easier client handling and i18n.
  • Consider deduplicating identical warnings if many vectors share the issue (optional).
lib/collection/src/operations/conversions.rs (1)

378-545: Style nit: prefer explicit From over .into() per repo guideline

Where you have enum conversions using .into() in this block (e.g., status), prefer Type::from(value) for clarity and to match the guideline.

As per coding guidelines

lib/collection/src/collection/collection_ops.rs (1)

409-414: Improve warning logs with a stable target and structured fields

Use a logging target and structured fields so consumers can filter by qdrant::collection::warnings without parsing the message. Verified that print_warnings() is only called in non-hot paths (after collection creation, metadata update, and config save).

Verified invocations in:

  • lib/storage/src/content_manager/toc/create_collection.rs:252
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs:182
  • lib/collection/src/collection/state_management.rs:176
-        for warning in warnings {
-            log::warn!("Collection {}: {}", self.name(), warning.message);
-        }
+        for warning in warnings {
+            log::warn!(target: "qdrant::collection::warnings", "collection={} warning={}", self.name(), warning.message);
+        }
docs/grpc/docs.md (1)

691-700: Optional: add a short field description.

Consider clarifying that message is a human-readable warning (e.g., “configuration inconsistency”) to guide client UX.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f3414c4 and 47da93e.

📒 Files selected for processing (12)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/proto/collections.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (2 hunks)
  • lib/collection/src/collection/collection_ops.rs (5 hunks)
  • lib/collection/src/collection/state_management.rs (1 hunks)
  • lib/collection/src/config.rs (3 hunks)
  • lib/collection/src/operations/conversions.rs (5 hunks)
  • lib/collection/src/operations/types.rs (4 hunks)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs (1 hunks)
  • lib/storage/src/content_manager/toc/create_collection.rs (4 hunks)
  • tests/openapi/test_vector_config.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
  • lib/api/src/grpc/proto/collections.proto
  • lib/collection/src/collection/state_management.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/api/src/grpc/qdrant.rs
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/config.rs
  • lib/collection/src/operations/types.rs
  • lib/collection/src/collection/collection_ops.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/api/src/grpc/qdrant.rs
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/config.rs
  • lib/collection/src/operations/types.rs
  • lib/collection/src/collection/collection_ops.rs
🧠 Learnings (1)
📚 Learning: 2025-09-16T19:14:17.614Z
Learnt from: generall
PR: qdrant/qdrant#7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.

Applied to files:

  • lib/api/src/grpc/qdrant.rs
🧬 Code graph analysis (4)
lib/collection/src/operations/conversions.rs (1)
lib/collection/src/operations/types.rs (16)
  • from (99-106)
  • from (200-222)
  • from (267-287)
  • from (652-654)
  • from (672-674)
  • from (1166-1171)
  • from (1175-1180)
  • from (1184-1234)
  • from (1238-1240)
  • from (1244-1249)
  • from (1253-1258)
  • from (1262-1267)
  • from (1271-1276)
  • from (1280-1285)
  • from (1289-1294)
  • from (1298-1303)
lib/collection/src/operations/types.rs (1)
lib/segment/src/common/anonymize.rs (13)
  • anonymize (45-45)
  • anonymize (49-51)
  • anonymize (55-57)
  • anonymize (61-63)
  • anonymize (67-71)
  • anonymize (75-79)
  • anonymize (112-116)
  • anonymize (120-124)
  • anonymize (128-132)
  • anonymize (136-145)
  • anonymize (149-151)
  • anonymize (155-159)
  • anonymize (163-178)
tests/openapi/test_vector_config.py (2)
tests/openapi/helpers/collection_setup.py (1)
  • drop_collection (8-14)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
lib/collection/src/collection/collection_ops.rs (2)
src/settings.rs (1)
  • log (347-352)
lib/collection/src/collection/mod.rs (1)
  • name (341-343)
🪛 markdownlint-cli2 (0.18.1)
docs/grpc/docs.md

27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: lint
  • GitHub Check: test-consensus-compose
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: e2e-tests
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: integration-tests
🔇 Additional comments (20)
lib/storage/src/content_manager/toc/create_collection.rs (4)

167-175: Clean refactoring to use update_opt helper

The simplified config update pattern using update_opt removes boilerplate and improves readability.


193-193: Consistent with new DiffConfig trait design

The updated method call order default_config.update(diff) follows the more intuitive pattern where the config object updates itself with the diff.


211-211: Correct variable reference after renaming

The field now correctly references optimizer_config variable from line 169.


252-252: Don't fail collection creation on warning logging

Warning emission should not block collection creation. If print_warnings() fails, the collection creation will abort unnecessarily.

Apply this diff to make warning emission non-fatal:

-        collection.print_warnings().await;
+        if let Err(err) = collection.print_warnings().await {
+            log::warn!(
+                "Failed to print collection warnings for {}: {}",
+                collection_name,
+                err
+            );
+        }
lib/api/src/grpc/qdrant.rs (2)

1-1: Do not edit this file directly (generated by prost-build).

Confirm the proto definitions under lib/api/src/grpc/proto/*.proto were changed and this file was regenerated, not hand-edited. Based on learnings.


1243-1246: End-to-end wiring complete for CollectionInfo.warnings. Proto, conversion code, gRPC docs, and OpenAPI schema all include the new warnings field.

lib/collection/src/collection/collection_ops.rs (4)

37-38: In-place diff application looks good

Switching to config.params = config.params.update(params_diff)?; aligns with the refactored DiffConfig API and avoids mutating borrowed data. Same pattern elsewhere in this file is consistent.

Please confirm that all callers re-run optimizer (recreate_optimizers_blocking) after these updates as required by the docstrings.


54-55: Consistent HNSW update application

Using config.hnsw_config.update(hnsw_config_diff)? keeps semantics local and mirrors other update methods. LGTM.


107-108: Optimizer config update follows the same pattern

Consistent with DiffConfig refactor; no issues spotted.


189-190: Strict mode: safe in-place update

Assigning *current_config = current_config.update(strict_mode_diff)?; is correct and avoids ownership pitfalls. Good.

lib/collection/src/config.rs (2)

23-27: Public import of CollectionWarning is appropriate

Bringing CollectionWarning into scope here keeps warning construction local to config validation. LGTM.


452-455: HNSW merge direction fix is correct

Using existing_hnsw.update(hnsw_diff)? (instead of the inverse) matches the refactored API and ensures diffs extend the current config. Good.

lib/collection/src/operations/types.rs (3)

130-139: CollectionWarning: minimal and anonymized—good starting point

#[anonymize(true)] on message addresses privacy concerns mentioned in earlier reviews. Structure is simple and forward‑compatible.


232-235: Skip-serialize on empty avoids breaking older clients

#[serde(default, skip_serializing_if = "Vec::is_empty")] is the right choice to minimize compatibility risk. LGTM.


255-257: Populate warnings from config at source

Initializing from collection_config.get_warnings() ensures a single source of truth. No merge needed across shards since config is shared. Looks good.

Also applies to: 280-281

lib/collection/src/operations/conversions.rs (3)

539-543: Propagate warnings to gRPC response

Mapping Vec<CollectionWarning> to proto with explicit From is clear and consistent. LGTM.


547-559: Bidirectional conversions for CollectionWarning

Round‑trip conversions are complete and minimal. Good.


868-869: Parsing warnings from gRPC response is correct

Maps proto warnings to internal with CollectionWarning::from. No issues.

Also applies to: 899-900

docs/grpc/docs.md (2)

27-27: Fix MD007: reduce ToC indentation to 2 spaces for this item.

Align with markdownlint (expected 2). If this file is generated, consider disabling MD007 for this path to avoid churn.

-    - [CollectionWarning](#qdrant-CollectionWarning)
+  - [CollectionWarning](#qdrant-CollectionWarning)

612-612: LGTM: CollectionInfo now exposes warnings.

Repeated CollectionWarning field is well-placed and backward compatible.

@xzfc xzfc force-pushed the configuration_status branch from 47da93e to 0693595 Compare September 29, 2025 17:55
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
lib/api/src/grpc/qdrant.rs (1)

378-384: CollectionWarning message looks good

Simple, extensible container for warning text; aligns with other messages and derives. If you anticipate programmatic handling later, consider adding an optional machine-readable code in the proto (e.g., enum or string code) alongside the message. Optional for now.

Based on learnings.

lib/collection/src/collection/state_management.rs (1)

176-177: Log errors from print_warnings instead of silently ignoring them.

The current implementation discards the Result from print_warnings().await, which means failures go unnoticed. While this correctly prevents warning failures from aborting apply_config, it's better to log any errors for observability.

Apply this diff:

-        self.print_warnings().await;
+        if let Err(err) = self.print_warnings().await {
+            log::warn!("Failed to print collection warnings for {}: {}", self.id, err);
+        }
docs/redoc/master/openapi.json (2)

6610-6616: Make warnings readOnly and default to empty array for stable client codegen

This field is response-only and optional. Mark it readOnly and add default: []/minItems: 0 so generators don’t emit nullable types or force null checks when you intend “zero or more warnings”.

           "warnings": {
             "description": "Warnings related to the collection",
             "type": "array",
+            "readOnly": true,
+            "default": [],
+            "minItems": 0,
             "items": {
               "$ref": "#/components/schemas/CollectionWarning"
             }
           },

Please also ensure sample responses for GET /collections/{collection_name} include "warnings": [] when empty so behavior matches the schema.


6684-6695: Consider adding an optional machine-readable code (and an example) to CollectionWarning

A stable code makes it easier for SDKs and UIs to branch on specific warnings without string-matching messages. Keep it optional to preserve flexibility. Adding an example clarifies expected content.

       "CollectionWarning": {
         "type": "object",
         "required": [
           "message"
         ],
         "properties": {
           "message": {
             "description": "Warning message",
             "type": "string"
           },
+          "code": {
+            "description": "Stable, machine-readable warning code (e.g., `hnsw_copy_vectors_requirements_not_met`).",
+            "type": "string",
+            "nullable": true
+          }
         }
+        ,
+        "example": { "message": "HNSW copy_vectors requires quantization enabled and multivectors disabled.", "code": "hnsw_copy_vectors_requirements_not_met" }
       },

Confirm the gRPC/proto CollectionWarning mirrors this surface (optional code) to avoid REST/GRPC drift.

lib/collection/src/operations/config_diff.rs (1)

276-298: Consider error handling for to_diff().unwrap() calls.

All From implementations now use config.to_diff().unwrap() which will panic if the JSON serialization fails. While this should never happen for well-formed config types, consider whether these conversions could be fallible (returning Result) or if the panic is acceptable for internal conversions.

If panics are acceptable here (likely given these are internal type conversions), the current approach is fine.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 47da93e and 0693595.

📒 Files selected for processing (16)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/proto/collections.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (2 hunks)
  • lib/collection/src/collection/collection_ops.rs (5 hunks)
  • lib/collection/src/collection/mod.rs (3 hunks)
  • lib/collection/src/collection/state_management.rs (1 hunks)
  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs (2 hunks)
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs (1 hunks)
  • lib/collection/src/config.rs (4 hunks)
  • lib/collection/src/operations/config_diff.rs (8 hunks)
  • lib/collection/src/operations/conversions.rs (5 hunks)
  • lib/collection/src/operations/types.rs (4 hunks)
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs (1 hunks)
  • lib/storage/src/content_manager/toc/create_collection.rs (4 hunks)
  • tests/openapi/test_vector_config.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • tests/openapi/test_vector_config.py
  • lib/api/src/grpc/proto/collections.proto
  • lib/collection/src/collection_manager/optimizers/segment_optimizer.rs
  • lib/storage/src/content_manager/toc/create_collection.rs
  • lib/collection/src/collection/collection_ops.rs
  • lib/storage/src/content_manager/toc/collection_meta_ops.rs
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
  • lib/collection/src/config.rs
  • lib/api/src/grpc/qdrant.rs
  • lib/collection/src/operations/config_diff.rs
  • lib/collection/src/collection/mod.rs
  • lib/collection/src/operations/types.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/collection/state_management.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
  • lib/collection/src/config.rs
  • lib/api/src/grpc/qdrant.rs
  • lib/collection/src/operations/config_diff.rs
  • lib/collection/src/collection/mod.rs
  • lib/collection/src/operations/types.rs
  • lib/collection/src/operations/conversions.rs
  • lib/collection/src/collection/state_management.rs
🧠 Learnings (2)
📚 Learning: 2025-09-16T21:39:09.353Z
Learnt from: xzfc
PR: qdrant/qdrant#7232
File: lib/shard/src/segment_holder/mod.rs:1088-1094
Timestamp: 2025-09-16T21:39:09.353Z
Learning: The hnsw_format_force and hnsw_with_vectors flags in HnswGlobalConfig primarily affect existing segment conversion during load/startup, not the initial format selection for newly created empty segments. For new segments, format decisions are made by GraphLinksFormatParam::recommended() based on runtime conditions. Therefore, using HnswGlobalConfig::default() for temporary/proxy segments in SegmentHolder::build_tmp_segment is acceptable.

Applied to files:

  • lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs
📚 Learning: 2025-09-16T19:14:17.614Z
Learnt from: generall
PR: qdrant/qdrant#7183
File: lib/api/src/grpc/qdrant.rs:4263-4273
Timestamp: 2025-09-16T19:14:17.614Z
Learning: In qdrant, lib/api/src/grpc/qdrant.rs is auto-generated by prost-build; do not edit it directly. Make changes in lib/api/src/grpc/proto/points.proto (e.g., add [deprecated=true], doc comments, or encoding options), then regenerate the Rust code.

Applied to files:

  • lib/api/src/grpc/qdrant.rs
🧬 Code graph analysis (2)
lib/collection/src/operations/types.rs (1)
lib/segment/src/common/anonymize.rs (13)
  • anonymize (45-45)
  • anonymize (49-51)
  • anonymize (55-57)
  • anonymize (61-63)
  • anonymize (67-71)
  • anonymize (75-79)
  • anonymize (112-116)
  • anonymize (120-124)
  • anonymize (128-132)
  • anonymize (136-145)
  • anonymize (149-151)
  • anonymize (155-159)
  • anonymize (163-178)
lib/collection/src/operations/conversions.rs (1)
lib/collection/src/operations/types.rs (16)
  • from (99-106)
  • from (200-222)
  • from (267-287)
  • from (652-654)
  • from (672-674)
  • from (1166-1171)
  • from (1175-1180)
  • from (1184-1234)
  • from (1238-1240)
  • from (1244-1249)
  • from (1253-1258)
  • from (1262-1267)
  • from (1271-1276)
  • from (1280-1285)
  • from (1289-1294)
  • from (1298-1303)
🪛 markdownlint-cli2 (0.18.1)
docs/grpc/docs.md

27-27: Unordered list indentation
Expected: 2; Actual: 4

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-consensus-compose
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: e2e-tests
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: lint
🔇 Additional comments (18)
lib/collection/src/collection_manager/optimizers/config_mismatch_optimizer.rs (2)

115-120: LGTM! Refactor simplifies config merging.

The removal of the get_required_hnsw_config helper and direct use of update_opt streamlines the logic. The collection-wide HNSW config serves as the base, with vector-specific overrides applied when present, which is semantically correct and aligns with the broader DiffConfig refactor.


580-585: LGTM! Test assertions updated to reflect DiffConfig refactor.

The test expectations correctly reflect the swapped argument order in DiffConfig::update. The collection-wide config is now the receiver with vector-specific config applied as a diff, which maintains the same logical semantics.

lib/api/src/grpc/qdrant.rs (1)

1243-1246: CollectionInfo.warnings addition verified: properly sourced from proto and included in REST API

The addition is backward-compatible and correctly implemented. Verification confirms:

  • Proto definitions exist at lib/api/src/grpc/proto/collections.proto (message at line 165, field at line 579 with tag 11)
  • File is generated by prost-build (not hand-edited)
  • OpenAPI schema includes warnings in CollectionInfo schema as an array of CollectionWarning
lib/collection/src/config.rs (3)

2-2: LGTM!

The import additions support the new get_warnings method: Write as _ for string formatting and CollectionWarning for the return type and warning construction.

Also applies to: 25-26


279-310: LGTM! Warning generation logic is correct.

The method correctly validates the requirements for hnsw_config.copy_vectors:

  • Quantization must be enabled (per-vector or collection-level).
  • Multivectors must be disabled.

Warnings include vector names in their messages. Anonymization is handled at the CollectionWarning type level (via #[anonymize(true)] on the message field in types.rs), so no additional handling is needed here.


284-284: LGTM! Consistent with DiffConfig trait refactoring.

The HNSW merge behavior now uses config.update(diff) instead of diff.update(&config), consistent with the DiffConfig trait refactoring described in the PR objectives. The use of update_opt at line 284 correctly handles optional diffs.

Also applies to: 450-455

lib/collection/src/operations/types.rs (2)

130-138: LGTM! CollectionWarning struct is well-defined.

The struct correctly uses #[anonymize(true)] on the message field to protect potentially sensitive information (e.g., vector names) as per past review feedback. The derives and serde configuration are appropriate.

Based on learnings.


232-234: LGTM! Warnings field correctly integrated.

The warnings field is properly added to CollectionInfo with skip_serializing_if = "Vec::is_empty" to minimize impact on older clients. Initialization in both empty() and From<ShardInfoInternal> correctly uses config.get_warnings() to populate warnings from configuration validation.

Note: As discussed in past reviews, some older clients may still have issues with unknown fields regardless of this annotation, but this approach minimizes the impact.

Based on learnings.

Also applies to: 256-256, 280-280

lib/collection/src/collection/mod.rs (1)

133-136: LGTM! OptimizersConfig.update calls correctly refactored.

All three call sites are updated to use the new config.update(diff) pattern, consistent with the DiffConfig trait refactoring. Error handling is appropriate for each context:

  • Lines 135: propagates errors via ?.
  • Line 258: uses expect for load-time failures.
  • Line 850: propagates errors via ?.

Also applies to: 253-259, 846-854

lib/collection/src/operations/conversions.rs (3)

52-56: LGTM!

Import correctly adds CollectionWarning to support the new conversion implementations.


378-544: LGTM! CollectionInfo to gRPC conversion correctly propagates warnings.

The warnings field is properly destructured at line 383 and mapped to gRPC types at lines 539-542. The conversion correctly uses into_iter().map().collect() to transform each warning.


547-559: LGTM! CollectionWarning conversions are correct.

The bidirectional From implementations (lines 547-559) correctly map the message field between internal and gRPC types. The TryFrom implementation (lines 868, 899) properly destructures and converts warnings from gRPC responses to internal types.

Also applies to: 849-904

docs/grpc/docs.md (3)

27-27: Markdown indentation issue already flagged.

This ToC entry has 4-space indentation instead of the expected 2 spaces, violating MD007. Since this is generated documentation, consider fixing the generator rather than manually editing.


691-700: LGTM! CollectionWarning documentation is clear and consistent.

The documentation for the new CollectionWarning message follows the established pattern and clearly describes the message field.


612-612: LGTM! Warnings field properly integrated into CollectionInfo.

The warnings field addition correctly uses the repeated modifier to support multiple warnings and has a clear description.

lib/collection/src/operations/config_diff.rs (3)

23-48: LGTM! DiffConfig trait properly refactored with parameterized Diff type.

The trait refactoring:

  • Parameterizes Diff type with appropriate bounds
  • Adds update_opt for graceful error handling with fallback to clone
  • Adds to_diff for conversion from config to diff
  • Uses &self in update, requiring a clone in the implementation

The error handling in update_opt (lines 37-40) appropriately logs failures and falls back to the original config.


236-273: LGTM! DiffConfig implementations are consistent and correct.

The implementations properly specialize the trait for each config type:

  • Most types use the default trait implementation via update_config
  • OptimizersConfig provides custom logic to handle the MaxOptimizationThreads conversion (line 265)
  • The pattern field.unwrap_or(self.field) correctly preserves original values when diff fields are None

424-487: LGTM! Tests properly updated for new DiffConfig API.

All test cases correctly updated to use the new config.update(diff) pattern and handle the CollectionResult return type.

@xzfc xzfc merged commit 780d406 into dev Sep 29, 2025
16 checks passed
@xzfc xzfc deleted the configuration_status branch September 29, 2025 18:17
timvisee pushed a commit that referenced this pull request Nov 14, 2025
* refactor: swap DiffConfig parmeters

was: diff.update(&config)
now: config.update(diff)

* refactor: introduce DiffConfig::update_opt

* feat: add CollectionInfo::configuration_status field

* rename ConfigurationStatus to CollectionWarning
@timvisee timvisee mentioned this pull request Nov 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants