Skip to content

[strict-mode] Add max number of payload index count#7222

Merged
agourlay merged 5 commits intodevfrom
strict-mode-max-payload-index-count
Sep 15, 2025
Merged

[strict-mode] Add max number of payload index count#7222
agourlay merged 5 commits intodevfrom
strict-mode-max-payload-index-count

Conversation

@agourlay
Copy link
Member

@agourlay agourlay commented Sep 8, 2025

This PR adds a new configuration field to the strict mode to limit the maximum number of payload index.

The check is performed when a payload index creation is requested.

The implementation is rather naive by fetching the full collection info on each request.

I think this acceptable because this is not a high throughput operation.

@agourlay agourlay force-pushed the strict-mode-max-payload-index-count branch from 3bb7cff to dad8e66 Compare September 9, 2025 13:51
@agourlay agourlay marked this pull request as ready for review September 9, 2025 13:57
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (8)
lib/segment/src/types.rs (2)

1029-1034: LGTM; tweak doc string grammar.

Change “payload index” -> “payload indexes”.

-    /// Max number of payload index in a collection
+    /// Max number of payload indexes in a collection

1174-1178: Mirror doc string grammar fix in output type.

Same pluralization nit in StrictModeConfigOutput.

-    /// Max number of payload index in a collection
+    /// Max number of payload indexes in a collection
docs/redoc/master/openapi.json (2)

7451-7458: Output schema: wording + minimum alignment with other strict-mode fields

  • Grammar: use “payload indexes” (or “indices”) plural.
  • Consistency: other output strict-mode limits (e.g., max_points_count) allow minimum: 0 while the config (input) uses minimum: 1. Consider minimum: 0 here as well so users can explicitly disallow payload indexes via output, mirroring existing pattern.

Apply:

           "sparse_config": {
             "description": "Sparse vector configuration",
             "anyOf": [
               {
                 "$ref": "#/components/schemas/StrictModeSparseConfigOutput"
               },
               {
                 "nullable": true
               }
             ]
           },
-          "max_payload_index_count": {
-            "description": "Max number of payload index in a collection",
+          "max_payload_index_count": {
+            "description": "Max number of payload indexes in a collection",
             "type": "integer",
             "format": "uint",
-            "minimum": 1,
+            "minimum": 0,
             "nullable": true
           }

10005-10012: Input schema: wording only (keep minimum: 1)

Retain minimum: 1 for the config (input) to match existing strict-mode patterns, but fix the description plural.

-          "max_payload_index_count": {
-            "description": "Max number of payload index in a collection",
+          "max_payload_index_count": {
+            "description": "Max number of payload indexes in a collection",
             "type": "integer",
             "format": "uint",
             "minimum": 1,
             "nullable": true
           }

Please confirm if allowing 0 in the output is intended (to represent “no payload indexes allowed”). If not, we can keep minimum: 1 in both places for symmetry.

lib/api/src/grpc/proto/collections.proto (1)

372-378: Document semantics: scope, inclusivity, and enforcement timing for max_payload_index_count.

Please add a brief comment clarifying:

  • Scope: per-collection total across all fields/shards.
  • Inclusivity: whether the limit is inclusive (e.g., equals allowed).
  • Enforcement: checked only on creation-time; existing indexes aren’t auto-dropped if the limit is later lowered.
  • Zero/negative semantics: 0 should be treated as “unlimited” or rejected.

Suggested proto doc:

   optional uint64 max_points_count = 18;
-  optional uint64 max_payload_index_count = 19;
+  // Maximum number of payload indexes allowed in this collection (across all fields/shards).
+  // If unset -> no limit. If set, the limit is inclusive (creation of the N+1-th index is rejected).
+  // Enforcement occurs at payload-index creation time; existing indexes are not removed if the limit decreases.
+  // Recommended: treat 0 as invalid and reject it server-side.
+  optional uint64 max_payload_index_count = 19;
tests/openapi/test_strictmode.py (1)

2467-2470: Optional: assert config reflects the increased limit.

Add a quick GET/assert after bumping to 2 to verify propagation.

     set_strict_mode(collection_name, {
         "enabled": True,
         "max_payload_index_count": 2,
     })
+    resp = request_with_validation(
+        api='/collections/{collection_name}', method="GET",
+        path_params={'collection_name': collection_name},
+    )
+    assert resp.ok
+    assert resp.json()['result']['config']['strict_mode_config']['max_payload_index_count'] == 2
src/common/update.rs (2)

813-820: Avoid lossy cast; saturate Duration→usize seconds conversion.

Future non-None timeouts could truncate on 32-bit/large durations.

Apply this diff:

-    let wait_timeout: Option<Duration> = None; // ToDo: make it configurable
+    let wait_timeout: Option<Duration> = None; // ToDo: make it configurable
+    let timeout_secs = wait_timeout.map(|d| d.as_secs().min(usize::MAX as u64) as usize);
@@
-    let pass = check_strict_mode(
+    let pass = check_strict_mode(
         &operation,
-        wait_timeout.map(|d| d.as_secs() as usize),
+        timeout_secs,
         &collection_name,
         &dispatcher,
         &access,
     )

826-831: Fail fast on missing field_schema before strict-mode check.

This returns a precise 400 earlier and skips an unnecessary info fetch/strict check.

Apply this diff:

-    let Some(field_schema) = operation.field_schema else {
-        return Err(StorageError::bad_request(
-            "Can't auto-detect field type, please specify `field_schema` in the request",
-        ));
-    };
+    let field_schema = operation
+        .field_schema
+        .as_ref()
+        .ok_or_else(|| StorageError::bad_request(
+            "Can't auto-detect field type, please specify `field_schema` in the request",
+        ))?
+        .clone();
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 592972f and dad8e66.

📒 Files selected for processing (9)
  • docs/grpc/docs.md (1 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/conversions.rs (6 hunks)
  • lib/api/src/grpc/proto/collections.proto (1 hunks)
  • lib/api/src/grpc/qdrant.rs (1 hunks)
  • lib/segment/src/types.rs (6 hunks)
  • lib/storage/src/content_manager/conversions.rs (2 hunks)
  • src/common/update.rs (3 hunks)
  • tests/openapi/test_strictmode.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/segment/src/types.rs
  • lib/api/src/grpc/qdrant.rs
  • lib/storage/src/content_manager/conversions.rs
  • src/common/update.rs
  • lib/api/src/grpc/conversions.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/segment/src/types.rs
  • lib/api/src/grpc/qdrant.rs
  • lib/storage/src/content_manager/conversions.rs
  • src/common/update.rs
  • lib/api/src/grpc/conversions.rs
🧠 Learnings (1)
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
PR: qdrant/qdrant#7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

  • src/common/update.rs
🧬 Code graph analysis (2)
tests/openapi/test_strictmode.py (1)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
src/common/update.rs (5)
lib/collection/src/operations/verification/mod.rs (8)
  • check_strict_mode (154-166)
  • check_custom (41-47)
  • check_custom (246-259)
  • check_custom (262-279)
  • query_limit (50-50)
  • query_limit (285-287)
  • request_exact (62-62)
  • request_exact (281-283)
src/common/strict_mode.rs (4)
  • check_strict_mode (14-20)
  • check_strict_mode (47-56)
  • check_strict_mode (87-97)
  • check_strict_mode (137-146)
lib/collection/src/collection/collection_ops.rs (1)
  • strict_mode_config (293-299)
lib/collection/src/operations/types.rs (1)
  • strict_mode (1084-1087)
src/tonic/api/collections_api.rs (1)
  • wait_timeout (320-320)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: e2e-tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-consensus-compose
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
🔇 Additional comments (11)
lib/segment/src/types.rs (2)

1060-1080: Included in Hash — good.

New field participates in Hash; consistent with equality/merging semantics.


1201-1224: Conversion wiring looks correct.

max_payload_index_count is propagated in From for StrictModeConfigOutput.

lib/storage/src/content_manager/conversions.rs (2)

144-146: Field is correctly threaded through strict-mode conversions.

Destructuring and mapping of max_payload_index_count to usize looks consistent with existing fields. No action required.

Also applies to: 165-166


144-146: Verify max_payload_index_count in storage‐side conversions
No occurrences of the new field were detected via search—manually confirm that in lib/storage/src/content_manager/conversions.rs both the pattern-matching block (around lines 144–146) and the reverse mapping block (around 165–166) include max_payload_index_count.

tests/openapi/test_strictmode.py (1)

2417-2520: LGTM: scenario covers enable → enforce → relax → enforce → disable.

Happy path and failure cases look solid; waits are set on index ops; cleanup handled by fixture.

lib/api/src/grpc/conversions.rs (3)

2141-2186: Wire-up from gRPC StrictModeConfig → segment types is correct.

Field is destructured and mapped to usize consistently with peers.


2266-2309: Segment StrictModeConfigOutput → gRPC StrictModeConfig mapping includes the new field.

Casts match neighboring fields; no issues spotted.


2313-2358: gRPC StrictModeConfig → segment StrictModeConfigOutput mapping includes the new field.

Round-trip symmetry maintained.

src/common/update.rs (3)

2-2: LGTM on the new imports.

Imports are minimal and justified by the added strict-mode checks and timeout handling.

Also applies to: 10-10, 17-17, 22-22


200-219: Trait stubs returning None are appropriate.

No extra verification hooks are needed for CreateFieldIndex beyond the custom check.


813-825: Incorrect review – apply-time strict-mode enforcement already implemented.

ContentManager runs check_strict_mode on every CollectionUpdateOperations (including FieldIndexOperations::CreateIndex) in collection_verification.rs before applying changes, so the race described is prevented.

Likely an incorrect or invalid review comment.

optional StrictModeMultivectorConfig multivector_config = 16;
optional StrictModeSparseConfig sparse_config = 17;
optional uint64 max_points_count = 18;
optional uint64 max_payload_index_count = 19;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Confirm 0-handling and usize conversion downstream.

Downstream code maps this field to Option. On 32-bit targets, a u64→usize cast can overflow; and “0” needs a defined policy. Please verify:

  • 0 is rejected (or normalized to None) consistently across conversions.
  • Conversions use try_from/capping instead of unchecked as casts.

You can quickly scan for unsafe casts and usages:


🏁 Script executed:

#!/bin/bash
# Locate all occurrences and look for risky u64->usize casts and 0-handling.
rg -n "max_payload_index_count" -C3
rg -nP "as\s+usize" -C2 | rg -n "max_payload_index_count|StrictModeConfig"
rg -nP "try_from\s*<\s*usize\s*>" -C2 | rg -n "max_payload_index_count|StrictModeConfig"
# Check tests assert behavior for 0, unset, and enforcement on create.
rg -n "strict_mode_payload_index_count|max_payload_index_count" -g "!**/target/**" -C3

Length of output: 12455


Enforce safe usize conversion and explicit 0-value policy

  • In lib/storage/src/content_manager/conversions.rs (around the max_payload_index_count.map(|i| i as usize) on line 165) and in lib/api/src/grpc/conversions.rs (around lines 2185 and 2357), replace unchecked as usize casts with usize::try_from(i) (or .try_into()) and propagate or cap errors instead of truncating silently.
  • Define a clear policy for max_payload_index_count = 0: either reject it at the API boundary (e.g. validation error) or treat it as None consistently, and add a test in tests/openapi/test_strictmode.py covering the zero-value scenario.

Comment on lines +713 to +714
#[prost(uint64, optional, tag = "19")]
pub max_payload_index_count: ::core::option::Option<u64>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Backwards-compatible addition; clarify zero vs None semantics and verify conversions

  • Appending max_payload_index_count: Option<u64> at tag 19 preserves wire compatibility; looks good.
  • Please clarify whether Some(0) is a valid “no limit” value or if “no limit” should be represented by None. If 0 is invalid, consider adding a validation min of 1 in the proto so it flows into generated code.
  • Double-check all conversion paths map this safely to internal types. If you cast to usize (as AI summary suggests), prefer usize::try_from(i) with proper handling to avoid silent overflow on non-64-bit targets.

Run this script to verify coverage and casts:


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) Where is max_payload_index_count referenced?"
rg -n -C2 --hidden --glob '!.git' 'max_payload_index_count'

echo
echo "2) Check for potentially unsafe u64 -> usize casts:"
rg -n -C3 --hidden --glob '!.git' -t rs 'max_payload_index_count.*as\s+usize' || true

echo
echo "3) Ensure strict-mode enforcement on CreateFieldIndex exists:"
rg -n -C3 --hidden --glob '!.git' -t rs 'impl\s+StrictModeVerification\s+for\s+.*CreateFieldIndex' || true

echo
echo "4) Confirm OpenAPI/Docs exposure:"
rg -n -C1 --hidden --glob '!.git' 'max_payload_index_count' docs || true

Length of output: 196


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "1) Find all occurrences of max_payload_index_count:"
rg -n 'max_payload_index_count' .

echo
echo "2) Search for u64 -> usize casts involving max_payload_index_count:"
rg -nP 'max_payload_index_count.*as\s+usize' -t rs .

echo
echo "3) Check for StrictModeVerification implementation on CreateFieldIndex:"
rg -nP 'impl\s+StrictModeVerification\s+for\s+CreateFieldIndex' -t rs .

echo
echo "4) Search OpenAPI/docs for max_payload_index_count:"
rg -n 'max_payload_index_count' -g 'docs/**' .

Length of output: 2460


Clarify zero vs None semantics and replace unsafe casts

  • Currently Some(0) is treated as a zero‐limit (disallowing any payload indices); if unlimited should be None, add a min = 1 validation in the proto or map 0None in conversions.
  • Conversions in lib/storage/src/content_manager/conversions.rs:165, lib/api/src/grpc/conversions.rs:2185 & 2357 use i as usize; switch to usize::try_from(i)? (or equivalent) to prevent silent overflow on 32‐bit targets.
  • Strict-mode enforcement in src/common/update.rs:186–191 already guards against exceeding the limit; ensure it aligns with the chosen zero vs None behavior.
  • Update OpenAPI (docs/redoc/master/openapi.json) and gRPC docs (docs/grpc/docs.md) to document the semantics of max_payload_index_count.
🤖 Prompt for AI Agents
In lib/api/src/grpc/qdrant.rs around lines 713–714, the proto field
max_payload_index_count currently allows Some(0) which is ambiguous; change
semantics so 0 represents "unset/unlimited" (map 0→None in conversion) or
enforce min=1 in the proto, and update conversions to map 0→None if choosing
mapping approach. In addition, replace unsafe casts using i as usize in
lib/storage/src/content_manager/conversions.rs (around line 165) and
lib/api/src/grpc/conversions.rs (around lines 2185 and 2357) with checked
conversions (e.g., usize::try_from(...) and propagate/convert the error) to
avoid silent overflows on 32‑bit targets. Ensure strict‑mode enforcement in
src/common/update.rs (lines ~186–191) remains consistent with the chosen zero vs
None behavior (adjust checks to treat None as unlimited or 0 as disallowed), and
update docs: docs/redoc/master/openapi.json and docs/grpc/docs.md to clearly
document the final semantics of max_payload_index_count.

Comment on lines +186 to +197
if let Some(max_payload_index_count) = strict_mode_config.max_payload_index_count {
let collection_info = collection.info(&ShardSelectorInternal::All).await?;
if collection_info.payload_schema.len() >= max_payload_index_count {
return Err(CollectionError::strict_mode(
format!(
"Collection already has the maximum number of payload indices ({max_payload_index_count})"
),
"Please delete an existing index before creating a new one.",
));
}
}
Ok(())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

❓ Verification inconclusive

Don’t block re-indexing the same field; count only net-new indexes.

Today the check rejects any create when count >= limit, even if the target field is already indexed (no net increase). This can produce unnecessary StrictMode errors and prevents changing the schema of an existing index at the limit.

Refine the condition to allow updates on an already indexed field and improve the error to include current/limit for faster debugging.

Apply this diff:

-        if let Some(max_payload_index_count) = strict_mode_config.max_payload_index_count {
-            let collection_info = collection.info(&ShardSelectorInternal::All).await?;
-            if collection_info.payload_schema.len() >= max_payload_index_count {
-                return Err(CollectionError::strict_mode(
-                    format!(
-                        "Collection already has the maximum number of payload indices ({max_payload_index_count})"
-                    ),
-                    "Please delete an existing index before creating a new one.",
-                ));
-            }
-        }
+        if let Some(max_payload_index_count) = strict_mode_config.max_payload_index_count {
+            let collection_info = collection.info(&ShardSelectorInternal::All).await?;
+            let current = collection_info.payload_schema.len();
+            // Allow re-creating/updating an index on the same field without increasing the count
+            let already_indexed = collection_info
+                .payload_schema
+                .contains_key(&self.field_name);
+            if !already_indexed && current >= max_payload_index_count {
+                return Err(CollectionError::strict_mode(
+                    format!(
+                        "Collection would exceed the maximum payload index count: current={current}, limit={max_payload_index_count}"
+                    ),
+                    "Delete an existing index or raise the limit.",
+                ));
+            }
+        }

If payload_schema can contain non-indexed types (e.g., inferred schema without an index), we should count only indexed entries. Please confirm the semantics of payload_schema in CollectionInfo.


Refine strict_mode index count logic to allow re-indexing the same field
payload_schema contains only indexed entries, so its len() equals the number of payload indices. Checking contains_key(&self.field_name) and only erroring when adding a net-new index correctly enforces the limit. Apply the proposed diff.

🤖 Prompt for AI Agents
In src/common/update.rs around lines 186 to 197, the strict_mode check currently
blocks creating an index whenever payload_schema.len() >=
max_payload_index_count, preventing re-indexing the same field; change the logic
to allow re-indexing by first checking if the field already exists in
collection_info.payload_schema (contains_key(&self.field_name)) and only return
the strict_mode error when the field is not present AND payload_schema.len() >=
max_payload_index_count. Keep the same error message and help text; ensure you
await collection.info(...) as before and only perform the contains_key check
against the returned payload_schema before deciding to Err.

@agourlay agourlay force-pushed the strict-mode-max-payload-index-count branch from dad8e66 to 95d94e3 Compare September 15, 2025 09:18
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 15, 2025

📝 Walkthrough

Walkthrough

Adds a new optional field max_payload_index_count to StrictModeConfig across proto, gRPC types, OpenAPI/Redoc docs, internal segment/storage types, and conversion code. Introduces nested strict-mode types for multivector and sparse configurations with validations. Threads the new field through serialization, hashing, and validation registration. Implements a StrictModeVerification check for CreateFieldIndex that enforces max_payload_index_count and integrates it into the index-creation path. Adds an integration test exercising enforcement, dynamic limit updates, and behavior when strict mode is disabled.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • coszio
  • JojiiOfficial
  • ffuugoo

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title references strict-mode and adding a max payload index count, which accurately reflects the primary change (adding and enforcing max_payload_index_count when creating payload indexes). The phrasing is awkward and redundant ("number of" and "count"), so it could be tightened for clarity.
Description Check ✅ Passed The PR description is on-topic and explains the new strict-mode configuration field, when the check is applied, and the implementation tradeoff of fetching full collection info; it therefore relates directly to the changeset. It is concise but sufficiently informative for this lenient check.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch strict-mode-max-payload-index-count

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be30ad5 and 6805fd8.

📒 Files selected for processing (1)
  • tests/openapi/test_strictmode.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/openapi/test_strictmode.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: e2e-tests
  • GitHub Check: test-consistency
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: lint

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (5)
lib/api/src/grpc/qdrant.rs (1)

713-714: Clarify 0 vs None semantics and add min=1 validation; ensure safe usize conversion elsewhere

  • Decide whether Some(0) means “no indices allowed” or if “unlimited” should be None. If 0 is invalid, add a min=1 validator so it matches other strict-mode numeric caps.
  • Please add the validator in the source proto/generator (not this generated file) so the generated Rust includes it like below.
  • In conversions mapping this field to internal usize, replace as usize with usize::try_from(...) to avoid silent overflow on 32‑bit targets.

Apply via proto/generator so the generated Rust contains (illustrative):

-    #[prost(uint64, optional, tag = "19")]
-    pub max_payload_index_count: ::core::option::Option<u64>,
+    #[prost(uint64, optional, tag = "19")]
+    #[validate(range(min = 1))]
+    pub max_payload_index_count: ::core::option::Option<u64>,

Run to confirm no unsafe casts remain and to see all call sites:

#!/bin/bash
set -euo pipefail

echo "Occurrences of max_payload_index_count:"
rg -n -C2 --hidden --glob '!.git' 'max_payload_index_count'

echo
echo "Potentially unsafe u64 -> usize casts near this field:"
rg -n -C3 --hidden --glob '!.git' -t rs 'max_payload_index_count.*as\s+usize' || true
lib/segment/src/types.rs (1)

1030-1034: Allow 0 (to fully forbid indices) and fix pluralization.

Align with prior feedback: a minimum of 0 is useful to entirely prevent payload index creation. Also fix “index” → “indices”.

Apply this diff in both structs:

-    /// Max number of payload index in a collection
+    /// Max number of payload indices in a collection (0 forbids creating any payload indices)
     #[serde(skip_serializing_if = "Option::is_none")]
-    #[validate(range(min = 1))]
+    #[validate(range(min = 0))]
     pub max_payload_index_count: Option<usize>,

Also applies to: 1174-1178

src/common/update.rs (1)

180-199: Don’t block re‑indexing the same field; count only net‑new indexes.

If the target field is already indexed, allow the request even at the limit. Also improve the error with current/limit.

Apply:

-        if let Some(max_payload_index_count) = strict_mode_config.max_payload_index_count {
-            let collection_info = collection.info(&ShardSelectorInternal::All).await?;
-            if collection_info.payload_schema.len() >= max_payload_index_count {
-                return Err(CollectionError::strict_mode(
-                    format!(
-                        "Collection already has the maximum number of payload indices ({max_payload_index_count})"
-                    ),
-                    "Please delete an existing index before creating a new one.",
-                ));
-            }
-        }
+        if let Some(max_payload_index_count) = strict_mode_config.max_payload_index_count {
+            let collection_info = collection.info(&ShardSelectorInternal::All).await?;
+            let current = collection_info.payload_schema.len();
+            let already_indexed = collection_info
+                .payload_schema
+                .contains_key(&self.field_name);
+            if !already_indexed && current >= max_payload_index_count {
+                return Err(CollectionError::strict_mode(
+                    format!(
+                        "Collection would exceed the maximum payload index count: current={current}, limit={max_payload_index_count}"
+                    ),
+                    "Delete an existing index or raise the limit.",
+                ));
+            }
+        }
docs/grpc/docs.md (1)

1599-1600: Add missing description and fix pluralization for new field.

Use: “Max number of payload indexes in a collection.” Also update the proto comment (StrictModeConfig.max_payload_index_count) so this doc regenerates consistently.

-| max_payload_index_count | [uint64](#uint64) | optional |  |
+| max_payload_index_count | [uint64](#uint64) | optional | Max number of payload indexes in a collection. |
tests/openapi/test_strictmode.py (1)

2464-2464: Fix leading space in asserted error substrings (duplicate of prior review).

Leading space can cause false negatives; remove it.

-    assert " Collection already has the maximum number of payload indices (1). Help: Please delete an existing index before creating a new one." in response.json()['status']['error']
+    assert "Collection already has the maximum number of payload indices (1). Help: Please delete an existing index before creating a new one." in response.json()['status']['error']
@@
-    assert " Collection already has the maximum number of payload indices (2). Help: Please delete an existing index before creating a new one." in response.json()['status']['error']
+    assert "Collection already has the maximum number of payload indices (2). Help: Please delete an existing index before creating a new one." in response.json()['status']['error']

Also applies to: 2502-2502

🧹 Nitpick comments (4)
lib/api/src/grpc/proto/collections.proto (1)

377-378: Clarify semantics, fix wording, and document zero-value; verify safe downstream casts

  • Please add a brief comment clarifying behavior: applies only when strict mode is enabled; define what 0 means (reject vs unlimited) and keep this consistent across API, conversions, and tests.
  • Align wording with docs: use “payload indexes” (or “indices”) consistently.
  • Downstream maps this to Option; ensure u64→usize uses try_from/try_into (no unchecked as) and add a test for 0, None, and 32‑bit overflow cases.

Suggested proto tweak:

-  optional uint64 max_payload_index_count = 19;
+  optional uint64 max_payload_index_count = 19; // Max number of payload indexes per collection (effective only if strict mode is enabled). Define 0 semantics explicitly.
lib/storage/src/content_manager/conversions.rs (1)

165-166: Avoid lossy u64→usize cast (possible truncation on 32‑bit).

Consider a checked conversion to prevent silent overflow on 32‑bit targets.

-        max_payload_index_count: max_payload_index_count.map(|i| i as usize),
+        max_payload_index_count: max_payload_index_count.and_then(|i| usize::try_from(i).ok()),

If Qdrant only targets 64‑bit, please confirm and we can keep as‑casts for consistency.

docs/redoc/master/openapi.json (2)

7452-7458: Add max_payload_index_count to StrictModeConfigOutput — LGTM with a tiny wording fix

Schema, type, and constraints look consistent with the rest of strict-mode fields. Nit: pluralize “index” in the description for clarity.

-          "max_payload_index_count": {
-            "description": "Max number of payload index in a collection",
+          "max_payload_index_count": {
+            "description": "Max number of payload indexes in a collection",
             "type": "integer",
             "format": "uint",
             "minimum": 1,
             "nullable": true
           }

10006-10013: Apply pluralization fix to max_payload_index_count description

OpenAPI checks passed; update the description in docs/redoc/master/openapi.json (StrictModeConfig and StrictModeConfigOutput) to "Max number of payload indexes in a collection".

-          "max_payload_index_count": {
-            "description": "Max number of payload index in a collection",
+          "max_payload_index_count": {
+            "description": "Max number of payload indexes in a collection",
             "type": "integer",
             "format": "uint",
             "minimum": 1,
             "nullable": true
           }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4c4a7cd and 95d94e3.

📒 Files selected for processing (9)
  • docs/grpc/docs.md (1 hunks)
  • docs/redoc/master/openapi.json (2 hunks)
  • lib/api/src/grpc/conversions.rs (6 hunks)
  • lib/api/src/grpc/proto/collections.proto (1 hunks)
  • lib/api/src/grpc/qdrant.rs (1 hunks)
  • lib/segment/src/types.rs (6 hunks)
  • lib/storage/src/content_manager/conversions.rs (2 hunks)
  • src/common/update.rs (3 hunks)
  • tests/openapi/test_strictmode.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/storage/src/content_manager/conversions.rs
  • lib/api/src/grpc/qdrant.rs
  • lib/segment/src/types.rs
  • src/common/update.rs
  • lib/api/src/grpc/conversions.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/storage/src/content_manager/conversions.rs
  • lib/api/src/grpc/qdrant.rs
  • lib/segment/src/types.rs
  • src/common/update.rs
  • lib/api/src/grpc/conversions.rs
🧠 Learnings (2)
📚 Learning: 2025-08-15T11:42:00.297Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.

Applied to files:

  • lib/api/src/grpc/proto/collections.proto
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
PR: qdrant/qdrant#7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

  • lib/api/src/grpc/proto/collections.proto
  • lib/api/src/grpc/qdrant.rs
  • src/common/update.rs
🧬 Code graph analysis (2)
tests/openapi/test_strictmode.py (2)
tests/openapi/conftest.py (1)
  • collection_name (50-51)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
src/common/update.rs (4)
lib/collection/src/operations/verification/mod.rs (8)
  • check_strict_mode (154-166)
  • check_custom (41-47)
  • check_custom (246-259)
  • check_custom (262-279)
  • query_limit (50-50)
  • query_limit (285-287)
  • request_exact (62-62)
  • request_exact (281-283)
src/common/strict_mode.rs (4)
  • check_strict_mode (14-20)
  • check_strict_mode (47-56)
  • check_strict_mode (87-97)
  • check_strict_mode (137-146)
lib/collection/src/operations/types.rs (1)
  • strict_mode (1069-1072)
src/tonic/api/collections_api.rs (1)
  • wait_timeout (320-320)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: e2e-tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: test-consistency
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: lint
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (ubuntu-latest)
🔇 Additional comments (9)
lib/api/src/grpc/qdrant.rs (1)

713-714: Wire-compatible optional field addition looks good

New optional tag 19 is backward compatible and well placed after 18.

lib/segment/src/types.rs (1)

1060-1061: Propagation and hashing look correct.

Field is properly included in destructuring, hashing, and output conversion.

Also applies to: 1079-1080, 1201-1202, 1223-1224

src/common/update.rs (2)

2-2: Imports change LGTM.

New imports are required by the strict‑mode additions.

Also applies to: 10-10, 17-17, 22-23


813-825: Pre‑consensus strict‑mode check is good.

This gates index creation early and keeps the flow consistent with other updates.

lib/storage/src/content_manager/conversions.rs (1)

144-145: LGTM: field is correctly threaded from API.

Destructuring and plumbing of max_payload_index_count look correct.

tests/openapi/test_strictmode.py (1)

2417-2520: Add payload-index-count strict-mode test: LGTM

Covers initial limit, dynamic increase, and disable path. Good use of wait=true to avoid flakiness.

lib/api/src/grpc/conversions.rs (3)

2141-2187: Propagate max_payload_index_count (API → internal): LGTM

Casts to usize mirror existing fields. Confirm we don’t target 32‑bit builds; u64→usize cast would truncate there.


2266-2310: Round‑trip to gRPC output includes new field: LGTM

Types and casts (usize→u64) are consistent with adjacent fields.


2313-2359: Propagation into StrictModeConfigOutput (internal): LGTM

Symmetric with other conversions; no gaps spotted.

Comment on lines +813 to +825
// Default consensus timeout will be used
let wait_timeout: Option<Duration> = None; // ToDo: make it configurable

// Check strict mode before submitting consensus operation
let pass = check_strict_mode(
&operation,
wait_timeout.map(|d| d.as_secs() as usize),
&collection_name,
&dispatcher,
&access,
)
.await?;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Race window: enforce the limit at apply time too.

Two concurrent creates can both pass the pre‑check and exceed the limit after sequencing. Add the same check in the apply path of CollectionMetaOperations::CreatePayloadIndex (storage/content_manager) so the second op fails deterministically post‑ordering.

Also applies to: 180-199

🤖 Prompt for AI Agents
In src/common/update.rs around lines 813-825 (and similarly for the block at
lines 180-199), the pre-check for collection limits can race: two concurrent
CreatePayloadIndex ops can both pass here and later exceed the limit after
sequencing; add the same limit enforcement in the apply path inside
CollectionMetaOperations::CreatePayloadIndex in storage/content_manager so the
second op fails deterministically post-ordering. Implement this by re-checking
the collection state under the same mutation/lock used when applying operations
(or by calling the shared check_strict_mode logic or a new helper that validates
current metadata and capacity) and return the appropriate error if applying the
index would exceed the limit; ensure the check runs immediately before mutating
the meta, uses the same parameters as the pre-check, and add/update unit tests
to cover the race case.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
tests/openapi/test_strictmode.py (1)

2417-2520: Good coverage for max_payload_index_count; assertions look stable.

Checks enable→enforce→raise limit→enforce→disable flow; fixed prior leading-space issue in error substrings.

Consider adding a follow-up when policy changes: a test for max_payload_index_count=0 (reject or “disabled”) to lock semantics.

lib/api/src/grpc/qdrant.rs (1)

673-734: Strict-mode limits: add validation, clarify 0 vs None semantics, and fix one docstring

  • Add validation to prevent accidental “disable everything” by setting 0:
    • max_payload_index_count, upsert_max_batchsize, search_max_hnsw_ef ≥ 1
    • search_max_oversampling ≥ 1.0
  • Clarify whether 0 means “disabled/unlimited” or is invalid for all counters (max_query_limit, max_timeout, read_rate_limit, write_rate_limit, max_payload_index_count). If 0 is invalid, add min validations; if it means “unlimited,” document it explicitly.
  • Docstring at Line 711 describes “vector storage” but the field is max_collection_payload_size_bytes. Fix the text to “payload storage.”

Apply via proto/build so prost emits the attributes; illustrative diff on the generated struct:

     /// Max HNSW ef value allowed in search parameters.
     #[prost(uint32, optional, tag = "6")]
+    #[validate(range(min = 1))]
     pub search_max_hnsw_ef: ::core::option::Option<u32>,
@@
     /// Max oversampling value allowed in search
     #[prost(float, optional, tag = "8")]
+    #[validate(range(min = 1.0))]
     pub search_max_oversampling: ::core::option::Option<f32>,
@@
     /// Max batchsize when upserting
     #[prost(uint64, optional, tag = "9")]
+    #[validate(range(min = 1))]
     pub upsert_max_batchsize: ::core::option::Option<u64>,
@@
-    /// Max size of a collections vector storage in bytes, ignoring replicas.
+    /// Max size of a collection's payload storage in bytes, ignoring replicas.
     #[prost(uint64, optional, tag = "13")]
     pub max_collection_payload_size_bytes: ::core::option::Option<u64>,
@@
     /// Max number of payload indexes in a collection
     #[prost(uint64, optional, tag = "19")]
+    #[validate(range(min = 1))]
     pub max_payload_index_count: ::core::option::Option<u64>,
🧹 Nitpick comments (4)
lib/api/build.rs (1)

39-54: Use explicit From per guideline (avoid Into::into).

Follow repo style: prefer SomeType::from(x) over x.into().

-        option_env!("GIT_COMMIT_ID").map(Into::into).or_else(|| {
+        option_env!("GIT_COMMIT_ID").map(String::from).or_else(|| {
lib/api/src/grpc/proto/collections.proto (2)

370-370: Fix comment: payload vs vector storage.

The field is max_collection_payload_size_bytes; the comment mentions “vector storage.”

-  optional uint64 max_collection_payload_size_bytes = 13; // Max size of a collections vector storage in bytes, ignoring replicas.
+  optional uint64 max_collection_payload_size_bytes = 13; // Max size of a collection's payload storage in bytes, ignoring replicas.

376-376: Wording nit: “indexes” vs “indices.”

Tests and errors use “indices.” Consider unifying wording across API docs/comments.

docs/redoc/master/openapi.json (1)

7452-7459: Add StrictModeConfigOutput.max_payload_index_count — looks good; regen from source

Schema shape (integer, uint, min=0, nullable) is fine. Please ensure this JSON was regenerated from the source-of-truth (lib/api/src/rest/schema.rs and/or proto), not edited directly. Also, align nearby description wording with the input schema for consistency:

  • search_max_hnsw_ef: Output currently reads “Max HNSW value…”; input says “Max HNSW ef value…”.
  • multivector_config/sparse_config: Output uses “configuration”; input uses “strict mode configuration”.
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95d94e3 and beb689f.

📒 Files selected for processing (7)
  • docs/grpc/docs.md (3 hunks)
  • docs/redoc/master/openapi.json (5 hunks)
  • lib/api/build.rs (1 hunks)
  • lib/api/src/grpc/proto/collections.proto (1 hunks)
  • lib/api/src/grpc/qdrant.rs (2 hunks)
  • lib/segment/src/types.rs (7 hunks)
  • tests/openapi/test_strictmode.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • lib/segment/src/types.rs
  • docs/grpc/docs.md
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/api/build.rs
  • lib/api/src/grpc/qdrant.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/api/src/grpc/qdrant.rs
🧠 Learnings (4)
📚 Learning: 2025-08-15T11:42:00.297Z
Learnt from: IvanPleshkov
PR: qdrant/qdrant#7043
File: lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs:86-90
Timestamp: 2025-08-15T11:42:00.297Z
Learning: In lib/segment/src/vector_storage/quantized/quantized_mmap_storage.rs, overflow protection for encoded_storage_size computation (quantized_vector_size * vectors_count) is implemented in PR #7048, using checked_mul with u64 casting to prevent silent overflow.

Applied to files:

  • lib/api/src/grpc/proto/collections.proto
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
PR: qdrant/qdrant#7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

  • lib/api/src/grpc/proto/collections.proto
  • lib/api/src/grpc/qdrant.rs
📚 Learning: 2025-08-10T18:26:12.443Z
Learnt from: generall
PR: qdrant/qdrant#7006
File: docs/redoc/master/openapi.json:13645-13652
Timestamp: 2025-08-10T18:26:12.443Z
Learning: In qdrant, docs/redoc/master/openapi.json is auto-generated from the REST schemas. To change field docs, edit lib/api/src/rest/schema.rs (e.g., add doc comments or #[schemars(description = ...)]). Specifically, UpdateVectors.update_filter lacked a description and should state: "If specified, only update vectors for points that match this filter; points not matching the filter are left unchanged."

Applied to files:

  • docs/redoc/master/openapi.json
📚 Learning: 2025-08-10T18:26:33.017Z
Learnt from: generall
PR: qdrant/qdrant#7006
File: docs/redoc/master/openapi.json:10393-10401
Timestamp: 2025-08-10T18:26:33.017Z
Learning: In qdrant, docs/redoc/master/openapi.json is auto-generated. Any description changes must be made in the source (e.g., lib/api/src/rest/schema.rs and/or lib/api/src/grpc/proto/points.proto), not edited directly in the JSON.

Applied to files:

  • docs/redoc/master/openapi.json
🧬 Code graph analysis (2)
tests/openapi/test_strictmode.py (2)
tests/openapi/conftest.py (1)
  • collection_name (50-51)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
lib/api/src/grpc/qdrant.rs (1)
lib/api/src/grpc/validate.rs (8)
  • validate (14-14)
  • validate (19-21)
  • validate (29-31)
  • validate (39-41)
  • validate (49-54)
  • validate (58-64)
  • validate (68-74)
  • validate (78-85)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: integration-tests-consensus
  • GitHub Check: integration-tests
  • GitHub Check: e2e-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: test-consistency
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: lint
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: storage-compat-test
  • GitHub Check: rust-tests (ubuntu-latest)
🔇 Additional comments (6)
docs/redoc/master/openapi.json (1)

9912-9920: Add StrictModeConfig.max_payload_index_count — LGTM; confirm semantics and type mapping

  • Min=0 allows forbidding payload indexes entirely; confirm that’s intended.
  • OpenAPI uses "format": "uint" while proto field is uint64; verify our generator maps this consistently across conversions and docs.
  • As this file is auto-generated, please verify the change lives in proto/REST schema and re-run OpenAPI generation.

Also applies to: 9986-10013

lib/api/src/grpc/qdrant.rs (5)

720-723: Nested validation for multivector_config looks good

LGTM; #[validate(nested)] is correct here.


724-727: Nested validation for sparse_config looks good

LGTM.


728-731: min=1 on max_points_count is appropriate

LGTM; consistent with intended semantics.


752-756: Sparse max_length validation is correct

LGTM; min=1 is sensible.


774-778: Multivector max_vectors validation is correct

LGTM.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
docs/grpc/docs.md (1)

1599-1600: Doc entry for max_payload_index_count looks good.

Description is concise and clear; aligns with earlier feedback.

lib/api/build.rs (1)

168-176: Missing validation for StrictModeConfig.max_payload_index_count (define >0 or 0-policy).

You added validators for several strict-mode fields but not for max_payload_index_count. Without validation, 0/negative handling is ambiguous across layers. Either enforce >0 here, or clearly define 0 as “disabled” and normalize it in conversions (and tests). This was flagged earlier and still applies.

Apply in this file:

             ("StrictModeConfig.max_points_count", "range(min = 1)"),
             ("StrictModeConfig.read_rate_limit", "range(min = 1)"),
             ("StrictModeConfig.write_rate_limit", "range(min = 1)"),
             ("StrictModeConfig.multivector_config", ""),
             ("StrictModeConfig.sparse_config", ""),
             ("StrictModeSparseConfig.sparse_config", ""),
             ("StrictModeSparse.max_length", "range(min = 1)"),
             ("StrictModeMultivectorConfig.multivector_config", ""),
             ("StrictModeMultivector.max_vectors", "range(min = 1)"),
+            ("StrictModeConfig.max_payload_index_count", "range(min = 1)"),

If 0 should mean “disabled,” don’t add min=1 here; instead, normalize 0→None in the API→internal conversions and document it.

Run to confirm absence/presence across the repo:

#!/bin/bash
rg -n "max_payload_index_count" -C2
rg -nP 'StrictModeConfig\.max_payload_index_count",\s*"range' lib/api/build.rs
lib/api/src/grpc/qdrant.rs (1)

673-737: Add validation (or document & normalize 0 vs None) for StrictModeConfig.max_payload_index_count.

Current state: the generated proto struct (lib/api/src/grpc/qdrant.rs) has no validation; lib/segment/src/types.rs allows 0 (#[validate(range(min = 0))]) and conversions simply cast Some(0) -> 0 (lib/storage/src/content_manager/conversions.rs, lib/api/src/grpc/conversions.rs). src/common/update.rs treats Some(0) as a hard zero-limit (blocking index creation). Either enforce min=1 across the API/type layers or explicitly document that 0 means “disable indices” and change conversions to map 0 -> None consistently.

Apply after regenerating protos (example diff):

-    #[prost(uint64, optional, tag = "19")]
-    pub max_payload_index_count: ::core::option::Option<u64>,
+    #[prost(uint64, optional, tag = "19")]
+    #[validate(range(min = 1))]
+    pub max_payload_index_count: ::core::option::Option<u64>,

Files to touch: lib/api/build.rs (add validation mapping and regenerate), lib/api/src/grpc/qdrant.rs (generated), lib/segment/src/types.rs (align min), and conversion modules (lib/storage/src/content_manager/conversions.rs, lib/api/src/grpc/conversions.rs) if you choose the “0 -> None” semantics.

🧹 Nitpick comments (2)
docs/grpc/docs.md (1)

1590-1590: Tiny wording/grammar polish for consistency.

Prefer these edits (and update the proto comments so regeneration keeps them consistent):

-| upsert_max_batchsize | [uint64](#uint64) | optional | Max batchsize when upserting |
+| upsert_max_batchsize | [uint64](#uint64) | optional | Max batch size when upserting. |

-| max_vectors | [uint64](#uint64) | optional | Max number of vectors in a multivector |
+| max_vectors | [uint64](#uint64) | optional | Max number of vectors in a multi-vector. |

-| max_length | [uint64](#uint64) | optional | Max length of sparse vector |
+| max_length | [uint64](#uint64) | optional | Max length of a sparse vector. |

Also applies to: 1614-1614, 1660-1660

lib/api/src/grpc/qdrant.rs (1)

697-699: Type consistency: consider double (f64) for search_max_oversampling.

QuantizationSearchParams.oversampling is f64; making this double/f64 keeps types aligned across the API. Safe to tweak now before release if wire compatibility is still flexible.

-    #[prost(float, optional, tag = "8")]
-    pub search_max_oversampling: ::core::option::Option<f32>,
+    #[prost(double, optional, tag = "8")]
+    pub search_max_oversampling: ::core::option::Option<f64>,
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between beb689f and ed959c3.

📒 Files selected for processing (4)
  • docs/grpc/docs.md (3 hunks)
  • lib/api/build.rs (2 hunks)
  • lib/api/src/grpc/proto/collections.proto (1 hunks)
  • lib/api/src/grpc/qdrant.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/api/src/grpc/proto/collections.proto
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • lib/api/build.rs
  • lib/api/src/grpc/qdrant.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • lib/api/src/grpc/qdrant.rs
🧠 Learnings (2)
📚 Learning: 2025-08-15T15:56:36.821Z
Learnt from: coszio
PR: qdrant/qdrant#7065
File: lib/collection/src/operations/universal_query/shard_query.rs:0-0
Timestamp: 2025-08-15T15:56:36.821Z
Learning: Internal gRPC APIs in Qdrant don't require additional runtime validation when proto-level validation is already configured via build.rs range constraints.

Applied to files:

  • lib/api/build.rs
📚 Learning: 2025-08-10T18:30:02.986Z
Learnt from: generall
PR: qdrant/qdrant#7006
File: lib/collection/src/operations/verification/update.rs:158-174
Timestamp: 2025-08-10T18:30:02.986Z
Learning: In Qdrant's strict mode verification code (lib/collection/src/operations/verification/update.rs), exhaustive pattern matching without `..` is intentionally used for structs like PointsBatch and PointsList. This design pattern ensures compilation fails when new fields are added, forcing developers to explicitly consider how new fields should be handled in the indexed_filter_write method. This provides visibility and compile-time safety for struct evolution.

Applied to files:

  • lib/api/src/grpc/qdrant.rs
🧬 Code graph analysis (2)
lib/api/build.rs (1)
lib/segment/src/types.rs (16)
  • from (139-141)
  • from (199-201)
  • from (255-260)
  • from (264-269)
  • from (807-809)
  • from (813-815)
  • from (819-821)
  • from (866-875)
  • from (879-882)
  • from (920-929)
  • from (941-944)
  • from (1181-1225)
  • from (1624-1626)
  • from (1630-1632)
  • from (1636-1638)
  • from (1651-1656)
lib/api/src/grpc/qdrant.rs (1)
lib/api/src/grpc/validate.rs (8)
  • validate (14-14)
  • validate (19-21)
  • validate (29-31)
  • validate (39-41)
  • validate (49-54)
  • validate (58-64)
  • validate (68-74)
  • validate (78-85)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: test-consensus-compose
  • GitHub Check: lint
  • GitHub Check: e2e-tests
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests
  • GitHub Check: integration-tests-consensus
  • GitHub Check: storage-compat-test
🔇 Additional comments (5)
lib/api/build.rs (1)

39-49: Env var mapping change is fine (explicit String::from).

Explicit String::from over Into::into matches our Rust guideline. No issues.

lib/api/src/grpc/qdrant.rs (4)

738-749: LGTM: nested validation on sparse_config map is correct.


750-759: LGTM: StrictModeSparse.max_length guarded with min=1.


760-771: LGTM: nested validation on multivector_config map is correct.


772-781: LGTM: StrictModeMultivector.max_vectors guarded with min=1.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
tests/openapi/test_strictmode.py (2)

2419-2434: Good boundary test for zero limit.

Including max_payload_index_count = 0 validates the lower bound and addresses the prior request to cover this case.


2451-2452: LGTM: error substring assertions no longer have stray leading spaces.

Matches the server message precisely and prevents false negatives noted earlier.

Also applies to: 2499-2500, 2537-2538

🧹 Nitpick comments (3)
tests/openapi/test_strictmode.py (3)

2435-2447: Fix misleading comment: no indices allowed here (max=0).

At this point the collection has 0 payload indexes; the failure is due to the limit being 0, not “1 index already”.

-    # should fail now with 1 index already
+    # should fail now because no payload indices are allowed (max=0)

2501-2507: Optionally assert config after raising the limit to 2.

Adds a quick GET to confirm the update before proceeding.

     set_strict_mode(collection_name, {
         "enabled": True,
         "max_payload_index_count": 2,
     })
 
+    # verify config reflects the new limit
+    response = request_with_validation(
+        api='/collections/{collection_name}',
+        method="GET",
+        path_params={'collection_name': collection_name},
+    )
+    assert response.ok
+    cfg = response.json()['result']['config']['strict_mode_config']
+    assert cfg['max_payload_index_count'] == 2

2539-2542: Optionally assert strict mode is disabled before the final create.

Tiny sanity check prior to the success case.

     set_strict_mode(collection_name, {
         "enabled": False,
     })
 
+    assert not strict_mode_enabled(collection_name)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ed959c3 and be30ad5.

📒 Files selected for processing (1)
  • tests/openapi/test_strictmode.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/openapi/test_strictmode.py (1)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: e2e-tests
  • GitHub Check: test-consensus-compose
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: integration-tests-consensus
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests
  • GitHub Check: storage-compat-test
  • GitHub Check: lint

Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patches!

@agourlay agourlay merged commit bc13f01 into dev Sep 15, 2025
16 checks passed
@agourlay agourlay deleted the strict-mode-max-payload-index-count branch September 15, 2025 11:12
timvisee pushed a commit that referenced this pull request Sep 29, 2025
* [strict-mode] Add max number of payload index count

* improve docs and validation for StrictModeConfig

* fix from coderabbit

* test blocking access to payload indexes

* polish test
@timvisee timvisee mentioned this pull request Sep 29, 2025
@agourlay agourlay added this to the Strict mode milestone Nov 12, 2025
@timvisee timvisee mentioned this pull request Nov 14, 2025
@coderabbitai coderabbitai bot mentioned this pull request Jan 7, 2026
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants