Skip to content

Fix missing REST API validation for point batch update#7219

Merged
agourlay merged 2 commits intodevfrom
validate-batch-update-operations
Sep 5, 2025
Merged

Fix missing REST API validation for point batch update#7219
agourlay merged 2 commits intodevfrom
validate-batch-update-operations

Conversation

@agourlay
Copy link
Member

@agourlay agourlay commented Sep 5, 2025

The validation is completely missing on the point batch update operations.

The new test fails without the fix.

Backported from #7216

"update_filter": {
"must": [
{
"key": "empty" # empty field conditions are forbidden
Copy link
Member Author

@agourlay agourlay Sep 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a malformed filter is an easy test for the missing validation call.

But keep in mind that the entire Update enum in depth is not validated.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 5, 2025

📝 Walkthrough

Walkthrough

Added element-wise validation to UpdateOperations by annotating its operations: Vec<UpdateOperation> field with #[validate(nested)] in src/common/update.rs. Added a new test test_batch_update_validation in tests/openapi/test_batch_update_multivec.py that sends a batch update with an invalid empty field condition and asserts a 422 response containing validation error messages.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers

  • timvisee
  • coszio
  • generall
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch validate-batch-update-operations

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/openapi/test_batch_update_multivec.py (2)

46-50: Stabilize test by avoiding incidental vector-dimension errors

Use a 4D "image" vector (as elsewhere in this suite) so the failure is strictly due to the invalid field condition, not potential dimension checks if validation order changes.

Apply:

-                                "image": [1.0, 0.0, 9.0],
+                                "image": [1.0, 0.0, 9.0, 1.0],

65-68: Quieten Ruff S101 in tests and add failure context

Tests commonly use bare asserts; silence S101 locally and include response text for faster diagnosis.

Apply:

-    assert response.status_code == 422
+    assert response.status_code == 422, response.text  # noqa: S101
-    error = response.json()["status"]["error"]
-    assert "Validation error in JSON body" in error
-    assert "At least one field condition must be specified" in error
+    error = response.json()["status"]["error"]
+    assert "Validation error in JSON body" in error  # noqa: S101
+    assert "At least one field condition must be specified" in error  # noqa: S101
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1e1d683 and a9520af.

📒 Files selected for processing (2)
  • src/common/update.rs (1 hunks)
  • tests/openapi/test_batch_update_multivec.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/*.rs: Prefer explicit SomeType::from(x) over implicit x.into() in Rust code
Do not use transmute_from_u8, transmute_to_u8, transmute_from_u8_to_slice, transmute_from_u8_to_mut_slice, transmute_to_u8_slice in new code; use bytemuck or zerocopy instead

Files:

  • src/common/update.rs
**/src/**/*.rs

📄 CodeRabbit inference engine (.github/review-rules.md)

**/src/**/*.rs: Prefer exhaustive match arms over a catch-all _ arm to avoid missing new enum variants (except in tests/benchmarks or when provably safe)
Prefer explicit field ignoring with : _ over .. in struct patterns (except in tests/benchmarks or when provably safe)

Files:

  • src/common/update.rs
🧠 Learnings (1)
📓 Common learnings
Learnt from: generall
PR: qdrant/qdrant#7006
File: lib/collection/src/operations/point_ops.rs:501-528
Timestamp: 2025-08-10T18:31:56.855Z
Learning: In Qdrant, batch operations validate that `ids`, `vectors`, and `payloads` (if present) have matching lengths at the REST API level in `lib/api/src/rest/validate.rs` through the `Validate` trait implementation for `Batch`. This validation happens before data is converted to internal structures like `PointInsertOperationsInternal`, so methods operating on these internal structures can safely assume the lengths match.
🧬 Code graph analysis (2)
tests/openapi/test_batch_update_multivec.py (2)
tests/openapi/conftest.py (1)
  • collection_name (50-51)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
src/common/update.rs (1)
lib/api/src/grpc/validate.rs (11)
  • validate (14-14)
  • validate (19-21)
  • validate (29-31)
  • validate (39-41)
  • validate (49-54)
  • validate (58-64)
  • validate (68-74)
  • validate (78-85)
  • validate (89-97)
  • validate (101-112)
  • validate (116-123)
🪛 Ruff (0.12.2)
tests/openapi/test_batch_update_multivec.py

65-65: Use of assert detected

(S101)


67-67: Use of assert detected

(S101)


68-68: Use of assert detected

(S101)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: integration-tests
  • GitHub Check: lint
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: e2e-tests
  • GitHub Check: test-consistency
  • GitHub Check: storage-compat-test
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests-consensus
🔇 Additional comments (1)
src/common/update.rs (1)

74-76: Verify REST batch endpoint uses UpdateOperations for request body
I couldn’t find any reference to UpdateOperations in the REST handler; ensure the batch update endpoint actually deserializes into this struct so the #[validate(nested)] on operations: Vec<UpdateOperation> takes effect.

Copy link
Member

@timvisee timvisee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the test!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
tests/openapi/test_batch_update_multivec.py (1)

36-64: Consider broadening negative cases (other clauses/ops) to ensure full batch coverage.

Add a couple of variants (e.g., invalid FieldCondition under should/must_not) or mix valid+invalid operations in the same batch to ensure the first failing element is reported and the whole batch is rejected. This complements the note that not all nested variants of Update may be validated deeply.

🧹 Nitpick comments (3)
tests/openapi/test_batch_update_multivec.py (3)

56-56: Clarify the inline comment to reflect why the condition is invalid.

The issue is that FieldCondition lacks any matcher; the payload key name “empty” is incidental.

Apply this diff:

-                                "key": "empty" # empty field conditions are forbidden
+                                "key": "empty"  # invalid FieldCondition: missing any matcher (e.g., match/range); 'key' alone is insufficient

65-68: Make assertion less brittle against wording changes in the top-level error wrapper.

The specific phrase “in JSON body” may drift; keep the semantic check but relax wording.

Apply this diff:

-    assert "Validation error in JSON body" in error
+    assert "validation error" in error.lower()

65-68: Ruff S101 on asserts in tests: silence or configure per-file ignore.

Asserts are idiomatic in pytest. Either ignore S101 for tests repo-wide or add a per-file override.

Add at the top of this test file:

# ruff: noqa: S101

Or in pyproject.toml:

[tool.ruff.lint.per-file-ignores]
"tests/**" = ["S101"]
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a9520af and 8598d5a.

📒 Files selected for processing (1)
  • tests/openapi/test_batch_update_multivec.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: generall
PR: qdrant/qdrant#7006
File: lib/collection/src/operations/point_ops.rs:501-528
Timestamp: 2025-08-10T18:31:56.855Z
Learning: In Qdrant, batch operations validate that `ids`, `vectors`, and `payloads` (if present) have matching lengths at the REST API level in `lib/api/src/rest/validate.rs` through the `Validate` trait implementation for `Batch`. This validation happens before data is converted to internal structures like `PointInsertOperationsInternal`, so methods operating on these internal structures can safely assume the lengths match.
🧬 Code graph analysis (1)
tests/openapi/test_batch_update_multivec.py (1)
tests/openapi/helpers/helpers.py (1)
  • request_with_validation (41-116)
🪛 Ruff (0.12.2)
tests/openapi/test_batch_update_multivec.py

65-65: Use of assert detected

(S101)


67-67: Use of assert detected

(S101)


68-68: Use of assert detected

(S101)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: storage-compat-test
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-consensus-compose
  • GitHub Check: integration-tests
  • GitHub Check: e2e-tests
  • GitHub Check: test-consistency
  • GitHub Check: integration-tests-consensus
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: lint
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
🔇 Additional comments (1)
tests/openapi/test_batch_update_multivec.py (1)

36-64: Good negative coverage for nested UpdateOperation validation (422 on malformed filter).

This directly guards the regression the PR addresses and aligns with API-level validation occurring before internal conversion (per prior learnings). Nice.

@agourlay agourlay merged commit f30e7f7 into dev Sep 5, 2025
16 checks passed
@agourlay agourlay deleted the validate-batch-update-operations branch September 5, 2025 16:25
timvisee pushed a commit that referenced this pull request Sep 29, 2025
* Fix missing REST API validation for point batch update

* use a valid vector
@timvisee timvisee mentioned this pull request Sep 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants