Conversation
📝 Walkthrough## Walkthrough
This change introduces support for Maximal Marginal Relevance (MMR) re-ranking in nearest neighbor search queries across the API, protobuf definitions, validation, REST and gRPC schemas, and internal query handling. New message types and fields for MMR are added to the API schemas and protobufs, including optional parameters for diversity (trade-off between relevance and diversity) and candidates_limit. The query planning and collection logic are updated to handle the new MMR-based search variant, including rescoring and result filtering with score thresholding integrated into the MMR algorithm. Documentation and comments are updated to use "Maximal" instead of "Maximum" Marginal Relevance. No existing fields or message definitions are removed; the changes are additive.
## Possibly related issues
- qdrant/qdrant-client#1017: This issue requests MMR support in the qdrant-client library; while this PR focuses on server-side MMR integration, it addresses complementary objectives related to MMR query handling.
## Possibly related PRs
- qdrant/qdrant#6786: Adds internal support for MMR scoring at the collection level with new internal protobuf messages and query handling, complementing this PR's API and query structure changes for MMR.
- qdrant/qdrant#6768: Implements the internal MMR algorithm and rescoring logic at the collection and shard levels, directly related to this PR’s API and query structure additions for exposing MMR functionality.
## Suggested reviewers
- timvisee
- generall📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (13)
🚧 Files skipped from review as they are similar to previous changes (12)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
🔇 Additional comments (7)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (12)
lib/collection/src/collection/mmr.rs (1)
237-249: Doc/function naming mismatch – consider aligning terminologyGreat to see the wording updated to “Maximal Marginal Relevance”.
However, the public function is still namedmaximum_marginal_relevance, which now contradicts the docs and the rest of the project (all proto/REST updates use maximal). Renaming the fn (plus its callers & tests) will avoid future confusion.-pub fn maximum_marginal_relevance( +pub fn maximal_marginal_relevance(If the rename is too noisy for this PR, at least add a
#[deprecated(note = "use maximal_marginal_relevance")]wrapper and introduce the correctly-named alias.src/common/inference/batch_processing.rs (1)
316-320: Unnecessary field initialisation
NearestQuery::mmrdefaults toNone, so the explicitmmr: Noneis redundant in this test construction.
Feel free to keep it for clarity though.tools/generate_grpc_docs.sh (1)
3-10: Shell script portability & temp-dir location
set -exwill dump every command into CI logs; if logs contain secrets (e.g., Docker creds) consider gating-xbehind a verbose flag.mktemp -d ./qdrant_docs.XXXXXXXXXXcreates the dir in the repo root. Parallel runs (e.g., CI matrix) could still collide. Using$TMPDIRis safer.- TEMPD=$(mktemp -d ./qdrant_docs.XXXXXXXXXX) + TEMPD=$(mktemp -d "${TMPDIR:-/tmp}/qdrant_docs.XXXXXXXXXX")src/common/inference/query_requests_rest.rs (1)
199-216: REST MMR conversion logic is correct but could be simplified.The logic correctly branches between
NearestWithMmrandNearestvariants based on the presence of themmrfield. However, the long module path on line 208 could be simplified.Consider adding an import alias to simplify the code:
use collection::operations::universal_query::collection_query::{ - CollectionPrefetch, CollectionQueryGroupsRequest, CollectionQueryRequest, Mmr, Query, + CollectionPrefetch, CollectionQueryGroupsRequest, CollectionQueryRequest, Mmr, NearestWithMmr, Query, VectorInputInternal, VectorQuery, };Then update line 208:
- Ok(Query::Vector(VectorQuery::NearestWithMmr( - collection::operations::universal_query::collection_query::NearestWithMmr { - nearest: vector, - mmr, - }, - ))) + Ok(Query::Vector(VectorQuery::NearestWithMmr(NearestWithMmr { + nearest: vector, + mmr, + })))docs/redoc/master/openapi.json (1)
14670-14677: Missing default & questionablenullableonlambda
lambdais documented as “default 0.5”, yet nodefaultfield is present.
Also, making the valuenullablecontradicts the statement that it “must be in the range [0, 1]”."lambda": { ... - "nullable": true + "default": 0.5 },If
nullis truly acceptable, adjust the docstring to say so and keep the constraint logic consistent.lib/api/src/grpc/proto/points.proto (1)
643-650: Clarify optionality & presence semantics ofmmrfieldThe top-level comment says the MMR block is optional, but the field is declared as a plain
Mmr mmr = 2;(nooptional).
While message-typed fields are implicitly optional in proto3, readers may read the comment and look for theoptionalkeyword.
Consider adding the keyword for consistency with the scalar fields directly below, e.g.- Mmr mmr = 2; + optional Mmr mmr = 2;This keeps the public schema self-documenting and avoids the “is it required?” question.
lib/api/src/grpc/qdrant.rs (1)
5597-5599: Fix inconsistent comment formatting.The comment formatting uses
/// /instead of the standard///format used elsewhere in the codebase.- /// / The lambda parameter for the MMR algorithm. - /// / Determines the balance between diversity and relevance. - /// / - /// / A higher value favors relevance (similarity to the query vector), while a lower value favors diversity. + /// The lambda parameter for the MMR algorithm. + /// Determines the balance between diversity and relevance. + /// + /// A higher value favors relevance (similarity to the query vector), while a lower value favors diversity.docs/grpc/docs.md (4)
194-202: Fix unordered-list indentation in TOCThe two newly-added lines are indented with four spaces, whereas the surrounding list items use two. This is what triggers the markdown-lint MD007 warnings.
- - [Mmr](#qdrant-Mmr) - - [NearestInputWithMmr](#qdrant-NearestInputWithMmr) + - [Mmr](#qdrant-Mmr) + - [NearestInputWithMmr](#qdrant-NearestInputWithMmr)
3390-3401: Table row is split → broken pipes & linter error (MD055)The
candidate_limitrow is rendered as two separate lines, so the table loses its trailing pipe and breaks:| candidate_limit | [uint32](#uint32) | optional | The maximum number of candidates to consider for re-ranking. If not specified, the `limit` value is used. |Join the description onto one line (or insert
<br>), then keep the closing|:-| candidate_limit | [uint32](#uint32) | optional | The maximum number of candidates to consider for re-ranking. -If not specified, the `limit` value is used. | +| candidate_limit | [uint32](#uint32) | optional | The maximum number of candidates to consider for re-ranking.<br/>If not specified, the `limit` value is used. |
3395-3399: Remove stray “/” characters in thelambdadescriptionThe slashes render literally in Markdown and confuse the sentence:
-...parameter for the MMR algorithm. / Determines the balance between diversity and relevance. / / A higher value... +...parameter for the MMR algorithm. Determines the balance between diversity and relevance. A higher value...
3514-3524: Minor wording tweak forNearestInputWithMmrdescriptionConsider “re-rank using the same vector” → “re-rank using that vector” to avoid ambiguity and keep wording tight.
No functional issues; purely editorial.
lib/collection/src/operations/universal_query/collection_query.rs (1)
129-139: Consider usingcopied()instead ofmap(|x| **x)for clarity.The implementation is correct, but you can make it slightly more idiomatic.
pub fn get_referenced_ids(&self) -> Vec<PointIdType> { match self { Self::Vector(vector_query) => vector_query .get_referenced_ids() .iter() - .map(|x| **x) + .copied() .collect(), Self::Fusion(_) | Self::OrderBy(_) | Self::Formula(_) | Self::Sample(_) => Vec::new(), } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
docs/grpc/docs.md(4 hunks)docs/redoc/master/openapi.json(1 hunks)lib/api/build.rs(1 hunks)lib/api/src/grpc/proto/points.proto(2 hunks)lib/api/src/grpc/proto/points_internal_service.proto(1 hunks)lib/api/src/grpc/qdrant.rs(3 hunks)lib/api/src/grpc/validate.rs(1 hunks)lib/api/src/rest/conversions.rs(1 hunks)lib/api/src/rest/schema.rs(2 hunks)lib/collection/src/collection/mmr.rs(1 hunks)lib/collection/src/collection/query.rs(1 hunks)lib/collection/src/common/fetch_vectors.rs(2 hunks)lib/collection/src/common/retrieve_request_trait.rs(1 hunks)lib/collection/src/operations/universal_query/collection_query.rs(10 hunks)lib/collection/src/operations/universal_query/planned_query.rs(4 hunks)lib/collection/src/operations/universal_query/shard_query.rs(3 hunks)lib/collection/src/shards/local_shard/query.rs(2 hunks)src/common/inference/batch_processing.rs(1 hunks)src/common/inference/batch_processing_grpc.rs(1 hunks)src/common/inference/query_requests_grpc.rs(2 hunks)src/common/inference/query_requests_rest.rs(3 hunks)tools/generate_grpc_docs.sh(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
lib/api/src/grpc/validate.rs (2)
Learnt from: generall
PR: qdrant/qdrant#6274
File: lib/segment/src/vector_storage/sparse/mmap_sparse_vector_storage.rs:233-233
Timestamp: 2025-03-29T17:43:28.174Z
Learning: In the qdrant codebase, `debug_assert!` is preferred over runtime checks that would panic in release builds to ensure production stability, even when validating conditions like vector sorting.
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/index/hnsw_index/point_scorer.rs:116-121
Timestamp: 2025-04-22T23:17:41.734Z
Learning: The method `is_none_or` exists in the Qdrant codebase and compiles correctly, despite not being part of standard Rust's Option type. Code reviews should verify compilation issues before reporting them.
🧬 Code Graph Analysis (3)
lib/collection/src/common/retrieve_request_trait.rs (1)
lib/collection/src/operations/universal_query/collection_query.rs (2)
get_lookup_collection(426-428)get_lookup_collection(535-537)
src/common/inference/query_requests_grpc.rs (1)
src/common/inference/query_requests_rest.rs (1)
convert_vector_input_with_inferred(148-191)
lib/collection/src/operations/universal_query/collection_query.rs (1)
lib/collection/src/common/retrieve_request_trait.rs (1)
get_referenced_ids(135-139)
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md
194-194: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
195-195: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
196-196: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
197-197: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
198-198: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
199-199: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
200-200: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
201-201: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
202-202: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
3398-3398: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe
(MD055, table-pipe-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: test-consensus-compose
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-consistency
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: integration-tests
- GitHub Check: integration-tests-consensus
- GitHub Check: storage-compat-test
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: lint
🔇 Additional comments (38)
lib/api/src/grpc/proto/points_internal_service.proto (1)
275-283: 👍 Terminology fix is consistent with API docs
No further action required.lib/api/src/rest/conversions.rs (1)
146-150: Loss of future MMR data
QueryInterface::Nearestcan’t currently carry an MMR payload, so forcingmmr: Noneis fine.
If aNearestWithMmrvariant is added later, this arm will silently drop the data. Consider a separate enum variant or:QueryInterface::Nearest(nearest @ VectorInput { .., mmr }) => { Query::Nearest(NearestQuery { nearest, mmr }) }Not blocking, just a heads-up for forward compatibility.
lib/api/src/grpc/validate.rs (1)
342-342: LGTM! Validation pattern is consistent.The new match arm for
NearestWithMmrfollows the established validation pattern correctly.src/common/inference/batch_processing_grpc.rs (1)
165-171: LGTM! Proper handling of optional vector input.The implementation correctly handles the new
NearestWithMmrvariant by extracting the optionalnearestfield and processing it through the existing vector input collection logic. The use oftranspose()?properly handles the error propagation.lib/collection/src/common/retrieve_request_trait.rs (1)
108-115: LGTM! Refactoring improves efficiency.The simplification to directly clone
self.referenced_idsinstead of reconstructing them from the vector query is a good optimization. This suggests the referenced IDs are now pre-computed and stored, which is more efficient.lib/collection/src/shards/local_shard/query.rs (2)
427-427: LGTM! Correct terminology used.The change from "Maximum" to "Maximal" Marginal Relevance is correct terminology.
471-472: Confirm intentional change: MMR results useretainlike in collection/query.rsThe change in lib/collection/src/shards/local_shard/query.rs now mirrors the pattern in lib/collection/src/collection/query.rs, where:
- Fused results (which are already sorted) use
.take_while(|p| p.score >= score_threshold)- MMR results use
.retain(|p| p.score >= score_threshold)This ensures we only assume sorted input for the fused branch and apply a full‐scan filter for MMR rescoring. No further action needed.
lib/api/build.rs (1)
293-296: LGTM! Appropriate validation constraints for MMR parameters.The validation rules are well-designed:
lambdarange [0.0, 1.0] is correct for MMR's relevance/diversity trade-off parametercandidate_limitmax of 16,384 provides reasonable resource protection- Empty constraints for complex fields correctly defer to nested validation
lib/collection/src/operations/universal_query/shard_query.rs (2)
69-69: Documentation terminology updated correctly.Good change to use "Maximal" instead of "Maximum" for consistency with standard MMR terminology.
123-124: MMR intermediate-results logic – validated
ReturningfalseforScoringQuery::Mmr(_)inneeds_intermediate_results()is correct. The plan incollection_query.rs(lines 387–392) first gathers the nearest‐neighbor lists across all shards, then performs a singlemmr_rescoreon the aggregated results. No intermediate‐results merging per shard is needed.lib/api/src/rest/schema.rs (2)
478-484: MMR integration into NearestQuery looks good.The optional
mmrfield maintains backward compatibility while enabling MMR reranking functionality. The documentation clearly explains the purpose.
537-556: Mmr struct validation is well-designed.The validation constraints are appropriate:
lambdarange [0,1] matches MMR algorithm requirementscandidate_limitmaximum of 16,384 prevents expensive queries- Optional fields with defaults provide good UX
The documentation clearly explains the parameters' effects on relevance vs diversity trade-off.
src/common/inference/query_requests_grpc.rs (2)
8-9: Import additions are correct.The new imports
MmrandNearestWithMmrare properly added to support the MMR functionality.
295-311: gRPC MMR conversion implementation is correct.The implementation properly:
- Validates required fields with clear error messages
- Reuses existing vector conversion logic
- Safely converts
candidate_limitfromu32tousize- Creates the appropriate
NearestWithMmrinternal variantThe error handling is consistent with other variants in the same function.
src/common/inference/query_requests_rest.rs (2)
5-5: Import addition is correct.Adding the
Mmrimport to support the new MMR functionality.
449-449: Test update is appropriate.The test correctly includes the new
mmr: Nonefield to match the updatedNearestQuerystruct schema.lib/collection/src/common/fetch_vectors.rs (3)
411-413: LGTM: Function signature simplifiedThe removal of lifetime annotations from the function signature is a good simplification that makes the function easier to work with by using owned data instead of references.
417-421: LGTM: Improved query handling generalizationThe change from specifically checking for
Query::Vectorvariant to using the genericQuery::get_referenced_idsmethod is a good design improvement that:
- Makes the code more extensible to support new query variants (like MMR)
- Follows the principle of polymorphism by using a common interface
- Reduces coupling between this function and specific query types
425-425: LGTM: Consistent with owned data patternThe change to use owned
referenced_idsinCollectionQueryResolveRequestis consistent with the function signature changes and aligns with the overall refactoring to use owned data instead of references.lib/collection/src/collection/query.rs (1)
408-412: LGTM: Efficient in-place filtering optimizationThe refactoring from conditional vector reconstruction to using
retainis a good performance optimization that:
- Avoids unnecessary memory allocations by filtering in-place
- Maintains the same filtering semantics (removing points below score threshold)
- Follows idiomatic Rust patterns for efficient vector filtering
- Improves code readability by eliminating conditional logic
This change aligns well with the overall MMR integration while optimizing performance.
lib/api/src/grpc/proto/points.proto (2)
652-664: Fill the missing tag #1 and reserve it to avoid future collisionsThe
Mmrmessage jumps straight to tags 2 and 3, leaving tag 1 unused and un-reserved:optional float lambda = 2; optional uint32 candidate_limit = 3;Even though Protobuf allows gaps, forgetting to reserve the gap opens a foot-gun: someone can later reuse
=1, changing the wire format.
Add an explicit reservation (or re-number before the message ships):message Mmr { - optional float lambda = 2; - optional uint32 candidate_limit = 3; + // tag 1 intentionally unused – reserved for backward compatibility + reserved 1; + + optional float lambda = 2; // 0.0–1.0 + optional uint32 candidate_limit = 3; // defaults to `limit` }Would you like a follow-up patch across all generated
.rscode to reflect the new reservation?
676-677: Confirm back-compatibility of newQuery.variantvalue = 9Adding
nearest_with_mmr = 9looks good (no collision).
Please double-check that every Protobuf consumer (REST ↔ gRPC bridges, SDKs, etc.) handles the unknown variant gracefully to avoid silent failures when older clients parse the enum.lib/api/src/grpc/qdrant.rs (5)
5580-5590: LGTM! Well-structured MMR input definition.The
NearestInputWithMmrstruct is properly designed with appropriate validation annotations and clear documentation. The combination of nearest vector input with optional MMR parameters provides the expected flexibility for the feature.
5601-5609: LGTM! Appropriate validation constraints for MMR parameters.The validation ranges are well-chosen:
- Lambda range [0.0, 1.0] correctly constrains the relevance-diversity trade-off parameter
- Candidate limit max of 16,384 provides reasonable upper bounds while preventing excessive resource usage
5650-5652: LGTM! Proper variant addition with consistent naming.The new
NearestWithMmrvariant is correctly added with:
- Proper protobuf tag (9)
- Consistent naming convention
- Correct reference to the
NearestInputWithMmrstruct- Accurate documentation using "Maximal Marginal Relevance"
10162-10162: LGTM! Consistent terminology update.The comment correctly uses "Maximal Marginal Relevance" instead of "Maximum Marginal Relevance", aligning with the PR's objective to standardize the terminology.
5616-5616: Protobuf tag consistency confirmed for NearestWithMmrThe
#[prost(oneof…, tags = "1, 2, 3, 4, 5, 6, 7, 8, 9")]list correctly includes tag 9, and theNearestWithMmrvariant is defined with#[prost(message, tag = "9")]. No changes are needed.docs/grpc/docs.md (1)
4004-4005: Good addition – new query variant clearly documentedThe new field is correctly marked as optional and its purpose is clear. 👍
lib/collection/src/operations/universal_query/collection_query.rs (5)
26-26: LGTM!The default lambda value of 0.5 is a reasonable choice for MMR, providing a balanced trade-off between relevance and diversity.
61-63: Good refactoring for better ownership semantics.Storing owned
referenced_idsinstead of query references improves memory management and avoids potential lifetime issues.
189-192: Good API design with optional fields.Making
lambdaandcandidate_limitoptional allows for sensible defaults while giving users flexibility to override when needed.
387-392: Sensible default values for MMR parameters.Using
request_limitas the defaultcandidate_limitensures sufficient candidates for the final result, while the lambda default of 0.5 provides balanced relevance/diversity trade-off.
509-515: Clean refactoring using the new helper method.Good use of
get_referenced_ids()method, making the code more maintainable and consistent.lib/collection/src/operations/universal_query/planned_query.rs (5)
89-94: Good practice: Early validation with clear error message.Moving the depth check to the beginning follows the fail-fast principle and provides immediate feedback for invalid inputs.
110-119: Correct handling ofwith_vectorfor MMR queries.The code properly merges the
with_vectorflag with MMR'susingfield to ensure vectors are included when needed.
121-147: Excellent refactoring for improved readability.Splitting the logic into
root_plan_without_prefetchesandroot_plan_with_prefetchesmakes the code much clearer and easier to maintain.
202-219: Correct MMR implementation pattern.The code properly converts MMR queries to nearest neighbor searches for candidate selection, which aligns with the standard MMR algorithm flow. The comment helpfully explains that final MMR computation happens at the collection level.
306-416: Well-structured helper function with comprehensive query type handling.The
leaf_source_from_scoring_queryfunction provides a clean abstraction for creating leaf sources from different query types. The MMR handling correctly creates a nearest neighbor search with the appropriate candidate limit.
| "mmr": { | ||
| "description": "Perform MMR (Maximal Marginal Relevance) reranking after search, using the same vector in this query to calculate relevance.", | ||
| "anyOf": [ | ||
| { | ||
| "$ref": "#/components/schemas/Mmr" | ||
| }, | ||
| { | ||
| "nullable": true | ||
| } | ||
| ] | ||
| } |
There was a problem hiding this comment.
mmr definition risks OpenAPI-schema validation errors
Using
"anyOf": [
{ "$ref": "#/components/schemas/Mmr" },
{ "nullable": true }
]is non-standard:
{ "nullable": true }is not a valid schema on its own – it lacks atype.- The typical pattern (OAS 3.0.x) is either
or (OAS 3.1)
mmr: $ref: '#/components/schemas/Mmr' nullable: true # ← sibling of $ref
oneOf: - $ref: '#/components/schemas/Mmr' - type: 'null'
- Several generators (e.g.
openapi-generator,swagger-codegen) will fail on the current construct.
Please switch to one of the two canonical forms above, depending on the targeted OAS version.
🤖 Prompt for AI Agents
In docs/redoc/master/openapi.json around lines 14653 to 14663, the "mmr" schema
uses an invalid "anyOf" with a nullable object lacking a type, which risks
OpenAPI validation errors. To fix this, replace the "anyOf" array with either a
direct $ref and a sibling "nullable: true" property if targeting OpenAPI 3.0.x,
or use "oneOf" with the $ref and a "type: 'null'" schema if targeting OpenAPI
3.1. This change aligns with standard OpenAPI schema patterns and ensures
compatibility with code generators.
docs/redoc/master/openapi.json
Outdated
| "candidate_limit": { | ||
| "description": "The maximum number of candidates to consider for re-ranking.\n\nIf not specified, the `limit` value is used.", | ||
| "type": "integer", | ||
| "format": "uint", | ||
| "maximum": 16384, | ||
| "minimum": 0, |
There was a problem hiding this comment.
format: "uint" is not part of the OpenAPI spec
OpenAPI officially recognises int32, int64, float, and double.
A custom uint format will confuse tooling and code generators.
- "format": "uint",
+ "format": "int64", # or omit `format` entirelyIf unsigned semantics are important, document it in the description and enforce it with minimum: 0.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In docs/redoc/master/openapi.json around lines 14678 to 14683, the
"candidate_limit" property uses a non-standard format "uint" which is not
recognized by OpenAPI and may cause issues with tooling. Remove the "format":
"uint" line entirely, keep the "type": "integer" and "minimum": 0 to enforce
unsigned semantics, and update the description to clearly state that the value
must be non-negative.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
docs/grpc/docs.md (2)
194-202: Fix unordered-list indentation for new TOC entriesThe inserted
MmrandNearestInputWithMmrbullet points are indented 4 spaces instead of the 2-space level used by the rest of the list, trippingmarkdownlint(MD007).- - [Mmr](#qdrant-Mmr) - - [NearestInputWithMmr](#qdrant-NearestInputWithMmr) + - [Mmr](#qdrant-Mmr) + - [NearestInputWithMmr](#qdrant-NearestInputWithMmr)
3519-3528: Clarify optionality & semantics ofmmrfieldThe prose states “Perform MMR …”, which reads like a mandatory action, yet the proto defines
mmras an optional message (sonearest_with_mmrwithout anmmrblock would still compile). Suggest wording:-| mmr | [Mmr](#qdrant-Mmr) | | Perform MMR (Maximal Marginal Relevance) reranking after search, using the same vector in this query to calculate relevance. | +| mmr | [Mmr](#qdrant-Mmr) | optional | If supplied, reranks the `nearest` results with Maximal Marginal Relevance using this query vector for relevance computation. |This also aligns the description column with other optional fields.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
docs/grpc/docs.md(4 hunks)docs/redoc/master/openapi.json(1 hunks)lib/api/build.rs(1 hunks)lib/api/src/grpc/proto/points.proto(2 hunks)lib/api/src/grpc/qdrant.rs(3 hunks)lib/api/src/rest/schema.rs(2 hunks)lib/collection/src/operations/universal_query/collection_query.rs(10 hunks)src/common/inference/query_requests_grpc.rs(2 hunks)src/common/inference/query_requests_rest.rs(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- lib/api/build.rs
- src/common/inference/query_requests_grpc.rs
- src/common/inference/query_requests_rest.rs
- lib/api/src/rest/schema.rs
- lib/api/src/grpc/proto/points.proto
- lib/api/src/grpc/qdrant.rs
- lib/collection/src/operations/universal_query/collection_query.rs
- docs/redoc/master/openapi.json
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md
194-194: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
195-195: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
196-196: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
197-197: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
198-198: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
199-199: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
200-200: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
201-201: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
202-202: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
3397-3397: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe
(MD055, table-pipe-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: test-consistency
- GitHub Check: integration-tests-consensus
- GitHub Check: integration-tests
- GitHub Check: storage-compat-test
- GitHub Check: lint
🔇 Additional comments (1)
docs/grpc/docs.md (1)
4008-4009: Verify mutual-exclusion logic among query variants
nearest_with_mmrco-exists alongsidenearest,recommend, etc. At the API level these search variants appear to be mutually exclusive, but the proto does not enforceoneof. Ensure the backend rejects payloads that set bothnearestandnearest_with_mmr, or—better—move the related fields into aoneofto obtain compile-time safety.Please confirm the planner conversion added in the Rust code returns a clear validation error for such mixed queries.
timvisee
left a comment
There was a problem hiding this comment.
LGTM!
Can you also add a simple integration test?
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/openapi/test_query.py (1)
485-544: Consider adding more comprehensive MMR test coverageThe current test validates basic MMR functionality but could benefit from additional test cases:
- Testing different diversity values (0.0, 1.0) to validate parameter boundaries
- Testing different candidate_limit values to ensure the parameter works correctly
- Testing error cases (invalid diversity values, negative candidate_limit)
- Testing with payload/vector retrieval to ensure MMR works with full response data
Consider adding additional test cases:
def test_mmr_parameters_validation(collection_name): # Test boundary values for diversity for diversity in [0.0, 1.0]: response = request_with_validation( api="/collections/{collection_name}/points/query", method="POST", path_params={"collection_name": collection_name}, body={ "query": { "nearest": [0.35, 0.08, 0.11, 0.47], "mmr": {"diversity": diversity} }, }, ) assert response.ok, f"Diversity {diversity} should be valid" # Test different candidate_limit values for candidate_limit in [5, 20, 100]: response = request_with_validation( api="/collections/{collection_name}/points/query", method="POST", path_params={"collection_name": collection_name}, body={ "query": { "nearest": [0.35, 0.08, 0.11, 0.47], "mmr": {"candidate_limit": candidate_limit} }, }, ) assert response.ok, f"Candidate limit {candidate_limit} should be valid"
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
lib/collection/src/collection/mmr.rs(14 hunks)lib/collection/src/collection/query.rs(2 hunks)lib/collection/src/shards/local_shard/query.rs(2 hunks)tests/openapi/test_query.py(8 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- lib/collection/src/collection/query.rs
- lib/collection/src/shards/local_shard/query.rs
- lib/collection/src/collection/mmr.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/openapi/test_query.py (2)
tests/openapi/helpers/helpers.py (3)
reciprocal_rank_fusion(120-145)distribution_based_score_fusion(147-174)request_with_validation(41-116)tests/openapi/test_query_full.py (1)
get_id(996-997)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-consistency
- GitHub Check: integration-tests-consensus
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests
- GitHub Check: storage-compat-test
- GitHub Check: lint
🔇 Additional comments (3)
tests/openapi/test_query.py (3)
397-397: LGTM: Whitespace cleanupThe whitespace changes improve code readability and consistency.
Also applies to: 409-409, 411-411, 426-426, 429-429, 435-435, 449-449, 461-461, 463-463, 478-478, 484-484
522-526: LGTM: Payload and vector validationGood validation that ensures the response structure is correct when payload and vector data are not requested.
527-544: LGTM: Default parameters validationGood test coverage for ensuring default MMR parameters work correctly and produce consistent results.
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
docs/grpc/docs.md (1)
3390-3405: Table row forMmrfields is split across multiple lines – breaks GitHub rendering
diversityandcandidates_limitdescriptions are wrapped onto new lines, so the parser thinks each continuation line is a brand-new table row → the whole table is malformed (MD055).-| diversity | [float](#float) | optional | Tunable parameter for the MMR algorithm. Determines the balance between diversity and relevance. - -A higher value favors diversity (dissimilarity to selected results), while a lower value favors relevance (similarity to the query vector). - -Must be in the range [0, 1]. Default value is 0.5. | -| candidates_limit | [uint32](#uint32) | optional | The maximum number of candidates to consider for re-ranking. - -If not specified, the `limit` value is used. | +| diversity | [float](#float) | optional | Tunable parameter ∈ [0, 1] (default 0.5). Higher → diversity, lower → relevance. | +| candidates_limit | [uint32](#uint32) | optional | Max neighbours considered before re-ranking (defaults to outer `limit`). |
🧹 Nitpick comments (2)
docs/grpc/docs.md (2)
194-202: Fix list-indentation to silencemarkdownlint(MD007)
The two newly-added bullet points are indented 4 spaces instead of the expected 2, tripping the linter.- - [Mmr](#qdrant-Mmr) - - [NearestInputWithMmr](#qdrant-NearestInputWithMmr) + - [Mmr](#qdrant-Mmr) + - [NearestInputWithMmr](#qdrant-NearestInputWithMmr)
3524-3528: Markmmrfield asoptionalfor consistency
InNearestInputWithMmr,mmris in fact optional in the protobuf – document it as such so the table stays truthful.-| mmr | [Mmr](#qdrant-Mmr) | | Perform MMR (Maximal Marginal Relevance) reranking after search, using the same vector in this query to calculate relevance. | +| mmr | [Mmr](#qdrant-Mmr) | optional | Perform MMR (Maximal Marginal Relevance) re-ranking after the nearest-neighbour search. |
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
docs/grpc/docs.md(4 hunks)docs/redoc/master/openapi.json(1 hunks)lib/api/src/grpc/proto/points.proto(2 hunks)lib/api/src/grpc/qdrant.rs(3 hunks)lib/api/src/rest/schema.rs(2 hunks)src/common/inference/query_requests_grpc.rs(2 hunks)src/common/inference/query_requests_rest.rs(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- src/common/inference/query_requests_rest.rs
- src/common/inference/query_requests_grpc.rs
- lib/api/src/grpc/proto/points.proto
- lib/api/src/grpc/qdrant.rs
- lib/api/src/rest/schema.rs
- docs/redoc/master/openapi.json
🧰 Additional context used
🪛 markdownlint-cli2 (0.17.2)
docs/grpc/docs.md
194-194: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
195-195: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
196-196: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
197-197: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
198-198: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
199-199: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
200-200: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
201-201: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
202-202: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
3397-3397: Table pipe style
Expected: leading_and_trailing; Actual: leading_only; Missing trailing pipe
(MD055, table-pipe-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: test-shard-snapshot-api-s3-minio
- GitHub Check: test-snapshot-operations-s3-minio
- GitHub Check: test-low-resources
- GitHub Check: Basic TLS/HTTPS tests
- GitHub Check: test-consistency
- GitHub Check: test-consensus-compose
- GitHub Check: integration-tests
- GitHub Check: rust-tests (macos-latest)
- GitHub Check: integration-tests-consensus
- GitHub Check: rust-tests (ubuntu-latest)
- GitHub Check: rust-tests (windows-latest)
- GitHub Check: lint
- GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
- GitHub Check: storage-compat-test
🔇 Additional comments (1)
docs/grpc/docs.md (1)
4008-4009: LGTM – newnearest_with_mmrquery variant is documented clearly
Field is named consistently with the message definition and has a concise description. No action required.
* expose mmr in rest and grpc * fix referenced vectors in mmr queries * rename to MmrInput in grpc for consistency * mmr is now an optional parameter of nearest query * fix score_threshold at local shard * fix conversion of rest -> collection * fix and refactor planned query * nits * rename `lambda` to `diversity` in interface * handle score_threshold within mmr calculation * add openapi test * fix candidate limit at collection level * candidate_limit -> candidates_limit * finish rename to candidates_limit --------- Co-authored-by: generall <andrey@vasnetsov.com>
Builds on top of #6786
Exposes MMR reranking as a query variant in REST and gRPC interfaces:
Looks like this:
This PR also includes a very needed refactor to the way we convert from
ShardQueryRequesttoPlannedQuery. This was done here since I was already fixing a bug introduced during latest MMR interface refactoring 🐛 🔨 .