[MMR] resolve at collection level#6786

Merged
coszio merged 9 commits into dev from mmr-in-collection on Jul 7, 2025

Conversation


@coszio coszio commented Jul 1, 2025

Builds on top of #6768

  • Adds MMR as a scoring query in ShardQueryRequest and CollectionQueryRequest.
  • Plans the query to include the MMR vector if MMR is the top query, so that it gets resolved at the collection level without re-fetching
  • Handles MMR at collection level
    • MMR vector gets removed afterwards if it wasn't requested

MMR is still not exposed at this point.
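For context, Maximum Marginal Relevance re-ranks a candidate set by trading relevance against diversity with a lambda parameter. A minimal sketch of the greedy selection loop follows; all names are illustrative, not the PR's actual code:

```rust
/// Greedy MMR selection: repeatedly pick the candidate maximizing
/// `lambda * relevance - (1 - lambda) * max_similarity_to_selected`.
/// `sim` is any similarity function over vectors (e.g. a dot product).
fn mmr_select(
    candidates: &[(Vec<f32>, f32)], // (vector, relevance score)
    lambda: f32,
    limit: usize,
    sim: impl Fn(&[f32], &[f32]) -> f32,
) -> Vec<usize> {
    let mut selected: Vec<usize> = Vec::new();
    let mut remaining: Vec<usize> = (0..candidates.len()).collect();

    while selected.len() < limit && !remaining.is_empty() {
        // Score each remaining candidate against the already-selected set.
        let (pos, _) = remaining
            .iter()
            .enumerate()
            .map(|(pos, &i)| {
                let max_sim = selected
                    .iter()
                    .map(|&j| sim(&candidates[i].0, &candidates[j].0))
                    .fold(f32::NEG_INFINITY, f32::max);
                // With nothing selected yet, fall back to pure relevance.
                let diversity = if selected.is_empty() { 0.0 } else { max_sim };
                (pos, lambda * candidates[i].1 - (1.0 - lambda) * diversity)
            })
            .max_by(|a, b| a.1.total_cmp(&b.1))
            .unwrap();
        selected.push(remaining.swap_remove(pos));
    }
    selected
}
```

With lambda near 1.0 this degenerates to plain nearest-neighbor ordering; lower values penalize candidates similar to what was already picked.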

@coszio coszio requested review from generall and timvisee July 1, 2025 21:00
Base automatically changed from mmr to dev July 7, 2025 20:51
@coszio coszio force-pushed the mmr-in-collection branch from 988089e to 94248b1 Compare July 7, 2025 20:53

coderabbitai bot commented Jul 7, 2025

📝 Walkthrough

This change introduces comprehensive support for Maximum Marginal Relevance (MMR) scoring across the codebase. It adds a new MmrInternal message and related fields to the gRPC protocol, updates internal query representations to include MMR parameters, and extends query processing pipelines to handle MMR queries. The collection and shard query logic is updated to support asynchronous MMR result merging, rescoring, and candidate limit handling. New query variants and structures are introduced for MMR in the universal query modules, with corresponding updates to access checks and vector extraction utilities. Test cases are updated to include the new candidate limit parameter for MMR queries.

Possibly related PRs

  • [MMR] MMR internals #6768: Implements the core internal MMR algorithm and utilities, which the current PR integrates and builds upon to provide system-wide MMR support.

Suggested reviewers

  • timvisee
  • generall

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
lib/api/src/grpc/proto/points_internal_service.proto (1)

267-271: Consider widening candidate_limit to uint64 and document parameter boundaries
limit fields elsewhere in the proto (CoreSearchPoints.limit, QueryShardPoints.limit, …) are uint64. Using uint32 here silently caps the upper bound at 4,294,967,295 and introduces an inconsistent public contract. Unless there is a hard implementation cap, switch to uint64 for symmetry, or add an explicit comment clarifying the intentional 32-bit restriction.
Also, add a brief note that lambda must be in [0.0, 1.0] to avoid undefined behaviour.

lib/api/src/grpc/qdrant.rs (1)

10058-10067: LGTM! Well-structured protobuf message for MMR parameters.

The MmrInternal struct is properly defined with appropriate field types and protobuf tags. The optional vector field, lambda parameter for diversity/relevance trade-off, and candidate_limit for performance tuning are all sensible for MMR implementation.

Consider adding validation for the lambda parameter (typically should be in [0.0, 1.0] range) and candidate_limit (should be > 0) at the application layer if not already handled elsewhere.
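A lightweight application-layer check along the lines suggested above could look like this; the struct is a hypothetical stand-in mirroring the MmrInternal fields discussed, not the generated type:

```rust
/// Illustrative MMR parameters, mirroring the fields under review.
struct MmrParams {
    lambda: f32,
    candidate_limit: u32,
}

/// Validate MMR parameters before executing the query.
/// Returns a human-readable error message on failure.
fn validate_mmr(params: &MmrParams) -> Result<(), String> {
    // Rejects NaN as well, since `contains` is false for NaN.
    if !(0.0..=1.0).contains(&params.lambda) {
        return Err(format!(
            "lambda must be in [0.0, 1.0], got {}",
            params.lambda
        ));
    }
    if params.candidate_limit == 0 {
        return Err("candidate_limit must be greater than 0".to_string());
    }
    Ok(())
}
```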

lib/collection/src/operations/universal_query/planned_query.rs (1)

87-87: Enhance the TODO comment to be more specific.

The current TODO is too vague. Consider being more specific about what needs simplification.

-    // ToDo: This function needs serious simplification: split and refactor
+    // TODO: This function is complex due to multiple query types and prefetch handling. Consider:
+    // 1. Extract query type handling into separate methods
+    // 2. Separate prefetch logic from main query logic
+    // 3. Create a query builder pattern for different query types
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 10de8cc and 94248b1.

📒 Files selected for processing (12)
  • lib/api/src/grpc/proto/points_internal_service.proto (2 hunks)
  • lib/api/src/grpc/qdrant.rs (3 hunks)
  • lib/collection/src/collection/mmr.rs (7 hunks)
  • lib/collection/src/collection/query.rs (5 hunks)
  • lib/collection/src/operations/query_enum.rs (1 hunks)
  • lib/collection/src/operations/universal_query/collection_query.rs (8 hunks)
  • lib/collection/src/operations/universal_query/planned_query.rs (6 hunks)
  • lib/collection/src/operations/universal_query/shard_query.rs (7 hunks)
  • lib/collection/src/shards/local_shard/query.rs (1 hunks)
  • lib/segment/src/data_types/vectors.rs (1 hunks)
  • lib/segment/src/types.rs (2 hunks)
  • lib/storage/src/rbac/ops_checks.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (4)
lib/segment/src/types.rs (1)

<retrieved_learning>
Learnt from: timvisee
PR: #6644
File: lib/gridstore/src/blob.rs:41-57
Timestamp: 2025-06-06T07:52:38.478Z
Learning: In the Blob trait implementations for gridstore (lib/gridstore/src/blob.rs), the maintainer prefers using assert! for input validation over Result-based error handling, as they don't expect invalid inputs in real scenarios and changing to Result would require updating the entire trait interface.
</retrieved_learning>

lib/collection/src/operations/universal_query/shard_query.rs (5)
Learnt from: xzfc
PR: qdrant/qdrant#6501
File: lib/segment/src/index/hnsw_index/links_container.rs:158-158
Timestamp: 2025-05-08T16:39:07.893Z
Learning: In the Qdrant codebase, allocation failures are generally not handled (allowed to panic) except in very few specific cases. Most `unwrap()` calls on allocation operations are intentional.
Learnt from: timvisee
PR: qdrant/qdrant#6546
File: lib/segment/src/index/field_index/full_text_index/text_index.rs:497-518
Timestamp: 2025-05-19T09:25:54.285Z
Learning: In the Qdrant project, using `.expect()` and `.unwrap()` is considered acceptable in test code since test failures should be loud and obvious, and proper error propagation adds unnecessary complexity to test code.
Learnt from: timvisee
PR: qdrant/qdrant#6644
File: lib/gridstore/src/blob.rs:41-57
Timestamp: 2025-06-06T07:52:38.478Z
Learning: In the Blob trait implementations for gridstore (lib/gridstore/src/blob.rs), the maintainer prefers using assert! for input validation over Result-based error handling, as they don't expect invalid inputs in real scenarios and changing to Result would require updating the entire trait interface.
Learnt from: IvanPleshkov
PR: qdrant/qdrant#6663
File: lib/quantization/src/encoded_vectors_binary.rs:270-276
Timestamp: 2025-06-18T09:23:14.977Z
Learning: Pattern matching on `&Option<T>` with `Some(x)` is valid and idiomatic in Rust. The bound variable `x` will have type `&T` and can be used to access fields through automatic dereferencing. No need to use `.as_ref()` in such cases.
Learnt from: generall
PR: qdrant/qdrant#6274
File: lib/segment/src/vector_storage/sparse/mmap_sparse_vector_storage.rs:233-233
Timestamp: 2025-03-29T17:43:28.174Z
Learning: In the qdrant codebase, `debug_assert!` is preferred over runtime checks that would panic in release builds to ensure production stability, even when validating conditions like vector sorting.
lib/collection/src/operations/universal_query/planned_query.rs (1)
Learnt from: coszio
PR: qdrant/qdrant#6565
File: lib/posting_list/src/iterator.rs:0-0
Timestamp: 2025-05-26T14:50:13.183Z
Learning: The `advance_until_greater_or_equal` method in `PostingIterator` in `lib/posting_list/src/iterator.rs` is intentionally designed with different semantics than `next()` and can output the same value multiple times. This is intended behavior for a search method that finds the first element with ID >= target_id.
lib/collection/src/collection/query.rs (2)
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In the AsyncRawScorerImpl implementation, the unwrap() call on read_vectors_async results is intentional, with an explanatory comment noting that this experimental feature is meant to crash rather than silently fall back to synchronous implementation.
Learnt from: xzfc
PR: qdrant/qdrant#6245
File: lib/segment/src/vector_storage/async_raw_scorer.rs:48-56
Timestamp: 2025-04-22T23:19:51.232Z
Learning: In AsyncRawScorerImpl, the unwrap() call after read_vectors_async is intentional. The io_uring feature is experimental, and the code is designed to panic rather than silently fall back to a synchronous implementation if it fails, directing users to use the default IO implementation instead.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: storage-compat-test
  • GitHub Check: lint
  • GitHub Check: test-shard-snapshot-api-s3-minio
  • GitHub Check: test-low-resources
  • GitHub Check: test-consistency
  • GitHub Check: test-snapshot-operations-s3-minio
  • GitHub Check: test-consensus-compose
  • GitHub Check: rust-tests (macos-latest)
  • GitHub Check: Basic TLS/HTTPS tests
  • GitHub Check: rust-tests (ubuntu-latest)
  • GitHub Check: rust-tests-no-rocksdb (ubuntu-latest)
  • GitHub Check: rust-tests (windows-latest)
  • GitHub Check: integration-tests
  • GitHub Check: integration-tests-consensus
🔇 Additional comments (21)
lib/collection/src/operations/query_enum.rs (1)

69-69: Good documentation improvement!

The added documentation comment clearly explains the purpose of the QueryEnum and will help developers understand its role in segment-level vector queries.

lib/storage/src/rbac/ops_checks.rs (2)

17-18: Correct import additions for MMR support.

The imports properly bring in the necessary types for handling the new NearestWithMmr variant.


154-156: Proper access control implementation for MMR queries.

The implementation correctly handles the new NearestWithMmr variant by:

  • Extracting the underlying nearest query for validation
  • Ignoring MMR parameters (which don't affect access control)
  • Following the same pattern as other query types

This ensures MMR queries are subject to the same access control rules as regular nearest neighbor queries.

lib/collection/src/collection/mmr.rs (1)

512-512: Consistent test updates for new candidate_limit parameter.

All test cases have been properly updated to include the candidate_limit: 100 parameter in MmrInternal struct initializations. The value 100 is appropriate for test scenarios and maintains consistency across all test functions.

Also applies to: 556-556, 614-614, 655-655, 694-694, 757-757, 804-804

lib/segment/src/types.rs (2)

17-17: LGTM: Import addition supports the new merge functionality.

The itertools::Itertools import is appropriately added to support the unique() method used in the new WithVector::merge implementation.


3004-3023: Well-implemented additive merge logic for WithVector enum.

The merge method correctly implements additive merging semantics:

  1. Boolean precedence: Bool(true) takes precedence over any other variant, which is correct for "include all vectors" semantics
  2. Selector combination: When merging two selectors, it properly chains them, removes duplicates with unique(), and collects into a new vector
  3. Mixed cases: Appropriately handles mixed scenarios by preferring selectors over Bool(false)
  4. Complete coverage: All possible enum variant combinations are handled

The implementation aligns well with the MMR functionality described in the AI summary, where vector selection flags from different query stages need to be merged during query planning.
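The merging semantics described above can be sketched standalone. This is a simplified stand-in for the collection's actual WithVector type: it sorts before deduplicating, whereas the real implementation uses itertools' order-preserving unique():

```rust
/// Simplified stand-in for the collection's `WithVector` selector.
#[derive(Debug, Clone, PartialEq)]
enum WithVector {
    Bool(bool),
    Selector(Vec<String>),
}

impl WithVector {
    /// Additive merge: `Bool(true)` wins over anything, two selectors
    /// are unioned, and a selector wins over `Bool(false)`.
    fn merge(self, other: WithVector) -> WithVector {
        use WithVector::*;
        match (self, other) {
            (Bool(true), _) | (_, Bool(true)) => Bool(true),
            (Selector(mut a), Selector(b)) => {
                // Union while dropping duplicates (sorted, unlike `unique()`).
                a.extend(b);
                a.sort();
                a.dedup();
                Selector(a)
            }
            (Selector(a), Bool(false)) | (Bool(false), Selector(a)) => Selector(a),
            (Bool(false), Bool(false)) => Bool(false),
        }
    }
}
```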

lib/api/src/grpc/proto/points_internal_service.proto (1)

275-282: Verify downstream handling of the new mmr score variant
Adding MmrInternal mmr = 6 extends the oneof score. Double-check that every match arm / match / switch on score in query-planning, shard execution, and collection merging now covers the Mmr case to avoid unreachable-branch panics.

lib/api/src/grpc/qdrant.rs (2)

10100-10101: Proper extension of protobuf oneof tags.

The addition of tag "6" to the existing oneof tags is correctly implemented, maintaining backward compatibility while extending the query scoring options.


10124-10126: MMR scoring variant correctly integrated.

The new Mmr variant is properly added to the Score enum with the correct tag (6) and references the MmrInternal struct appropriately. The integration follows the established pattern of other scoring variants.

lib/segment/src/data_types/vectors.rs (1)

524-542: No changes needed: Single/MultiDense only apply to the default vector name

Verified across the codebase that VectorStructInternal::Single and MultiDense are only ever used with DEFAULT_VECTOR_NAME, so consuming the entire Option and returning None for non-default names cannot occur in practice. This behavior is consistent with other usages (e.g., in point_ops.rs and operations/types.rs), and no preservation logic is required.

lib/collection/src/shards/local_shard/query.rs (2)

386-398: LGTM!

The MMR rescoring integration is clean and follows the existing pattern for other scoring types.


428-477: Well-structured MMR rescoring implementation.

The method properly handles timeout management, vector retrieval, and score threshold filtering. Good implementation!

lib/collection/src/operations/universal_query/planned_query.rs (2)

117-148: Excellent MMR query planning implementation.

The approach of converting MMR queries to nearest neighbor queries at the shard level while deferring actual MMR calculation to the collection level is well-designed. The vector merging logic ensures proper vector retrieval.


398-421: Consistent MMR handling in prefetch recursion.

Good to see the same MMR handling pattern applied here as in the main query planning logic.

lib/collection/src/operations/universal_query/shard_query.rs (3)

69-80: Well-documented MMR configuration struct.

The MmrInternal struct is properly designed with clear documentation for each field. The lambda parameter documentation is particularly helpful in explaining the diversity vs relevance trade-off.


123-126: Correct identification of MMR needing intermediate results.

MMR indeed requires intermediate results from all shards to properly balance relevance and diversity across the entire result set.


596-611: Proper gRPC to internal MMR conversion.

The conversion correctly validates the required vector field and properly converts all MMR parameters.

lib/collection/src/operations/universal_query/collection_query.rs (2)

168-179: Clean MMR query structure design.

The separation of the nearest vector query and MMR parameters into distinct structs provides good modularity and clarity.


363-376: Correct MMR query conversion.

The conversion properly extracts all MMR parameters and creates the appropriate ScoringQuery::Mmr variant.

lib/collection/src/collection/query.rs (2)

11-11: LGTM! Necessary imports for MMR functionality.

The imports for VectorStructInternal and mmr_from_points_with_vector are required for the new MMR scoring implementation.

Also applies to: 18-18


334-431: Well-structured refactoring to support async MMR processing.

The refactoring properly transforms the method to async and cleanly integrates MMR support. The implementation correctly:

  • Handles vector stripping based on the with_vector configuration
  • Applies score thresholds appropriately
  • Defers offset/limit application until after fusion/MMR processing

Note: The score threshold handling differs between fusion (using take_while) and MMR (using filter). This appears intentional since fusion results are pre-sorted by score while MMR results may not maintain score ordering.
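The distinction called out in the note, take_while on pre-sorted fusion scores versus filter on unordered MMR scores, can be illustrated in isolation (illustrative data, not the PR's types):

```rust
/// Apply a score threshold to results sorted descending by score: once a
/// score drops below the threshold, everything after it can be skipped.
fn threshold_sorted(scores: &[f32], min: f32) -> Vec<f32> {
    scores.iter().copied().take_while(|&s| s >= min).collect()
}

/// Apply a score threshold to results in arbitrary order (e.g. after MMR
/// re-ranking): every element must be checked individually.
fn threshold_unsorted(scores: &[f32], min: f32) -> Vec<f32> {
    scores.iter().copied().filter(|&s| s >= min).collect()
}
```

On sorted input the two agree; on MMR output, take_while would stop at the first sub-threshold score and wrongly drop later high-scoring points.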

@coszio coszio merged commit 137d71a into dev Jul 7, 2025
18 checks passed
@coszio coszio deleted the mmr-in-collection branch July 7, 2025 21:18
generall added a commit that referenced this pull request Jul 17, 2025
* add mmr as a ScoringQuery

* add mmr to CollectionQueryRequest

* handle mmr after collecting from shards

* use unwrap_or_else

* include mmr as a vector query

* ordering for MMR is None

* fix score threshold

* fmt + todo comment

* remove unused function

---------

Co-authored-by: generall <andrey@vasnetsov.com>
@coderabbitai coderabbitai bot mentioned this pull request Sep 6, 2025