Restructure usage examples#3385

Merged: Samoed merged 8 commits into v2.0.0 from two_stage_rerank on Oct 17, 2025

@Samoed Samoed commented Oct 16, 2025

Member
  1. Moved Cached embeddings and Two stage reranking to the advanced usage section
  2. Added Python/CLI alternatives to some code blocks
  3. Deleted the late-interaction example: it shouldn't work, because late interaction uses its own search interface

@Samoed Samoed added the documentation (Improvements or additions to documentation) and v2 labels Oct 16, 2025
@Samoed Samoed linked an issue Oct 16, 2025 that may be closed by this pull request

Copilot AI left a comment

Contributor

Pull Request Overview

This PR adds documentation for implementing two-stage reranking in MTEB, explaining how to combine initial retrieval with subsequent reranking for improved search quality.

Key Changes:

  • New documentation file explaining the two-stage reranking approach
  • Code examples demonstrating the workflow: initial retrieval with an encoder model followed by reranking with a cross-encoder model
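
For readers unfamiliar with the pattern, the two-stage workflow summarized above can be sketched in plain Python. This is not the MTEB API: every function name and the toy data below are hypothetical, the hand-written vectors stand in for encoder embeddings, and the word-overlap scorer only stands in for a real cross-encoder.

```python
from typing import Callable, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product, used here as the first-stage similarity."""
    return sum(x * y for x, y in zip(a, b))


def first_stage(
    query_vec: Sequence[float],
    doc_vecs: dict[str, Sequence[float]],
    top_k: int,
) -> list[str]:
    """Stage 1: rank all documents by embedding similarity, keep the top_k ids."""
    ranked = sorted(doc_vecs, key=lambda d: dot(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:top_k]


def rerank(
    query: str,
    candidates: list[str],
    docs: dict[str, str],
    pair_score: Callable[[str, str], float],
) -> list[str]:
    """Stage 2: re-score only the candidates with a (query, document) pair scorer."""
    return sorted(candidates, key=lambda d: pair_score(query, docs[d]), reverse=True)


def overlap_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: fraction of query words in the doc."""
    q = set(query.lower().split())
    return len(q & set(doc.lower().split())) / len(q)


# Toy corpus with made-up 2-d "embeddings" (assumption: a real encoder would produce these).
docs = {
    "d1": "cats sleep all day",
    "d2": "how to train a dog",
    "d3": "dog training tips for a new puppy",
}
doc_vecs = {"d1": [0.1, 0.9], "d2": [0.9, 0.2], "d3": [0.8, 0.1]}

query = "puppy training tips"
query_vec = [1.0, 0.0]

candidates = first_stage(query_vec, doc_vecs, top_k=2)
reranked = rerank(query, candidates, docs, overlap_score)
print(candidates, reranked)
```

The point of the split is cost: the cheap similarity runs over the whole corpus, while the more expensive pair scorer only sees the `top_k` candidates and may reorder them.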

Comment thread docs/usage/two_stage_reranking.md Outdated
Comment thread docs/usage/two_stage_reranking.md Outdated
Samoed and others added 3 commits October 16, 2025 13:56
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Added comment for clarity on model usage.
Comment thread docs/usage/two_stage_reranking.md Outdated
Comment thread docs/usage/two_stage_reranking.md Outdated
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

@KennethEnevoldsen KennethEnevoldsen left a comment

Contributor

Isn't this basically what is already here:

https://embeddings-benchmark.github.io/mteb/usage/running_the_evaluation/#running-cross-encoders-on-reranking

(We could restructure that section)

Samoed commented Oct 17, 2025

Member Author

I didn't look through this doc. I think we can split it into multiple sections for visibility.

@KennethEnevoldsen

Contributor

Yeah agree. We could add an "advanced usage" section below "usage" with subheadings like "Late-interaction" and "Two-stage retrieval"?

@Samoed Samoed changed the title Add two stage reranking doc Restructure usage examples Oct 17, 2025

Samoed commented Oct 17, 2025

Member Author
  1. Moved Cached embeddings and Two stage reranking to the advanced usage section
  2. Added Python/CLI alternatives to some code blocks
  3. Deleted the late-interaction example: it shouldn't work, because late interaction uses its own search interface

Copilot AI left a comment

Contributor

Pull Request Overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.


# stage 2: reranking
# if model implemented in mteb it's better to use `mteb.get_model`
# cross_encoder = mteb.get_model("jinaai/jina-reranker-v2-base-multilingual")
# or if models wasn't implemented you can pass CrossEncoder directly

Copilot AI Oct 17, 2025

Corrected grammar: 'wasn't' should be 'aren't', which is present tense and agrees with the plural 'models'.

Suggested change
- # or if models wasn't implemented you can pass CrossEncoder directly
+ # or if models aren't implemented you can pass CrossEncoder directly

Comment thread docs/advanced_usage/two_stage_reranking.md Outdated
Comment thread docs/advanced_usage/two_stage_reranking.md Outdated

@KennethEnevoldsen KennethEnevoldsen left a comment

Contributor

ahh CLI alternatives are great - we could probably redo some of the CLI documentation to be more integrated with the rest

Comment thread docs/advanced_usage/two_stage_reranking.md Outdated
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Samoed Samoed merged commit d410695 into v2.0.0 Oct 17, 2025
10 checks passed
@Samoed Samoed deleted the two_stage_rerank branch October 17, 2025 10:00

Labels

documentation Improvements or additions to documentation

Development

Successfully merging this pull request may close these issues.

Add two stage raranking to the usage examples

4 participants