feat(reranker): support OpenAI/Cohere-compatible custom reranker endpoint #5149

@yaozcoderepo

Description

Component

Core / Python SDK

Use Case

The current reranker layer only supports a fixed set of providers — cohere, sentence_transformer, zero_entropy, llm_reranker, and huggingface (see mem0/utils/factory.py). There is no way to point mem0 at a self-hosted or third-party reranker service that exposes a Cohere/Jina-compatible HTTP /rerank endpoint (e.g. self-deployed bge-reranker, Jina Reranker, Voyage, SiliconFlow, Together, vLLM-hosted rerankers, internal corporate gateways, etc.).

Concretely, the current limitations are:

  1. RerankerFactory.provider_to_class is a hardcoded allow-list; any other provider name raises an "Unsupported reranker provider" error (factory.py:221-246).
  2. BaseRerankerConfig exposes only provider / model / api_key / top_k; there is no base_url / endpoint field (base.py:5-17).
  3. RerankerConfig sets model_config = {"extra": "forbid"} (config.py:12), so users can't even sneak extra fields through.
  4. The existing CohereReranker constructs cohere.Client(api_key) without a base URL (cohere_reranker.py:32), so it can't be repointed at a Cohere-compatible self-hosted gateway either.
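To make limitations 1 and 3 concrete, here is a minimal stand-in sketch of how a hardcoded allow-list combined with a strict config blocks any new provider. It is not mem0's actual code: DummyCohereReranker, PROVIDER_TO_CLASS, ALLOWED_CONFIG_FIELDS, create_reranker and validate_config are hypothetical names, and the exact error types and messages in mem0 may differ.

```python
# Stand-in sketch only; names and error messages are assumptions, not mem0's code.

class DummyCohereReranker:
    """Placeholder for one of the built-in reranker classes."""

    def __init__(self, config):
        self.config = config


# Hardcoded allow-list: only providers listed here can be constructed.
PROVIDER_TO_CLASS = {"cohere": DummyCohereReranker}

# Mimics a config that forbids extra fields (no base_url / endpoint).
ALLOWED_CONFIG_FIELDS = {"provider", "model", "api_key", "top_k"}


def validate_config(config):
    """Reject any key outside the known field set."""
    extra = set(config) - ALLOWED_CONFIG_FIELDS
    if extra:
        raise ValueError(f"Extra fields not permitted: {sorted(extra)}")
    return config


def create_reranker(provider, config=None):
    """Look the provider up in the allow-list and build it."""
    cls = PROVIDER_TO_CLASS.get(provider)
    if cls is None:
        raise ValueError(f"Unsupported reranker provider: {provider}")
    return cls(validate_config(config or {}))
```

With this shape, both an unknown provider name and a known provider carrying a base_url field are rejected before any request is made.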

As a result, configuring a custom reranker via the /configure REST endpoint (or via MemoryConfig in code) is impossible today, even though the equivalent already works on the LLM/embedding side via openai_base_url / custom OpenAI-compatible providers.

Proposed Solution

Add a new provider, e.g. openai_compatible (or custom), that speaks the standard rerank HTTP contract:

POST {base_url}/rerank
Authorization: Bearer {api_key}
{
  "model": "...",
  "query": "...",
  "documents": ["...", "..."],
  "top_n": N
}
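For illustration, the request above could be assembled roughly like this. This sketch uses the standard library's urllib.request instead of httpx so it stays self-contained, and it only builds the request without sending it. build_rerank_request is a hypothetical helper, and the Content-Type header is an assumption that is not part of the contract as written above.

```python
import json
import urllib.request


def build_rerank_request(base_url, api_key, model, query, documents, top_n):
    """Build (but do not send) a POST {base_url}/rerank request."""
    payload = {
        "model": model,
        "query": query,
        "documents": list(documents),
        "top_n": top_n,
    }
    return urllib.request.Request(
        # Tolerate a trailing slash on base_url.
        url=base_url.rstrip("/") + "/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: most JSON endpoints expect this header.
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_rerank_request(
    "https://my-reranker.internal/v1",
    "sk-test",
    "bge-reranker-v2-m3",
    "what is mem0?",
    ["doc one", "doc two"],
    top_n=2,
)
```

Sending it would then be a matter of passing req to urllib.request.urlopen (or doing the equivalent with httpx in the actual implementation).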

Concrete changes:

  1. mem0/configs/rerankers/openai_compatible.py — new config class extending BaseRerankerConfig with base_url: str and (optional) timeout, headers.
  2. mem0/reranker/openai_compatible_reranker.py — new reranker that POSTs to {base_url}/rerank using httpx, maps results[].index and relevance_score back onto the input docs (same shape as CohereReranker.rerank returns).
  3. Register "openai_compatible" in RerankerFactory.provider_to_class in factory.py:221.
  4. Tests under tests/reranker/openai_compatible/ mocking the HTTP call.
  5. Doc page under docs/components/rerankers/ (with a "Use when ..." description and an entry in docs/llms.txt).
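A rough sketch of what items 2 and 4 might look like together: a reranker whose HTTP call is injected, so tests can swap in a fake instead of hitting the network. Everything here is hypothetical and only meant to show the shape: the class name OpenAICompatibleRerankerSketch, the post callable signature, the "memory" key used as document text, and the "rerank_score" output key are all assumptions that would need to be aligned with what CohereReranker.rerank actually returns.

```python
class OpenAICompatibleRerankerSketch:
    """Hypothetical reranker; `post` is callable(url, headers, payload) -> dict."""

    def __init__(self, base_url, api_key, model, top_k=None, post=None):
        if post is None:
            raise ValueError("a `post` callable is required in this sketch")
        self.url = base_url.rstrip("/") + "/rerank"
        self.api_key = api_key
        self.model = model
        self.top_k = top_k
        self._post = post

    def rerank(self, query, documents, top_k=None):
        if not documents:
            return []
        limit = top_k or self.top_k or len(documents)
        payload = {
            "model": self.model,
            "query": query,
            # Assumption: each document is a dict with its text under "memory".
            "documents": [doc["memory"] for doc in documents],
            "top_n": limit,
        }
        headers = {"Authorization": f"Bearer {self.api_key}"}
        response = self._post(self.url, headers, payload)

        ranked = []
        for item in response.get("results", []):
            idx = item["index"]
            if not 0 <= idx < len(documents):
                continue  # ignore indices the server should not have returned
            doc = dict(documents[idx])  # copy so the caller's input is untouched
            doc["rerank_score"] = item["relevance_score"]
            ranked.append(doc)
        ranked.sort(key=lambda d: d["rerank_score"], reverse=True)
        return ranked[:limit]


# Test-style usage: a fake transport records the call and returns canned scores.
calls = []


def fake_post(url, headers, payload):
    calls.append((url, headers, payload))
    return {
        "results": [
            {"index": 0, "relevance_score": 0.2},
            {"index": 1, "relevance_score": 0.9},
        ]
    }


reranker = OpenAICompatibleRerankerSketch(
    "https://my-reranker.internal/v1/", "sk-test", "bge-reranker-v2-m3",
    top_k=1, post=fake_post,
)
top = reranker.rerank("q", [{"memory": "a"}, {"memory": "b"}])
```

In a real implementation the default post would presumably wrap an httpx call with the configured timeout and headers; injecting it keeps the tests under tests/reranker/openai_compatible/ free of network access.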

Example usage after the change:

from mem0 import Memory

config = {
    "reranker": {
        "provider": "openai_compatible",
        "config": {
            "base_url": "https://my-reranker.internal/v1",
            "api_key": "sk-...",
            "model": "bge-reranker-v2-m3",
            "top_k": 10,
        },
    },
}
m = Memory.from_config(config)

And the equivalent via the server's POST /configure:

{
  "reranker": {
    "provider": "openai_compatible",
    "config": {
      "base_url": "https://my-reranker.internal/v1",
      "api_key": "sk-...",
      "model": "bge-reranker-v2-m3"
    }
  }
}

I'd be happy to send a PR implementing this. Please let me know whether openai_compatible is the preferred name (vs. custom, http, or generic).
