[Bug]: MiniMax auxiliary/compression clients route to wrong endpoint (/anthropic instead of /v1) #5781

@exxmen

Bug Description

When MiniMax is the configured provider, the compression/auxiliary client sends requests to https://api.minimax.io/anthropic. That endpoint only accepts Anthropic Messages format. The auxiliary client uses the OpenAI SDK (chat completions format), which needs https://api.minimax.io/v1.

Every compression call returns 404, so the fallback model handles compression instead. The fallback is not guaranteed to be equivalent in quality or cost.

Steps to Reproduce

  1. Set MiniMax as primary provider (model.provider=minimax, model.base_url=https://api.minimax.io/anthropic)
  2. Run a conversation long enough to trigger context compression
  3. Check gateway log for: Non-retryable client error: <html><head><title>404 Not Found</title></head>

Expected Behavior

Compression hits https://api.minimax.io/v1/chat/completions and succeeds.

Actual Behavior

POST https://api.minimax.io/anthropic/chat/completions returns 404. Context summary is skipped. Fallback model activates.

Affected Component

Agent Core (conversation loop, context compression, memory)

OS

Ubuntu 24.04

Python Version

3.11.x

Hermes Version

(v0.7.x)

Relevant Logs / Traceback

2026-04-07 07:25:47,445 ERROR root: Non-retryable client error: <html>
<head><title>404 Not Found</title></head>
<body><center><h1>404 Not Found</h1></center><hr><center>nginx</center>
</body></html>

Root Cause Analysis

hermes_cli/auth.py line 148:

"minimax": ProviderConfig(
    inference_base_url="https://api.minimax.io/anthropic",
    ...
)

This value is correct for the main agent loop, which sends Anthropic Messages format to /anthropic/v1/messages.

agent/auxiliary_client.py uses the same inference_base_url to construct an OpenAI SDK client:

return OpenAI(api_key=api_key, base_url=base_url), model

The SDK appends /chat/completions to the base URL, so requests go to https://api.minimax.io/anthropic/chat/completions, which returns 404.

The /v1 path works for chat completions. Verified with curl:

POST https://api.minimax.io/anthropic/chat/completions → 404 (nginx)
POST https://api.minimax.io/v1/chat/completions → non-404 (processes request)
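
The path difference can also be shown offline. This is a minimal sketch that assumes an OpenAI-style client simply joins the chat completions path onto whatever base URL it is given, which matches the URLs in the log and curl checks above. `chat_completions_url` is a hypothetical helper for illustration, not part of Hermes or the OpenAI SDK.

```python
def chat_completions_url(base_url: str) -> str:
    """Approximate the URL an OpenAI-style client would POST to."""
    # Drop any trailing slash, then append the chat completions path.
    return base_url.rstrip("/") + "/chat/completions"


# Base URL taken from the provider registry (what the auxiliary client uses today).
broken = chat_completions_url("https://api.minimax.io/anthropic")
# Base URL the auxiliary client would need instead.
working = chat_completions_url("https://api.minimax.io/v1")

print(broken)
print(working)
```

The first URL is the one that nginx rejects with 404; the second is the one the curl check reached.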

Proposed Fix

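Option A (auxiliary_client.py): Rewrite a trailing /anthropic to /v1 in base URLs before constructing OpenAI SDK clients:

# Match only a trailing /anthropic segment so other URLs are left untouched.
trimmed = (base_url or "").rstrip("/")
if trimmed.endswith("/anthropic"):
    base_url = trimmed[: -len("/anthropic")] + "/v1"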
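
A self-contained sketch of Option A as a helper function. It rewrites only a trailing /anthropic path segment, so a host name such as anthropic.example.com is left alone. The name `to_openai_base_url` is hypothetical, and where it would be called inside auxiliary_client.py is an assumption.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

ANTHROPIC_SUFFIX = "/anthropic"


def to_openai_base_url(base_url: Optional[str]) -> Optional[str]:
    """Rewrite a trailing /anthropic path segment to /v1; leave other URLs alone."""
    if not base_url:
        return base_url
    parts = urlsplit(base_url)
    path = parts.path.rstrip("/")
    if not path.endswith(ANTHROPIC_SUFFIX):
        # Not an Anthropic-style endpoint: return the input unchanged.
        return base_url
    new_path = path[: -len(ANTHROPIC_SUFFIX)] + "/v1"
    return urlunsplit(parts._replace(path=new_path))


print(to_openai_base_url("https://api.minimax.io/anthropic"))
```

Matching on the parsed path rather than the whole string avoids rewriting URLs where "anthropic" appears in the host name.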

Option B (auth.py + auxiliary_client.py): Add auxiliary_base_url to ProviderConfig. Auxiliary client checks it before falling back to inference_base_url:

"minimax": ProviderConfig(
    inference_base_url="https://api.minimax.io/anthropic",
    auxiliary_base_url="https://api.minimax.io/v1",
    ...
)
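
The fallback in Option B could look roughly like this. The `ProviderConfig` below is a reduced, hypothetical stand-in for the real class in hermes_cli/auth.py (its other fields are omitted), and `resolve_auxiliary_base_url` is an assumed helper name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProviderConfig:
    # Reduced stand-in: the real ProviderConfig has more fields.
    inference_base_url: str
    # Proposed new field; None means "use inference_base_url".
    auxiliary_base_url: Optional[str] = None


def resolve_auxiliary_base_url(cfg: ProviderConfig) -> str:
    """Prefer the auxiliary URL, falling back to the inference URL."""
    return cfg.auxiliary_base_url or cfg.inference_base_url


minimax = ProviderConfig(
    inference_base_url="https://api.minimax.io/anthropic",
    auxiliary_base_url="https://api.minimax.io/v1",
)
# Hypothetical provider with no separate auxiliary endpoint.
other = ProviderConfig(inference_base_url="https://example.invalid/v1")

print(resolve_auxiliary_base_url(minimax))
print(resolve_auxiliary_base_url(other))
```

Providers that do not set `auxiliary_base_url` keep their current behavior, so the change would be opt-in per provider.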

Option C (auxiliary_client.py): When summary_base_url is set in config.yaml, use it directly for compression tasks instead of going through the provider registry.
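
A rough sketch of the precedence Option C describes. It assumes config.yaml has already been loaded into a dict and that `summary_base_url` sits at the top level of that dict; the actual key location in Hermes may differ. `pick_compression_base_url` is a hypothetical name.

```python
from typing import Mapping


def pick_compression_base_url(
    config: Mapping[str, object], provider_base_url: str
) -> str:
    """Use summary_base_url when set, otherwise the provider registry URL."""
    override = config.get("summary_base_url")
    if isinstance(override, str) and override.strip():
        # An explicit user setting wins over the provider registry.
        return override.strip().rstrip("/")
    return provider_base_url


registry_url = "https://api.minimax.io/anthropic"

with_override = pick_compression_base_url(
    {"summary_base_url": "https://api.minimax.io/v1"}, registry_url
)
without_override = pick_compression_base_url({}, registry_url)

print(with_override)
print(without_override)
```

Without an override the registry URL is still used, so this option alone fixes the 404 only for users who set `summary_base_url` explicitly.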

Related Issues

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
