
fix: align MiniMax provider with official API docs #7096

Closed
kshitijk4poor wants to merge 2 commits into NousResearch:main from kshitijk4poor:fix/minimax-transport-mismatch


@kshitijk4poor kshitijk4poor commented Apr 10, 2026

Collaborator

Summary

Aligns the MiniMax provider implementation with official API documentation. Fixes 6 bugs — transport mismatch, credential leak, prompt caching leak, dot-to-hyphen model name corruption, trajectory compressor URL routing, and stale doctor health check — plus corrects context window, thinking support, max output, and model catalog to match the actual API.

Primary reference: Compatible Anthropic API — MiniMax API Docs

Changes

Bug fix: transport mismatch in providers.py

HERMES_OVERLAYS had transport="openai_chat" for minimax/minimax-cn, but MiniMax's /anthropic endpoint speaks the Anthropic Messages wire format. This caused determine_api_mode("minimax") to return "chat_completions" instead of "anthropic_messages", breaking /model switch and any codepath that resolves api_mode by provider name.

Doc ref: The Quick Start § 2. Configure Environment Variables shows base_url="https://api.minimax.io/anthropic" — an Anthropic-compat endpoint, not OpenAI.
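The effect of the transport fix can be illustrated with a minimal sketch. The two dictionaries below are hypothetical stand-ins, not the real HERMES_OVERLAYS structure, and this determine_api_mode() only mirrors the behavior described above under that assumption.

```python
# Hypothetical mapping from transport name to api_mode (assumed, not the real table).
TRANSPORT_TO_API_MODE = {
    "openai_chat": "chat_completions",
    "anthropic_messages": "anthropic_messages",
}

# Hypothetical per-provider transport overlay after the fix.
PROVIDER_TRANSPORT = {
    "anthropic": "anthropic_messages",
    "minimax": "anthropic_messages",      # was "openai_chat" before the fix
    "minimax-cn": "anthropic_messages",   # was "openai_chat" before the fix
}


def determine_api_mode(provider: str, default: str = "chat_completions") -> str:
    """Resolve api_mode from the provider name alone, via its transport."""
    transport = PROVIDER_TRANSPORT.get(provider)
    return TRANSPORT_TO_API_MODE.get(transport, default)
```

With the old overlay value, the same lookup would have returned "chat_completions" for minimax, which is the mismatch described above.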

Bug fix: credential leak in switch_model()

switch_model() fell back to resolve_anthropic_token() for ALL anthropic_messages providers. The __init__ path (line 761) correctly guards with _is_native_anthropic = self.provider == "anthropic"; switch_model() now mirrors that guard.
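A rough sketch of the guard, assuming a simplified key-selection helper. pick_api_key and its parameters are hypothetical names for illustration; only the provider == "anthropic" condition comes from the description above.

```python
from typing import Callable, Optional


def pick_api_key(
    provider: str,
    api_mode: str,
    explicit_key: Optional[str],
    resolve_anthropic_token: Callable[[], Optional[str]],
) -> Optional[str]:
    """Return the key to use, falling back to the Anthropic token only for native Anthropic."""
    if explicit_key:
        return explicit_key
    # The guard: sharing the Anthropic wire format is not enough to share credentials.
    is_native_anthropic = provider == "anthropic"
    if api_mode == "anthropic_messages" and is_native_anthropic:
        return resolve_anthropic_token()
    return None
```

Without the provider check, a MiniMax request with no explicit key would be sent with the user's Anthropic token.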

Bug fix: prompt caching sent to MiniMax

is_native_anthropic was set from api_mode == "anthropic_messages" alone. This caused Anthropic prompt caching cache_control breakpoints to be injected into requests to MiniMax. Fixed in __init__, switch_model(), and fallback activation to require provider == "anthropic".

Doc ref: The Supported Parameters table does not list cache_control or any caching parameter. The Important Notes section confirms unsupported parameters are silently ignored or cause errors.
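The caching guard might look roughly like the following. apply_prompt_caching is a hypothetical helper, and placing a single breakpoint on the last content block is an assumption made for brevity; the real breakpoint strategy is not shown in this PR description.

```python
import copy
from typing import Any, Dict, List

Message = Dict[str, Any]


def apply_prompt_caching(messages: List[Message], provider: str, api_mode: str) -> List[Message]:
    """Add a cache_control breakpoint only when talking to native Anthropic."""
    is_native_anthropic = api_mode == "anthropic_messages" and provider == "anthropic"
    if not is_native_anthropic or not messages:
        return messages  # MiniMax and other compat endpoints get the payload untouched

    out = copy.deepcopy(messages)  # never mutate the caller's history
    content = out[-1]["content"]
    if isinstance(content, str):
        # Promote plain text to a content block so it can carry cache_control.
        content = [{"type": "text", "text": content}]
    if content:
        content[-1]["cache_control"] = {"type": "ephemeral"}
    out[-1]["content"] = content
    return out
```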

Bug fix: dot-to-hyphen corruption in _anthropic_preserve_dots()

MiniMax model IDs contain dots (e.g. MiniMax-M2.7). The Anthropic adapter's normalize_model_name() converts dots to hyphens by default (MiniMax-M2.7 → MiniMax-M2-7), causing model-not-found errors. Added minimax/minimax-cn to the preserve-dots provider set.

Doc ref: The Supported Models table lists model IDs with dots: MiniMax-M2.7, MiniMax-M2.5.
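As an illustration of the dot-preservation rule, here is a simplified sketch. The set name and the two-argument signature are assumptions; the real normalize_model_name() likely does more than replace dots.

```python
# Hypothetical provider set; the PR adds minimax/minimax-cn to the real one.
PRESERVE_DOTS_PROVIDERS = {"minimax", "minimax-cn"}


def normalize_model_name(model: str, provider: str) -> str:
    """Convert dots to hyphens unless the provider's model IDs legitimately contain dots."""
    if provider in PRESERVE_DOTS_PROVIDERS:
        return model
    return model.replace(".", "-")
```

For example, normalize_model_name("MiniMax-M2.7", "minimax") leaves the ID intact, while a provider outside the set still gets the hyphenated form.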

Bug fix: trajectory compressor URL routing

The compressor passed the raw /anthropic URL to the OpenAI SDK, which then requested /anthropic/chat/completions and got a 404. It now uses the canonical _to_openai_base_url() helper from auxiliary_client.py.

Doc ref: MiniMax exposes both endpoints — Anthropic-compat at /anthropic and OpenAI-compat at /v1. The OpenAI SDK must hit /v1.
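The URL rewrite can be sketched as below. This is a guess at what a helper like _to_openai_base_url() does, based only on the /anthropic and /v1 endpoints named above; the real helper in auxiliary_client.py may handle more cases.

```python
def to_openai_base_url(base_url: str) -> str:
    """Map an Anthropic-compat base URL to its OpenAI-compat sibling (sketch)."""
    trimmed = base_url.rstrip("/")
    suffix = "/anthropic"
    if trimmed.endswith(suffix):
        # Swap the trailing /anthropic segment for /v1 so the OpenAI SDK
        # ends up requesting /v1/chat/completions.
        return trimmed[: -len(suffix)] + "/v1"
    return base_url  # already OpenAI-style or unrelated: leave untouched
```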

Improvement: doctor health check for MiniMax

Previously the check was skipped (it reported "key configured" without testing connectivity). It now queries MiniMax's OpenAI-compat /v1/models endpoint, rewriting /anthropic to /v1 via _to_openai_base_url().

Context window: 204,800 tokens

Previous values were 1,000,000 (M1 variants) and 1,048,576 (M2 variants) — both wrong.

Doc ref: The Supported Models table shows Context Window: 204,800 for every model (M2.7, M2.7-highspeed, M2.5, M2.5-highspeed, M2.1, M2.1-highspeed, M2).
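A single prefix entry can cover every MiniMax model, as in this sketch. The table name, the case-insensitive longest-prefix rule, and the fallback value are assumptions for illustration; only the 204,800 figure comes from the docs cited above.

```python
# Hypothetical prefix table; 204_800 is the documented MiniMax context window.
CONTEXT_WINDOWS = {
    "minimax": 204_800,
}

# Assumed fallback for unknown models, not taken from the PR.
DEFAULT_CONTEXT_WINDOW = 128_000


def context_window_for(model: str, default: int = DEFAULT_CONTEXT_WINDOW) -> int:
    """Return the context window for the longest matching prefix, case-insensitively."""
    name = model.lower()
    matches = [prefix for prefix in CONTEXT_WINDOWS if name.startswith(prefix)]
    if not matches:
        return default
    return CONTEXT_WINDOWS[max(matches, key=len)]
```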

Thinking support: enabled (manual mode)

The previous guard blocked thinking for all MiniMax models. MiniMax now correctly gets manual thinking (type: "enabled" + budget_tokens), not adaptive thinking (Claude 4.6-only).

Doc ref: The Supported Parameters table lists thinking as "Fully Supported" with description "Reasoning Content". The Messages Field Support table also shows type: "thinking" as "Fully Supported".
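The difference between the two thinking modes might be expressed like this. build_thinking_param and its supports_adaptive flag are hypothetical, and the {"type": "adaptive"} shape is an assumption; the manual shape (type: "enabled" plus budget_tokens) is the one named above.

```python
from typing import Any, Dict


def build_thinking_param(budget_tokens: int, supports_adaptive: bool = False) -> Dict[str, Any]:
    """Build the thinking request parameter for an Anthropic-format request."""
    if supports_adaptive:
        # Assumed payload for adaptive thinking; not applicable to MiniMax.
        return {"type": "adaptive"}
    # Manual mode: the caller chooses an explicit reasoning budget.
    return {"type": "enabled", "budget_tokens": budget_tokens}
```

Under these assumptions, a MiniMax request would always take the manual branch, for instance build_thinking_param(4096).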

Max output tokens: 131,072

Added explicit entry in _ANTHROPIC_OUTPUT_LIMITS. Previous behavior fell back to the generic 128K default.

Model catalog: M2.7, M2.5, M2.1, M2

Replaced M1 family (not available on /anthropic endpoint) with the actual supported models. Highspeed variants exist but are omitted from the default picker (users can specify them manually).

Doc ref: The Supported Models table lists exactly: MiniMax-M2.7, MiniMax-M2.7-highspeed, MiniMax-M2.5, MiniMax-M2.5-highspeed, MiniMax-M2.1, MiniMax-M2.1-highspeed, MiniMax-M2. No M1 models are listed. The Important Notes section confirms: "The Anthropic API compatibility interface currently only supports the MiniMax-M2.7, MiniMax-M2.7-highspeed, MiniMax-M2.5, MiniMax-M2.5-highspeed, MiniMax-M2.1, MiniMax-M2.1-highspeed, MiniMax-M2 model."

Files changed (10)

| File | Change |
| --- | --- |
| hermes_cli/providers.py | transport="anthropic_messages" for minimax/minimax-cn |
| agent/anthropic_adapter.py | MiniMax in _ANTHROPIC_OUTPUT_LIMITS; remove thinking guard |
| agent/model_metadata.py | "minimax": 204800 single prefix entry |
| hermes_cli/models.py | Catalog: M2.7, M2.5, M2.1, M2 |
| hermes_cli/setup.py | Match catalog update |
| run_agent.py | Credential guard, prompt caching guard (provider == "anthropic"), dot preservation |
| trajectory_compressor.py | Use _to_openai_base_url() for client init |
| hermes_cli/doctor.py | MiniMax health check via /v1/models |
| tests/agent/test_minimax_provider.py | 9 test classes, 39 total tests |
| tests/hermes_cli/test_setup_model_selection.py | Update expected model lists |

Test results

559 passed — full provider/adapter/runtime/setup suite

@kshitijk4poor kshitijk4poor force-pushed the fix/minimax-transport-mismatch branch 2 times, most recently from b48523f to c7022c4 Compare April 10, 2026 10:26
- Fix providers.py transport: minimax/minimax-cn use anthropic_messages,
  not openai_chat. determine_api_mode() now returns the correct api_mode
  for /model switch and other codepaths that resolve by provider name.

- Fix context window: 204,800 tokens per official docs, not 1M/1.05M.
  Removes stale M1 per-variant entries; single 'minimax' prefix entry.

- Enable thinking: MiniMax officially supports the thinking parameter
  (manual mode with budget_tokens). Removes the guard that blocked it.

- Add max output entry: 131,072 tokens in _ANTHROPIC_OUTPUT_LIMITS
  (source: OpenClaw model definitions, confirmed via API behavior).

- Update model catalog: M2.7, M2.5, M2.1, M2 per official Anthropic
  endpoint docs. Removes M1 family (not on /anthropic endpoint).

- Add tests: determine_api_mode(), max output limits, thinking support.
  Updates existing tests to match official docs.

Source: https://platform.minimax.io/docs/api-reference/text-anthropic-api
…reeze

When context compaction fires during a streamed conversation, the
compression LLM call would edit the same Telegram message as the
agent's response.  After compaction completed, the stream consumer
still held state pointing at the old message, causing subsequent
edits to fail and the chat to appear "stuck" or never finish.

Fix: after compression, reset the stream consumer's accumulated text,
message ID, and fallback state.  The next API response streams to a
fresh Telegram message instead of fighting over the old one.

Also wire _stream_consumer_instance from the gateway's stream_consumer
holder into the agent so _compress_context can access it.

Cherry-picked from PR NousResearch#7426 by @dangelo352.
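The stream consumer reset described in the commit message above could look roughly like this. StreamConsumer, its fields, and compress_context are hypothetical names; the truncation stands in for the real compression LLM call, which is not shown here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StreamConsumer:
    """Hypothetical holder for the state of the message currently being edited."""
    accumulated_text: str = ""
    message_id: Optional[int] = None
    fallback_active: bool = False

    def reset(self) -> None:
        # Forget the old message so the next response starts a fresh one.
        self.accumulated_text = ""
        self.message_id = None
        self.fallback_active = False


def compress_context(
    history: List[str],
    consumer: Optional[StreamConsumer],
    keep_last: int = 2,
) -> List[str]:
    """Compress the history, then detach the consumer from the stale message."""
    # Placeholder for the real compression call: keep only the newest turns.
    compressed = history[-keep_last:] if keep_last > 0 else []
    if consumer is not None:
        consumer.reset()
    return compressed
```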
@teknium1

Contributor

Merged via PR #7126. Your commit was cherry-picked onto current main with your authorship preserved in git log. Thanks for the thorough MiniMax alignment work — all 6 bugs fixed, context windows corrected, and 39 tests added.
