fix(agent): enable prompt caching for MiniMax own models on anthropic_messages transport by nicoechaniz · Pull Request #17333 · NousResearch/hermes-agent

nicoechaniz · 2026-04-29T07:45:54Z

PR #12846 enabled Anthropic prompt caching for third-party gateways,
but gated it on is_claude, which excluded providers like MiniMax
that serve their own model families (MiniMax-M2.7, etc.) through the
native Anthropic protocol.

MiniMax documents full cache_control support on its
/anthropic endpoints (global and China). This patch adds MiniMax
detection to _anthropic_prompt_cache_policy() using:

Built-in provider id (minimax, minimax-cn), or
Known Anthropic-compatible hostname (api.minimax.io,
api.minimaxi.com)

Both paths receive the native cache_control layout, consistent with
how Hermes already treats native Anthropic and Claude-on-third-party
gateways.

Tests: 3 new cases, 19 total in the policy suite.

Refs: #8294 (related, but only covered Claude-named models on
third-party gateways).

…_messages transport PR NousResearch#12846 enabled Anthropic prompt caching for third-party gateways, but gated it on is_claude, which excluded providers like MiniMax that serve their own model families (MiniMax-M2.7, etc.) through the native Anthropic protocol. MiniMax documents full cache_control support on its /anthropic endpoints (global and China). This patch adds MiniMax detection to _anthropic_prompt_cache_policy() using: - Built-in provider id (minimax, minimax-cn), or - Known Anthropic-compatible hostname (api.minimax.io, api.minimaxi.com) Both paths receive the native cache_control layout. Refs: NousResearch#8294 (related, but only covered Claude-named models on third-party gateways). Closes NousResearch#17332

…search#17333)

AIAgent.__init__ now detects provider=minimax/minimax-cn and defaults to: - api_mode='anthropic_messages' (was 'chat_completions') - base_url='https://api.minimax.io/anthropic' or 'https://api.minimaxi.com/anthropic' This ensures prompt caching (and all other Anthropic-protocol features) work out of the box for AIAgent users, not just CLI users. Previously, AIAgent(provider='minimax') fell through to chat_completions because base_url was empty and there was no provider-name detection for MiniMax in the api_mode resolution logic. The CLI already resolved this correctly via runtime_provider.py; this change mirrors that behaviour in the low-level agent constructor. Tests added: - 5 new tests in test_minimax_provider.py covering defaults, cn variant, explicit base_url preservation, explicit api_mode override, and prompt caching enabled by default. - 2 new tests in test_anthropic_prompt_cache_policy.py covering empty base_url with provider=minimax/minimax-cn.

…hing The template was suggesting https://api.minimax.io/v1 as the override example, which silently disables prompt caching since MiniMax only supports caching on the /anthropic endpoint. Update both global and China examples to /anthropic and add a comment explaining why. Refs: NousResearch#17332

nicoechaniz · 2026-04-29T10:04:00Z

Update: Also fixed — the template was suggesting , which silently disables prompt caching because MiniMax only supports caching on the endpoint. Updated both global and China examples to with an explanatory comment.

Commit:

…ousResearch#17332) Pre-fix, the gate in _anthropic_prompt_cache_policy enabled caching on api_mode=anthropic_messages only when the model name contained 'claude'. That worked for third-party gateways serving Claude (LiteLLM proxy, Zhipu GLM Anthropic-compat) but silently disabled caching for providers that use the native Anthropic protocol for their *own* model families (MiniMax-M2.x). PR NousResearch#17333 fixes the symptom by hardcoding 'minimax' / 'minimax-cn' / two specific hostnames. The issue's 'Scope beyond MiniMax' section explicitly asks for a 'capability-based or provider-allowlist approach' that's future-proof. This PR delivers exactly that: 1. ProviderConfig.extra carries an opt-in flag {anthropic_cache: True, anthropic_cache_hosts: (\u2026)} for built-in providers that document cache_control support for their own models. Currently set on 'minimax' and 'minimax-cn'. Future entrants flip the flag in one place. 2. agent/anthropic_cache_capability.py exposes provider_supports_anthropic_cache(provider, base_url, user_configured_hosts=\u2026) which resolves in this order: a. ProviderConfig.extra['anthropic_cache'] for the provider id (case-insensitive). b. Hostname match against the union of registry-declared hosts and operator-configured hosts \u2014 catches provider=='custom' setups pointing at MiniMax's URL. 3. New user config knob agent.anthropic_cache_hosts: [\u2026]. Operators can opt-in their own private gateway without waiting for an upstream patch. Read once in __init__ into self._anthropic_cache_user_hosts_cached, threaded through the policy lookup. Malformed values are ignored \u2014 a typo never *disables* caching, only fails to *enable* it. 4. _anthropic_prompt_cache_policy gains a new branch that fires only when api_mode=='anthropic_messages' and the helper returns True. Pre-existing native Anthropic / OpenRouter / Claude-on-third-party branches are unchanged, so this is purely additive for the non-Claude case. Why this is better than NousResearch#17333 - One brand-name special-case per provider isn't going to age well. Capability registry + config opt-in handle MiniMax now and the next entrant for free. - Operator escape hatch (anthropic_cache_hosts) lets users running private LiteLLM / vLLM-anthropic deployments enable caching today without forking the repo. - Eliminates the pre-existing 'Claude substring matches non-claude' gotcha by encouraging callers to test against curated host/provider lists, not free-text model names. - Single source of truth: built-in MiniMax support reads from the same ProviderConfig that already carries auth/base_url \u2014 no parallel list to keep in sync. Tests - TestCapabilityRegistryAnthropicCache (9 cases): MiniMax via provider id, MiniMax via host (custom config), MiniMax-CN both ways, user- configured opt-in, case/whitespace normalization, transport gate, unlisted gateway stays off, case-insensitive provider id. - TestCapabilityHelperUnit (5 cases): direct tests of the helper. - Existing test test_third_party_without_claude_name_does_not_cache was using api.minimax.io as the 'unknown' host \u2014 updated to use a truly unknown host to reflect the new behavior, which is exactly what the issue calls for. 863 passed, 7 skipped on tests/run_agent/ (full sweep, no regressions). 30 passed on the policy file (15 pre-existing + 14 new + 1 updated).

…base_url PR NousResearch#17425 (merged) enabled prompt caching for MiniMax models on the anthropic_messages transport, but users still had to manually configure both api_mode and base_url to actually benefit from it. This patch makes the defaults ergonomic: - AIAgent.__init__ now auto-detects provider=minimax / minimax-cn and defaults to api_mode=anthropic_messages + the correct /anthropic base_url (global or China endpoint respectively). - .env.example suggests the /anthropic endpoints instead of /v1. - Explicit base_url or api_mode are preserved when the user sets them. Tests: 5 new cases covering both providers, explicit overrides, and prompt-caching flags. Refs: NousResearch#17332, NousResearch#17333, NousResearch#17425

nicoechaniz mentioned this pull request Apr 29, 2026

[Bug]: Prompt caching silently disabled for MiniMax's own models (MiniMax-M2.7 etc.) on anthropic_messages transport #17332

Closed

nicoechaniz added a commit to nicoechaniz/hermes-agent that referenced this pull request Apr 29, 2026

Merge feat/minimax-prompt-caching: MiniMax prompt caching fix (NousRe…

0150e89

…search#17333)

alt-glitch added type/bug Something isn't working P3 Low — cosmetic, nice to have comp/agent Core agent loop, run_agent.py, prompt builder provider/minimax MiniMax (Anthropic transport) labels Apr 29, 2026

Sanjays2402 mentioned this pull request Apr 29, 2026

fix(agent): capability registry for Anthropic prompt-cache support (#17332) #17339

Open

nicoechaniz force-pushed the feat/minimax-prompt-caching branch from c481721 to 412d505 Compare April 29, 2026 08:38

This was referenced May 1, 2026

fix(agent): default MiniMax provider to anthropic_messages + correct base_url #18147

Open

fix(minimax): enable Anthropic prompt caching for MiniMax's own models #17425

Merged

nicoechaniz closed this May 1, 2026

nicoechaniz deleted the feat/minimax-prompt-caching branch May 1, 2026 01:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agent): enable prompt caching for MiniMax own models on anthropic_messages transport#17333

fix(agent): enable prompt caching for MiniMax own models on anthropic_messages transport#17333
nicoechaniz wants to merge 3 commits into
NousResearch:mainfrom
nicoechaniz:feat/minimax-prompt-caching

nicoechaniz commented Apr 29, 2026

Uh oh!

nicoechaniz commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

nicoechaniz commented Apr 29, 2026

Uh oh!

nicoechaniz commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants