fix(agent): enable prompt caching for MiniMax own models on anthropic_messages transport#17333
Closed
nicoechaniz wants to merge 3 commits into
Closed
fix(agent): enable prompt caching for MiniMax own models on anthropic_messages transport#17333nicoechaniz wants to merge 3 commits into
nicoechaniz wants to merge 3 commits into
Conversation
…_messages transport PR NousResearch#12846 enabled Anthropic prompt caching for third-party gateways, but gated it on is_claude, which excluded providers like MiniMax that serve their own model families (MiniMax-M2.7, etc.) through the native Anthropic protocol. MiniMax documents full cache_control support on its /anthropic endpoints (global and China). This patch adds MiniMax detection to _anthropic_prompt_cache_policy() using: - Built-in provider id (minimax, minimax-cn), or - Known Anthropic-compatible hostname (api.minimax.io, api.minimaxi.com) Both paths receive the native cache_control layout. Refs: NousResearch#8294 (related, but only covered Claude-named models on third-party gateways). Closes NousResearch#17332
nicoechaniz
added a commit
to nicoechaniz/hermes-agent
that referenced
this pull request
Apr 29, 2026
AIAgent.__init__ now detects provider=minimax/minimax-cn and defaults to: - api_mode='anthropic_messages' (was 'chat_completions') - base_url='https://api.minimax.io/anthropic' or 'https://api.minimaxi.com/anthropic' This ensures prompt caching (and all other Anthropic-protocol features) work out of the box for AIAgent users, not just CLI users. Previously, AIAgent(provider='minimax') fell through to chat_completions because base_url was empty and there was no provider-name detection for MiniMax in the api_mode resolution logic. The CLI already resolved this correctly via runtime_provider.py; this change mirrors that behaviour in the low-level agent constructor. Tests added: - 5 new tests in test_minimax_provider.py covering defaults, cn variant, explicit base_url preservation, explicit api_mode override, and prompt caching enabled by default. - 2 new tests in test_anthropic_prompt_cache_policy.py covering empty base_url with provider=minimax/minimax-cn.
c481721 to
412d505
Compare
…hing The template was suggesting https://api.minimax.io/v1 as the override example, which silently disables prompt caching since MiniMax only supports caching on the /anthropic endpoint. Update both global and China examples to /anthropic and add a comment explaining why. Refs: NousResearch#17332
Contributor
Author
|
Update: Also fixed — the template was suggesting , which silently disables prompt caching because MiniMax only supports caching on the endpoint. Updated both global and China examples to with an explanatory comment. Commit: |
This was referenced May 1, 2026
Sanjays2402
added a commit
to Sanjays2402/hermes-agent
that referenced
this pull request
May 2, 2026
…ousResearch#17332) Pre-fix, the gate in _anthropic_prompt_cache_policy enabled caching on api_mode=anthropic_messages only when the model name contained 'claude'. That worked for third-party gateways serving Claude (LiteLLM proxy, Zhipu GLM Anthropic-compat) but silently disabled caching for providers that use the native Anthropic protocol for their *own* model families (MiniMax-M2.x). PR NousResearch#17333 fixes the symptom by hardcoding 'minimax' / 'minimax-cn' / two specific hostnames. The issue's 'Scope beyond MiniMax' section explicitly asks for a 'capability-based or provider-allowlist approach' that's future-proof. This PR delivers exactly that: 1. ProviderConfig.extra carries an opt-in flag {anthropic_cache: True, anthropic_cache_hosts: (\u2026)} for built-in providers that document cache_control support for their own models. Currently set on 'minimax' and 'minimax-cn'. Future entrants flip the flag in one place. 2. agent/anthropic_cache_capability.py exposes provider_supports_anthropic_cache(provider, base_url, user_configured_hosts=\u2026) which resolves in this order: a. ProviderConfig.extra['anthropic_cache'] for the provider id (case-insensitive). b. Hostname match against the union of registry-declared hosts and operator-configured hosts \u2014 catches provider=='custom' setups pointing at MiniMax's URL. 3. New user config knob agent.anthropic_cache_hosts: [\u2026]. Operators can opt-in their own private gateway without waiting for an upstream patch. Read once in __init__ into self._anthropic_cache_user_hosts_cached, threaded through the policy lookup. Malformed values are ignored \u2014 a typo never *disables* caching, only fails to *enable* it. 4. _anthropic_prompt_cache_policy gains a new branch that fires only when api_mode=='anthropic_messages' and the helper returns True. Pre-existing native Anthropic / OpenRouter / Claude-on-third-party branches are unchanged, so this is purely additive for the non-Claude case. Why this is better than NousResearch#17333 - One brand-name special-case per provider isn't going to age well. Capability registry + config opt-in handle MiniMax now and the next entrant for free. - Operator escape hatch (anthropic_cache_hosts) lets users running private LiteLLM / vLLM-anthropic deployments enable caching today without forking the repo. - Eliminates the pre-existing 'Claude substring matches non-claude' gotcha by encouraging callers to test against curated host/provider lists, not free-text model names. - Single source of truth: built-in MiniMax support reads from the same ProviderConfig that already carries auth/base_url \u2014 no parallel list to keep in sync. Tests - TestCapabilityRegistryAnthropicCache (9 cases): MiniMax via provider id, MiniMax via host (custom config), MiniMax-CN both ways, user- configured opt-in, case/whitespace normalization, transport gate, unlisted gateway stays off, case-insensitive provider id. - TestCapabilityHelperUnit (5 cases): direct tests of the helper. - Existing test test_third_party_without_claude_name_does_not_cache was using api.minimax.io as the 'unknown' host \u2014 updated to use a truly unknown host to reflect the new behavior, which is exactly what the issue calls for. 863 passed, 7 skipped on tests/run_agent/ (full sweep, no regressions). 30 passed on the policy file (15 pre-existing + 14 new + 1 updated).
Fede654
pushed a commit
to Fede654/hermes-agent
that referenced
this pull request
May 3, 2026
…base_url PR NousResearch#17425 (merged) enabled prompt caching for MiniMax models on the anthropic_messages transport, but users still had to manually configure both api_mode and base_url to actually benefit from it. This patch makes the defaults ergonomic: - AIAgent.__init__ now auto-detects provider=minimax / minimax-cn and defaults to api_mode=anthropic_messages + the correct /anthropic base_url (global or China endpoint respectively). - .env.example suggests the /anthropic endpoints instead of /v1. - Explicit base_url or api_mode are preserved when the user sets them. Tests: 5 new cases covering both providers, explicit overrides, and prompt-caching flags. Refs: NousResearch#17332, NousResearch#17333, NousResearch#17425
nicoechaniz
added a commit
to nicoechaniz/hermes-agent
that referenced
this pull request
May 24, 2026
…base_url PR NousResearch#17425 (merged) enabled prompt caching for MiniMax models on the anthropic_messages transport, but users still had to manually configure both api_mode and base_url to actually benefit from it. This patch makes the defaults ergonomic: - AIAgent.__init__ now auto-detects provider=minimax / minimax-cn and defaults to api_mode=anthropic_messages + the correct /anthropic base_url (global or China endpoint respectively). - .env.example suggests the /anthropic endpoints instead of /v1. - Explicit base_url or api_mode are preserved when the user sets them. Tests: 5 new cases covering both providers, explicit overrides, and prompt-caching flags. Refs: NousResearch#17332, NousResearch#17333, NousResearch#17425
nicoechaniz
added a commit
to nicoechaniz/hermes-agent
that referenced
this pull request
Jun 1, 2026
…base_url PR NousResearch#17425 (merged) enabled prompt caching for MiniMax models on the anthropic_messages transport, but users still had to manually configure both api_mode and base_url to actually benefit from it. This patch makes the defaults ergonomic: - AIAgent.__init__ now auto-detects provider=minimax / minimax-cn and defaults to api_mode=anthropic_messages + the correct /anthropic base_url (global or China endpoint respectively). - .env.example suggests the /anthropic endpoints instead of /v1. - Explicit base_url or api_mode are preserved when the user sets them. Tests: 5 new cases covering both providers, explicit overrides, and prompt-caching flags. Refs: NousResearch#17332, NousResearch#17333, NousResearch#17425
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #17332
PR #12846 enabled Anthropic prompt caching for third-party gateways,
but gated it on
is_claude, which excluded providers like MiniMaxthat serve their own model families (MiniMax-M2.7, etc.) through the
native Anthropic protocol.
MiniMax documents full
cache_controlsupport on its/anthropicendpoints (global and China). This patch adds MiniMaxdetection to
_anthropic_prompt_cache_policy()using:minimax,minimax-cn), orapi.minimax.io,api.minimaxi.com)Both paths receive the native
cache_controllayout, consistent withhow Hermes already treats native Anthropic and Claude-on-third-party
gateways.
Tests: 3 new cases, 19 total in the policy suite.
Refs: #8294 (related, but only covered Claude-named models on
third-party gateways).