Bug Description
PR #12846 (which closed #8294) enabled Anthropic prompt caching for third-party gateways using api_mode: anthropic_messages, but only when the model name contains "claude". This leaves out providers that use the native Anthropic protocol for their own model families.
MiniMax is a concrete example. It serves MiniMax-M2.7, MiniMax-M2.5, etc. through https://api.minimax.io/anthropic (and api.minimaxi.com/anthropic for China). MiniMax documents explicit cache_control support on this endpoint:
Currently, _anthropic_prompt_cache_policy() returns (False, False) for MiniMax because is_claude is False for model="minimax-m2.7".
Expected Behavior
MiniMax (both built-in minimax / minimax-cn providers and custom endpoints pointing at MiniMax's Anthropic URLs) should receive cache_control breakpoints with the native layout, just like native Anthropic and Claude-on-third-party gateways.
Actual Behavior
No caching. The system prompt + tool schemas are re-billed in full on every turn.
Why this is a Hermes-side issue
MiniMax's Anthropic-compatible endpoint is already handled correctly for auth (Bearer token, stripped betas) and transport (anthropic_messages). The only missing piece is the cache-policy gate in _anthropic_prompt_cache_policy().
Scope beyond MiniMax
This same pattern likely affects any future provider that:
- Speaks the native Anthropic protocol (
anthropic_messages)
- Documents
cache_control support
- Serves its own (non-Claude) model family
A capability-based or provider-allowlist approach would be more future-proof than the current is_claude name check for third-party gateways.
Version
Current upstream main (post #12846).
Bug Description
PR #12846 (which closed #8294) enabled Anthropic prompt caching for third-party gateways using
api_mode: anthropic_messages, but only when the model name contains "claude". This leaves out providers that use the native Anthropic protocol for their own model families.MiniMax is a concrete example. It serves
MiniMax-M2.7,MiniMax-M2.5, etc. throughhttps://api.minimax.io/anthropic(andapi.minimaxi.com/anthropicfor China). MiniMax documents explicitcache_controlsupport on this endpoint:Currently,
_anthropic_prompt_cache_policy()returns(False, False)for MiniMax becauseis_claudeisFalseformodel="minimax-m2.7".Expected Behavior
MiniMax (both built-in
minimax/minimax-cnproviders and custom endpoints pointing at MiniMax's Anthropic URLs) should receivecache_controlbreakpoints with the native layout, just like native Anthropic and Claude-on-third-party gateways.Actual Behavior
No caching. The system prompt + tool schemas are re-billed in full on every turn.
Why this is a Hermes-side issue
MiniMax's Anthropic-compatible endpoint is already handled correctly for auth (Bearer token, stripped betas) and transport (
anthropic_messages). The only missing piece is the cache-policy gate in_anthropic_prompt_cache_policy().Scope beyond MiniMax
This same pattern likely affects any future provider that:
anthropic_messages)cache_controlsupportA capability-based or provider-allowlist approach would be more future-proof than the current
is_claudename check for third-party gateways.Version
Current upstream main (post #12846).