Skip to content

[Bug]: Prompt caching is not enabled for custom Anthropic-compatible providers using api_mode: anthropic_messages #8294

@x-hansong

Description

@x-hansong

Bug Description

Prompt caching is not auto-enabled for third-party Anthropic-compatible providers that use api_mode: anthropic_messages, even when the endpoint supports Anthropic cache_control and the selected model is Claude.

One concrete example is ZenMux's Anthropic-compatible endpoint:

custom_providers:
  zenmux-anthropic:
    base_url: https://zenmux.ai/api/anthropic
    api_key: ${ZENMUX_ANTHROPIC_API_KEY}
    api_mode: anthropic_messages
    models:
      anthropic/claude-haiku-4.5:
        context_length: 200000
      anthropic/claude-sonnet-4.6:
        context_length: 400000
      anthropic/claude-opus-4.6:
        context_length: 400000

With this provider selected in Hermes, prompt caching is not activated.

Expected Behavior

If the runtime is using api_mode: anthropic_messages and the endpoint is Anthropic-compatible, Hermes should be able to enable Anthropic prompt caching for Claude models, or at least offer a way to opt into it for compatible providers.

In this configuration, Hermes should send Anthropic cache breakpoints / cache_control markers the same way it does for native Anthropic.

Actual Behavior

Hermes does not enable prompt caching for the configuration above.

From the current main branch, the gating logic appears to be provider-name based instead of capability based:

# run_agent.py
is_openrouter = self._is_openrouter_url()
is_claude = "claude" in self.model.lower()
is_native_anthropic = self.api_mode == "anthropic_messages" and self.provider == "anthropic"
self._use_prompt_caching = (is_openrouter and is_claude) or is_native_anthropic

The same pattern appears again in model-switch / fallback re-evaluation paths.

This means any named custom provider using api_mode: anthropic_messages is excluded from prompt caching unless the provider name is literally anthropic.

Steps to Reproduce

  1. Configure a custom provider like the ZenMux example above.
  2. Select a Claude model from that provider.
  3. Start a Hermes session and send a few multi-turn prompts.
  4. Observe that prompt caching is not enabled for this provider path.

Why this seems like a Hermes-side issue

ZenMux documents Anthropic prompt caching support on its Anthropic-compatible endpoint and shows cache_control being sent to https://zenmux.ai/api/anthropic.

So the issue seems less about the endpoint lacking support, and more about Hermes never turning on the cache-marker path for third-party anthropic_messages providers.

Suggested Direction

The enabling condition probably needs to move from provider-name matching to capability matching.

For example, something along these lines:

  • api_mode == anthropic_messages
  • model is Claude
  • endpoint is native Anthropic or a compatible /anthropic endpoint, or the provider explicitly declares prompt-cache support

Even an explicit provider-level opt-in would solve this cleanly.

Version / Context

Observed on current Hermes configuration with a custom provider using api_mode: anthropic_messages.

If helpful, I can open a PR for the gating change after the expected behavior is agreed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions