[Feature]: Add supportsPromptCacheKey to Mistral transport compat patch #83709

@Net-Sentinel

Summary

Mistral's API supports prompt caching via a prompt_cache_key request field, and cached tokens are billed at 10% of the standard input price. However, OpenClaw's Mistral provider compat layer (MISTRAL_MODEL_TRANSPORT_PATCH) does not set supportsPromptCacheKey: true, so the transport layer never injects prompt_cache_key into Mistral requests and no caching occurs.

Current behaviour

MISTRAL_MODEL_TRANSPORT_PATCH in extensions/mistral currently contains:

const MISTRAL_MODEL_TRANSPORT_PATCH = {
  supportsStore: false,
  maxTokensField: "max_tokens"
};

The transport layer in openai-transport-stream only injects prompt_cache_key when compat.supportsPromptCacheKey is set, cacheRetention is not "none", and a session ID is available:

if (compat.supportsPromptCacheKey && cacheRetention !== "none" && options?.sessionId)
  params.prompt_cache_key = options.sessionId;

Because supportsPromptCacheKey is absent from the Mistral patch, this condition is never satisfied. Mistral requests are sent without prompt_cache_key, and usage.prompt_tokens_details.cached_tokens is always 0.
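
To make the gate's effect concrete, here is a minimal, self-contained sketch that reuses the condition quoted above. It is not OpenClaw's actual code: the TransportCompat, StreamOptions, and CacheParams types and the buildCacheParams helper are hypothetical, and "short" is only a placeholder for any retention value other than "none".

```typescript
// Hypothetical types: only the fields quoted in this issue are modelled.
interface TransportCompat {
  supportsStore?: boolean;
  maxTokensField?: string;
  supportsPromptCacheKey?: boolean;
}

interface StreamOptions {
  sessionId?: string;
}

interface CacheParams {
  prompt_cache_key?: string;
}

// Hypothetical helper that applies the same condition as the quoted gate.
function buildCacheParams(
  compat: TransportCompat,
  cacheRetention: string,
  options?: StreamOptions,
): CacheParams {
  const params: CacheParams = {};
  if (compat.supportsPromptCacheKey && cacheRetention !== "none" && options?.sessionId) {
    params.prompt_cache_key = options.sessionId;
  }
  return params;
}

// The patch as it exists today, and the same patch with the proposed flag.
const currentPatch: TransportCompat = { supportsStore: false, maxTokensField: "max_tokens" };
const patchedCompat: TransportCompat = { ...currentPatch, supportsPromptCacheKey: true };

const opts: StreamOptions = { sessionId: "session-123" };

// "short" is a placeholder; the issue only confirms that "none" disables caching.
const withoutFlag = buildCacheParams(currentPatch, "short", opts);
const withFlag = buildCacheParams(patchedCompat, "short", opts);
const retentionNone = buildCacheParams(patchedCompat, "none", opts);

console.log(withoutFlag, withFlag, retentionNone);
```

With the current patch the returned object has no prompt_cache_key; with the flag set it carries the session ID, unless retention is "none".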

This affects all Mistral models routed through the api.mistral.ai endpoint: mistral-medium-latest, mistral-large-latest, mistral-small-latest, etc.

Expected behaviour

When a user sets cacheRetention on a Mistral model (or when a non-none retention default applies), OpenClaw should pass prompt_cache_key with the session ID on Mistral chat completion requests, matching the behaviour already implemented for other providers.

Proposed fix

Add supportsPromptCacheKey: true to MISTRAL_MODEL_TRANSPORT_PATCH:

const MISTRAL_MODEL_TRANSPORT_PATCH = {
  supportsStore: false,
  maxTokensField: "max_tokens",
  supportsPromptCacheKey: true   // Mistral supports prompt_cache_key; cached tokens billed at 10% of input price
};

Mistral's caching docs confirm the field is supported and the billing model: https://docs.mistral.ai/studio-api/conversations/advanced/prompt-caching

Key implementation details from their docs:

  • prompt_cache_key is a top-level field on the chat completion request body
  • Cache blocks are 64 tokens minimum
  • Cached tokens reported in usage.prompt_tokens_details.cached_tokens
  • Cache hits are not guaranteed — they're best-effort on a shared prefix
  • Cached tokens billed at 10% of standard input price
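
As a rough sketch of how the cached_tokens counter listed above could be checked when testing the fix, the helper below splits a usage object into cached and uncached prompt tokens. The ChatUsage shape and the cachedTokenShare name are hypothetical. The sketch assumes the usage object also carries a prompt_tokens total and that cached tokens are counted as part of that total.

```typescript
// Hypothetical shape: only prompt_tokens_details.cached_tokens is named in this issue.
// prompt_tokens is assumed to be the total prompt size, including cached tokens.
interface ChatUsage {
  prompt_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

interface CacheShare {
  cached: number;
  uncached: number;
  hitRatio: number;
}

// Hypothetical helper: splits prompt tokens into cached and uncached parts.
function cachedTokenShare(usage: ChatUsage): CacheShare {
  // A missing details object is treated as zero cached tokens.
  const cached = usage.prompt_tokens_details?.cached_tokens ?? 0;
  const uncached = Math.max(usage.prompt_tokens - cached, 0);
  const hitRatio = usage.prompt_tokens > 0 ? cached / usage.prompt_tokens : 0;
  return { cached, uncached, hitRatio };
}

// Example values are made up for illustration, not measured output.
const missShare = cachedTokenShare({ prompt_tokens: 1000 });
const hitShare = cachedTokenShare({
  prompt_tokens: 1000,
  prompt_tokens_details: { cached_tokens: 640 },
});

console.log(missShare, hitShare);
```

Since cache hits are best-effort, a ratio of 0 on a single request does not prove the key is missing; a ratio that stays at 0 across many turns of the same session is the stronger signal.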

Impact

Agent workloads that resend large, stable context (system prompts, workspace files, conversation history) on every turn, which is the standard OpenClaw heartbeat pattern, stand to save up to 90% on the repeated input tokens. For example, a 1,000-token system prompt resent across 50 turns per day is 50,000 input tokens; at mistral-medium-latest prices ($0.40/M) that costs ~$0.02/day uncached vs ~$0.002/day if every token is served from cache. At scale across multiple agents this adds up.
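
The arithmetic behind these figures can be reproduced with a short sketch. The constant and function names are hypothetical; the price and the 10% cached rate are the ones quoted in this issue. The cachedFraction parameter is an added assumption for modelling partial hits, since the first turn of a session is a miss and later hits are best-effort.

```typescript
// Figures quoted in this issue; names are hypothetical.
const INPUT_PRICE_USD_PER_MILLION = 0.4; // mistral-medium-latest input price
const CACHED_PRICE_FACTOR = 0.1; // cached tokens billed at 10% of input price

// Daily input cost for a prompt resent on every turn.
// cachedFraction is the share of resent tokens served from cache (0 to 1).
function dailyInputCostUsd(
  promptTokens: number,
  turnsPerDay: number,
  cachedFraction: number,
): number {
  const totalTokens = promptTokens * turnsPerDay;
  const cachedTokens = totalTokens * cachedFraction;
  const uncachedTokens = totalTokens - cachedTokens;
  const billableTokens = uncachedTokens + cachedTokens * CACHED_PRICE_FACTOR;
  return (billableTokens * INPUT_PRICE_USD_PER_MILLION) / 1e6;
}

const uncachedCost = dailyInputCostUsd(1000, 50, 0); // about $0.02/day
const fullyCachedCost = dailyInputCostUsd(1000, 50, 1); // about $0.002/day

console.log(uncachedCost.toFixed(4), fullyCachedCost.toFixed(4));
```

The fully cached figure is a lower bound; real savings depend on how many turns actually hit the cache.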

Verification

Confirmed by inspecting the installed dist on v2026.5.12:

  • MISTRAL_MODEL_TRANSPORT_PATCH in dist/api-CgjdAt3h.js — no supportsPromptCacheKey
  • Transport gate in dist/openai-transport-stream-BWwvx0MZ.js — confirmed gated on compat.supportsPromptCacheKey === true
  • Agent using mistral-medium-2508 as primary — cached_tokens consistently 0 in usage

Happy to test the fix if a build is available.

Metadata

Assignees

No one assigned

    Labels

    • P2: Normal backlog priority with limited blast radius.
    • clawsweeper:fix-shape-clear: ClawSweeper found a clear likely implementation shape for this issue.
    • clawsweeper:queueable-fix: ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
    • clawsweeper:source-repro: ClawSweeper found a high-confidence source-level issue reproduction.
    • issue-rating: 🦞 diamond lobster: Very strong issue quality with high-confidence source-level or clear reproduction.
