Bug: 5.28 transport refactor regressed prompt caching for Anthropic and OpenAI-compatible providers #89386

@Enominera

Bug type

Regression (worked in 5.18, broken in 5.28)

Summary

The 5.28 transport layer refactor (src/llm/providers/) dropped two cache optimizations that were present in 5.18, lowering the prompt cache hit rate for both Anthropic-compatible and OpenAI-compatible (Completions) providers. A third, related regression leaves the cache boundary marker in the system prompt sent by the OpenAI Completions transport.

Evidence

  • 5.18: Anthropic-compatible provider ~95% cache hit rate; OpenAI-compatible provider ~95% cache hit rate
  • 5.28: Both dropped to ~75-80% under similar usage patterns (interactive sessions with tool calls)

Root Causes

1. Anthropic: CACHE_BOUNDARY splitting lost

5.18 (anthropic-transport-stream.ts) called applyAnthropicPayloadPolicyToParams which used splitSystemPromptCacheBoundary to split the system prompt into two content blocks:

  • Stable prefix block with cache_control: { type: "ephemeral" } → cached by Anthropic
  • Dynamic suffix block without cache_control → not cached, changes don't invalidate cache

5.28 (anthropic-BND0seXX.js, buildParams at ~L588) no longer calls applyAnthropicPayloadPolicyToParams. The entire system prompt is placed in a single content block with one cache_control marker. When the dynamic content after <!-- OPENCLAW_CACHE_BOUNDARY --> changes between turns (e.g., the Runtime section with activeProcessSessions, group chat context, messaging hints), the text of that single cached block changes, so the cached prefix no longer matches and all cached tokens are invalidated.
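The two-block layout described above can be sketched roughly as follows. This is a minimal illustration, not the actual 5.18 code: buildSystemBlocks is a hypothetical helper (the real splitSystemPromptCacheBoundary signature is not shown in this issue), and trimming whitespace around the marker is an assumption. Only the marker string and the cache_control: { type: "ephemeral" } block shape come from the report.

```typescript
// Marker string as quoted in this issue.
const CACHE_BOUNDARY = "<!-- OPENCLAW_CACHE_BOUNDARY -->";

// Shape of a text block in the Anthropic Messages API `system` array.
interface SystemTextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

// Hypothetical helper, NOT the real splitSystemPromptCacheBoundary.
function buildSystemBlocks(systemPrompt: string): SystemTextBlock[] {
  const index = systemPrompt.indexOf(CACHE_BOUNDARY);
  if (index === -1) {
    // No marker: fall back to one cached block (the 5.28 behavior).
    return [{ type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }];
  }

  // Assumption: whitespace adjacent to the marker is not significant.
  const stablePrefix = systemPrompt.slice(0, index).trimEnd();
  const dynamicSuffix = systemPrompt.slice(index + CACHE_BOUNDARY.length).trimStart();

  const blocks: SystemTextBlock[] = [];
  if (stablePrefix.length > 0) {
    // Stable prefix carries the cache marker, so it is cached.
    blocks.push({ type: "text", text: stablePrefix, cache_control: { type: "ephemeral" } });
  }
  if (dynamicSuffix.length > 0) {
    // Dynamic suffix has no cache_control, so changes here leave the prefix cache intact.
    blocks.push({ type: "text", text: dynamicSuffix });
  }
  return blocks;
}
```

Empty segments are skipped so the sketch never emits an empty text block.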

2. OpenAI Completions: prompt_cache_key condition narrowed

5.18 condition (openai-transport-stream.ts):

compat.supportsPromptCacheKey && cacheRetention !== "none" && options?.sessionId

5.28 condition (openai-completions-ghIO5MDN.js buildParams ~L346):

model.baseUrl.includes("api.openai.com") && cacheRetention !== "none" 
  || cacheRetention === "long" && compat.supportsLongCacheRetention

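Non-OpenAI providers using the openai-completions API (e.g., DeepSeek, OpenRouter) have a baseUrl that does not contain "api.openai.com", so prompt_cache_key is not sent unless cacheRetention is "long" and compat.supportsLongCacheRetention is set. The 5.28 condition also no longer checks compat.supportsPromptCacheKey or options?.sessionId. These providers rely on prompt_cache_key for prefix-based prompt caching. Without it, every request with a dynamic system prompt suffix misses the cache.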
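To make the difference between the two gates concrete, here is a small sketch that evaluates both conditions as quoted above against a third-party provider. The GateInput type, the function names, the "short" retention value, and the example baseUrl are assumptions for illustration; only the boolean expressions themselves are taken from the report.

```typescript
// Assumption: "short" stands in for any retention value other than "none" or "long".
type CacheRetention = "none" | "short" | "long";

// Hypothetical input shape that bundles the values both conditions read.
interface GateInput {
  baseUrl: string;
  cacheRetention: CacheRetention;
  sessionId?: string;
  compat: { supportsPromptCacheKey: boolean; supportsLongCacheRetention: boolean };
}

// 5.18 condition as quoted: capability flag, retention, and session id.
function gate518(input: GateInput): boolean {
  return Boolean(
    input.compat.supportsPromptCacheKey &&
      input.cacheRetention !== "none" &&
      input.sessionId,
  );
}

// 5.28 condition as quoted; `&&` binds tighter than `||`, so the grouping is explicit here.
function gate528(input: GateInput): boolean {
  return (
    (input.baseUrl.includes("api.openai.com") && input.cacheRetention !== "none") ||
    (input.cacheRetention === "long" && input.compat.supportsLongCacheRetention)
  );
}

// Illustrative third-party provider that supports prompt_cache_key.
const thirdParty: GateInput = {
  baseUrl: "https://api.deepseek.com",
  cacheRetention: "short",
  sessionId: "session-1",
  compat: { supportsPromptCacheKey: true, supportsLongCacheRetention: false },
};

console.log(gate518(thirdParty), gate528(thirdParty)); // prints: true false
```

With these inputs the 5.18 gate would send prompt_cache_key and the 5.28 gate would not, which matches the regression described here.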

3. stripSystemPromptCacheBoundary not called

5.28's OpenAI completions transport (openai-completions-ghIO5MDN.js) does not call stripSystemPromptCacheBoundary on the system prompt before sending. The <!-- OPENCLAW_CACHE_BOUNDARY --> marker text is sent to the API as part of the system message, which is a behavior change from 5.18.
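A minimal sketch of the missing step, assuming that stripping means removing the marker text and nothing else. stripCacheBoundary is a hypothetical stand-in; how the real stripSystemPromptCacheBoundary treats the whitespace around the marker is not stated in this issue.

```typescript
// Marker string as quoted in this issue.
const BOUNDARY_MARKER = "<!-- OPENCLAW_CACHE_BOUNDARY -->";

// Hypothetical stand-in for stripSystemPromptCacheBoundary:
// removes every occurrence of the marker and leaves all other text untouched.
function stripCacheBoundary(systemPrompt: string): string {
  return systemPrompt.split(BOUNDARY_MARKER).join("");
}

// The marker is internal bookkeeping, so it should not reach the provider.
const cleaned = stripCacheBoundary(`Stable rules.\n${BOUNDARY_MARKER}\nRuntime: 2 sessions.`);
console.log(cleaned.includes(BOUNDARY_MARKER)); // prints: false
```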

Steps to reproduce

  1. Configure any Anthropic-compatible provider (MiMo, Claude via proxy, etc.)
  2. Start a session and make multiple turns with tool calls (exec, web_search, etc.)
  3. Observe that prompt cache hit rate is significantly lower than on 5.18

The same steps apply to any OpenAI-compatible provider other than OpenAI itself that uses the Completions API.
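For step 3, one possible way to quantify the hit rate on an Anthropic-compatible provider is to aggregate the usage fields returned with each response. This is a sketch under assumptions: the report does not say how its percentages were measured, so the definition below (cache-read tokens divided by all input tokens) is only one reasonable choice, and cacheHitRate is a hypothetical helper. The field names follow the Anthropic Messages API usage object.

```typescript
// Usage fields as returned by the Anthropic Messages API.
// input_tokens counts only the uncached portion of the input.
interface AnthropicUsage {
  input_tokens: number;
  cache_creation_input_tokens?: number | null;
  cache_read_input_tokens?: number | null;
}

// Hypothetical helper: fraction of all input tokens that were served from cache.
function cacheHitRate(turns: AnthropicUsage[]): number {
  let readTokens = 0;
  let totalTokens = 0;
  for (const usage of turns) {
    const read = usage.cache_read_input_tokens ?? 0;
    const created = usage.cache_creation_input_tokens ?? 0;
    readTokens += read;
    totalTokens += usage.input_tokens + read + created;
  }
  // Avoid dividing by zero when no turns were recorded.
  return totalTokens === 0 ? 0 : readTokens / totalTokens;
}
```

Running the same session on 5.18 and 5.28 and comparing the two values gives a like-for-like comparison.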

Expected behavior

CACHE_BOUNDARY should be respected by the transport layer:

  • Anthropic: system prompt split into cached stable prefix + uncached dynamic suffix
  • OpenAI Completions: prompt_cache_key forwarded based on compat.supportsPromptCacheKey, not hardcoded URL check

Suggested fix

Fix 1: In anthropic-BND0seXX.js buildParams, apply splitSystemPromptCacheBoundary to the system prompt to create two content blocks (or restore the applyAnthropicPayloadPolicyToParams call).

Fix 2: In openai-completions-ghIO5MDN.js buildParams, change the prompt_cache_key gate from model.baseUrl.includes("api.openai.com") back to compat.supportsPromptCacheKey.

Fix 3: Call stripSystemPromptCacheBoundary on the system prompt in the OpenAI completions transport.

OpenClaw version

2026.5.28 (commit e932160)

Operating system

macOS

Install method

npm global

Labels

  • P2: Normal backlog priority with limited blast radius.
  • clawsweeper:needs-live-repro: ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.
  • impact:other: This issue has meaningful maintainer-visible impact outside the owned taxonomy.
  • issue-rating: 🐚 platinum hermit: Good issue quality with a plausible reproduction path needing some confirmation.
