fix(cache): honour explicit cacheRetention for OpenRouter→Anthropic models#79370
fix(cache): honour explicit cacheRetention for OpenRouter→Anthropic models#79370mene-crab wants to merge 2 commits into
Conversation
|
Codex review: needs real behavior proof before merge. Reviewed June 1, 2026, 1:10 AM ET / 05:10 UTC. Summary PR surface: Source +63, Tests +207. Total +270 across 7 files. Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path. Review metrics: none identified. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Risk before merge
Maintainer options:
Next step before merge
Review detailsBest possible solution: Retry the Codex review after fixing the execution failure. Do we have a high-confidence way to reproduce the issue? Unclear. The review failed before ClawSweeper could establish a reproduction path. Is this the best way to solve the issue? Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction. AGENTS.md: unclear because the file could not be read completely. Codex review notes: model gpt-5.5, reasoning high; reviewed against c0195f7ed579. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +63, Tests +207. Total +270 across 7 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
Real Behavior ProofReproductionSetup: OpenClaw v2026.5.4, OpenRouter→Anthropic model with Actual model config (from openclaw.json): "openrouter/anthropic/claude-haiku-4.5": {
"alias": "Haiku",
"params": {
"cacheRetention": "long"
}
}The Before (unpatched v2026.5.4)Captured request payload via HTTPS proxy dump (OpenRouter The pi-ai SDK places cache_control on the last user message and last tool definition. OpenClaw's After (patched, same config)All 3 markers now include Why the bug happens (trace)
After patch, step 1 returns Cost impactWith
|
4f0ab92 to
5b5445d
Compare
5b5445d to
3f25c10
Compare
…odels
cacheRetention: "long" was silently ignored for models routed through
OpenRouter, producing only 5-minute ephemeral cache markers instead of
the expected 1-hour TTL. cacheRetention: "none" was also ignored.
Two issues in the chain:
1. resolveCacheRetention() bailed out early when the provider didn't
match a known Anthropic/Google cache family, even when the operator
had explicitly configured cacheRetention in model params. The
explicit-config checks now run before the family-based early return,
but only for eligible providers (Anthropic family, Google, or
verified OpenRouter→Anthropic routes where provider is "openrouter"
and the model ref starts with "anthropic/").
2. createOpenRouterSystemCacheWrapper() now accepts optional extraParams
and resolves cacheRetention internally after confirming the request
targets a verified OpenRouter→Anthropic route (using the same
endpoint-class + model-ref boundary as marker insertion). This
ensures custom provider ids pointing to openrouter.ai are covered.
3. applyAnthropicEphemeralCacheControlMarkers() always hardcoded
{ type: "ephemeral" }, discarding any TTL the wrapper resolved.
It now accepts an optional cacheControl parameter (defaults to
{ type: "ephemeral" } for backward compat) and a
skipMarkerInsertion flag for the "none" case.
When cacheRetention is "none", no new cache_control markers are
inserted on system/developer messages, but the thinking-block
sanitizer (stripping stale cache_control from thinking/redacted_thinking
blocks) still runs.
Before: OpenRouter→Anthropic requests always got cache_control without
ttl (or with markers despite "none"), regardless of config.
After: cacheRetention: "long" → { type: "ephemeral", ttl: "1h" };
cacheRetention: "none" → no new markers, sanitizer still active.
Address review finding: resolveCacheRetention() previously checked only provider === "openrouter", missing the baseUrl/endpoint-class gate used by createOpenRouterSystemCacheWrapper. If a user repointed the OpenRouter provider at a custom OpenAI-compatible proxy, the system wrapper would correctly skip cache markers, but resolveCacheRetention would still pass cacheRetention through to pi-ai, which then emitted cache_control with ttl for that proxy path. Now resolveCacheRetention accepts an optional baseUrl parameter and uses resolveProviderRequestPolicy to determine endpointClass, mirroring the same gate as the system wrapper. Added regression tests for: - OpenRouter provider on non-OpenRouter baseUrl → undefined - OpenRouter provider on default (unset) baseUrl → honoured - OpenRouter provider with openrouter.ai baseUrl → honoured
|
This pull request has been automatically marked as stale due to inactivity. |
Problem
When
cacheRetention: "long"is explicitly configured for Anthropic models routed through OpenRouter, the resulting requests carrycache_control: { type: "ephemeral" }withoutttl: "1h". This means only 5-minute prompt cache lifetime instead of the expected 1 hour — a significant cost regression for heavy users. Additionally,cacheRetention: "none"was also silently ignored — the OpenRouter wrapper always injected ephemeral cache markers regardless.The same
cacheRetention: "long"config works correctly when using the direct Anthropic API.Root Cause
Three bugs in sequence:
1.
resolveCacheRetention()— early return ignores explicit configsrc/agents/pi-embedded-runner/prompt-cache-retention.tsThe function checked
resolveAnthropicCacheRetentionFamily()first. Forprovider: "openrouter", this returnsundefined(not a recognized Anthropic/Google cache family). The subsequent early return:exited before ever checking
extraParams.cacheRetention. Even withcacheRetention: "long"explicitly set in config, the function never reached that check.Fix: Move explicit-config checks (
cacheRetention/ legacycacheControlTtl) before the family-based early return, so operator-specified retention always wins.2.
applyAnthropicEphemeralCacheControlMarkers()— hardcoded no-TTL markerssrc/agents/anthropic-payload-policy.tsCalled by
createOpenRouterSystemCacheWrapper, this function always wrote{ type: "ephemeral" }— even if the wrapper had resolved acacheControlwithttl: "1h", the TTL was discarded.Fix: Accept an optional
cacheControl: AnthropicEphemeralCacheControlparameter (defaults to{ type: "ephemeral" }for backward compat). Also exportresolveAnthropicEphemeralCacheControl()so the wrapper can build the proper marker with TTL.3.
cacheRetention: "none"still injected markerssrc/agents/pi-embedded-runner/proxy-stream-wrappers.tsWhen
cacheRetentionis"none",resolveAnthropicEphemeralCacheControlcorrectly returnsundefined, but the wrapper'scacheControl ?? { type: "ephemeral" }fallback still injected a short-cache marker — contradicting the operator's explicit opt-out.Fix: Skip marker insertion entirely when
cacheRetention === "none".Before / After
Before (OpenRouter→Anthropic,
cacheRetention: "long"):After (same config):
After (
cacheRetention: "none"): nocache_controlmarkers at all.Review Fix: endpoint-class gate in resolveCacheRetention
Initial patch checked
provider === "openrouter"inresolveCacheRetention(), but this did not account forbaseUrl. If a user repoints the OpenRouter provider at a custom OpenAI-compatible proxy, the system wrapper correctly skips markers via the endpoint-class gate, butresolveCacheRetention()would still passcacheRetentionthrough to pi-ai — leakingcache_controlwith TTL to that proxy path.Fix:
resolveCacheRetention()now accepts an optionalbaseUrlparameter and usesresolveProviderRequestPolicyto determineendpointClass, mirroring the same gate ascreateOpenRouterSystemCacheWrapper. Both call sites inextra-params.tspassmodel.baseUrl/callModel.baseUrl.Regression tests added:
undefined(no leak)openrouter.aibaseUrl → honouredReal Behavior Proof
Behavior or issue addressed:
cacheRetention: "long"config was silently ignored for OpenRouter→Anthropic models. Requests always got 5-minute ephemeral cache markers instead of 1-hour TTL.cacheRetention: "none"was also ignored — markers were always injected.Real environment tested: Self-hosted OpenClaw v2026.5.7 on VMware VM (Linux Mint), OpenRouter provider, model
openrouter/anthropic/claude-haiku-4.5withparams.cacheRetention: "long". Outbound request payloads captured via HTTPS logging proxy.Exact steps or command run after the patch:
v2026.5.7tagdist/in production install and restarted gatewayEvidence after fix:
Observed result after the fix: All 3
cache_controlmarkers in the Anthropic-format payload now include"ttl": "1h". Before the patch (same config, unpatched v2026.5.7), all 3 markers had only{ "type": "ephemeral" }without ttl. The fix restores the expected 1-hour prompt cache lifetime on the Anthropic backend, matching the direct Anthropic API behavior. WhencacheRetention: "none"is set, no cache markers are injected.What was not tested: Cache hit/miss ratio on the Anthropic backend side (only verified outbound
cache_controlmarkers). Did not test withprovider: "anthropic"direct API (already working). Did not testcacheRetention: "none"end-to-end (unit test coverage only).Changes
prompt-cache-retention.tsbaseUrlparam; useresolveProviderRequestPolicyfor endpoint-class gateanthropic-payload-policy.tsresolveAnthropicEphemeralCacheControl; addcacheControlparam to marker functionproxy-stream-wrappers.tsextraParams; resolve + passcacheControlwith TTL; skip markers when"none"extra-params.tseffectiveExtraParamsto wrapper; passbaseUrltoresolveCacheRetentionTests
prompt-cache-retention.test.tslong/short/nonefor OpenRouter Anthropic;undefinedwithout config; endpoint-class baseUrl gating (3 tests)anthropic-cache-control-payload.test.tsttlin customcacheControlproxy-stream-wrappers.test.tsttl: "1h"withlong; no ttl withshort/default; no markers with"none"; custom provider pointing to openrouter.aiRelated