
fix(agents): honor OpenAI-compatible cache retention#87274

Merged
steipete merged 5 commits into main from maint/pr-82973-cache-retention
May 27, 2026
@steipete

Contributor

Summary

  • Carries over #82973 ("fix(agents): honor explicit cacheRetention for openai-completions providers"), the cache-retention fix for OpenAI-compatible completions providers.
  • Adds regression coverage showing that explicit cacheRetention passes through only when compat.supportsPromptCacheKey is set, plus the negative path when it is not.
  • Updates prompt-caching docs for OpenAI-compatible completions providers that opt into prompt_cache_key / prompt_cache_retention.

Fixes #81281.
Supersedes #82973 because GitHub stopped syncing the updated fork branch after maintainer fixups; the fork ref is at the fixed SHA, but the PR ref stayed pinned to the prior red SHA.

Verification

  • pnpm test src/agents/pi-embedded-runner/prompt-cache-retention.test.ts src/agents/pi-embedded-runner-extraparams.test.ts src/agents/openai-transport-stream.test.ts -- --reporter=verbose
  • pnpm check:test-types
  • pnpm check:docs
  • git diff --check
  • /Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode local

Real behavior proof

Behavior addressed: OpenAI-compatible completions providers with compat.supportsPromptCacheKey: true now keep explicit cacheRetention: "long" through extra params and emit prompt_cache_retention: "24h" with prompt_cache_key.
Real environment tested: local OpenClaw source checkout on Node/pnpm plus GitHub CI on the original PR SHA before the fork PR ref stopped syncing.
Exact steps or command run after this patch: focused Vitest command above, pnpm check:test-types, pnpm check:docs, git diff --check, and local autoreview.
Evidence after fix: focused Vitest passed 327 tests; check:test-types passed; docs formatting, markdownlint, MDX, i18n glossary, and link audit passed with broken_links=0; autoreview reported no actionable findings.
Observed result after fix: regression tests prove explicit cache retention reaches prompt-cache-key OpenAI-compatible completions providers and stays suppressed for providers without that compat flag.
What was not tested: live oMLX/llama.cpp backend payload trace.

Co-authored-by: lonexreb <reach2shubhankar@gmail.com>

lonexreb and others added 5 commits May 27, 2026 12:54
Wrapper `resolveCacheRetention` returned undefined for non-anthropic-family,
non-google providers, silently dropping the user's explicit `cacheRetention`
value before it reached the openai-completions transport. This affected
prefix-caching backends (oMLX, llama.cpp, etc.) that opt in via
`compat.supportsPromptCacheKey: true`.

Now honors any explicit "none" | "short" | "long" regardless of family while
keeping legacy `cacheControlTtl` aliasing scoped to anthropic/google families
(which is where it was previously meaningful).

Fixes #81281.
…tions payload

Codex P2 follow-up on PR #82973 / issue #81281. buildOpenAICompletionsParams
previously only set prompt_cache_key when supportsPromptCacheKey was true and
silently dropped the resolved cacheRetention=long preference. The wire payload
reaching oMLX, llama.cpp, and other OpenAI-compatible completions backends
therefore never carried the canonical 24h prompt_cache_retention value, even
though the resolver returned long.

Forward prompt_cache_retention=24h alongside prompt_cache_key whenever the
caller opts into long retention on a compat-flagged backend. Short and unset
retention continue to omit the field.
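
The payload rule in this commit message can be sketched as follows. This is a minimal illustration, not the actual `buildOpenAICompletionsParams` code: the function name `buildCacheFields`, its input shape, and the `cacheKey` parameter are hypothetical, and omitting both fields for `"none"` is an assumption the commit message does not confirm.

```typescript
// Hypothetical sketch of the cache fields on an OpenAI-compatible completions payload.
type CacheRetention = "none" | "short" | "long";

interface CacheFieldInput {
  supportsPromptCacheKey: boolean; // mirrors compat.supportsPromptCacheKey === true
  cacheKey?: string;               // assumed source of prompt_cache_key (e.g. a session id)
  cacheRetention?: CacheRetention; // value returned by the resolver
}

interface CacheFields {
  prompt_cache_key?: string;
  prompt_cache_retention?: "24h";
}

function buildCacheFields(input: CacheFieldInput): CacheFields {
  const fields: CacheFields = {};
  // Backends without the compat flag never receive cache fields.
  if (!input.supportsPromptCacheKey || input.cacheKey === undefined) {
    return fields;
  }
  // Assumption: "none" disables caching entirely, so nothing is emitted.
  if (input.cacheRetention === "none") {
    return fields;
  }
  fields.prompt_cache_key = input.cacheKey;
  // Only long retention maps to the canonical 24h wire value;
  // short and unset retention omit the field.
  if (input.cacheRetention === "long") {
    fields.prompt_cache_retention = "24h";
  }
  return fields;
}
```

With these assumptions, a flagged backend with `cacheRetention: "long"` receives both `prompt_cache_key` and `prompt_cache_retention: "24h"`, while `"short"` or unset retention yields only `prompt_cache_key`.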
…or non-family providers

Issue #82974 / regression in the #81281 fix: the original removed the
'!family && !googleEligible' early-return entirely so openai-completions
backends with compat.supportsPromptCacheKey: true (oMLX, llama.cpp)
could pass user-set cacheRetention through to the transport. But that
also let providers proxying non-cacheable models via openai-completions
(amazon-bedrock + amazon.* nova) leak the explicit value into payloads
the backend cannot honor.

Restore the family/google gate but extend it with a new
supportsPromptCacheKey parameter. Callers in extra-params.ts read the
flag from the model's compat.supportsPromptCacheKey === true. Without
that flag, non-family/non-google providers still drop explicit
cacheRetention as before — preserving the new regression test on main
('does not treat non-Anthropic Bedrock models as cache-retention
eligible') while keeping the #81281 fix in place for actual
prefix-caching backends.

Also drop two unnecessary 'as Record<string, unknown>' casts in
openai-transport-stream.test.ts that oxlint flagged as redundant in
typescript-eslint/no-unnecessary-type-assertion.
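
The restored gate described in the commit message above might look roughly like this. It is a sketch under stated assumptions, not the real `resolveCacheRetention`: the name `resolveCacheRetentionSketch`, the input shape, and the `isCacheFamily` field are hypothetical, and the legacy `cacheControlTtl` aliasing and any provider defaults are left out.

```typescript
// Hypothetical sketch of the family/google gate extended with supportsPromptCacheKey.
type CacheRetention = "none" | "short" | "long";

interface ResolveInput {
  explicit?: unknown;              // the user's cacheRetention extra param, if any
  isCacheFamily: boolean;          // assumed stand-in for the anthropic-family check
  googleEligible: boolean;
  supportsPromptCacheKey: boolean; // model compat.supportsPromptCacheKey === true
}

function isCacheRetention(value: unknown): value is CacheRetention {
  return value === "none" || value === "short" || value === "long";
}

function resolveCacheRetentionSketch(input: ResolveInput): CacheRetention | undefined {
  // Providers outside the cache families that lack the compat flag still drop the
  // explicit value (for example, non-Anthropic Bedrock models).
  if (!input.isCacheFamily && !input.googleEligible && !input.supportsPromptCacheKey) {
    return undefined;
  }
  const value = input.explicit;
  // Only the three known values are honored; anything else resolves to undefined here.
  return isCacheRetention(value) ? value : undefined;
}
```

Under this sketch, an oMLX or llama.cpp model configured with the compat flag keeps `cacheRetention: "long"`, while the same explicit value on an unflagged non-family provider resolves to `undefined`.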
openclaw-barnacle Bot added the size: M and maintainer (Maintainer-authored PR) labels on May 27, 2026
@clawsweeper

clawsweeper Bot commented May 27, 2026

Contributor

ClawSweeper status: review started.

I am starting a fresh review of this pull request: fix(agents): honor OpenAI-compatible cache retention. This is item 1/1 in the current shard. Shard 0/1.

This placeholder means the worker is alive and reading the current context. I will edit this same comment with the actual review when the claws are done clicking.

Crustacean status: shell secured, claws on keyboard, evidence pebbles being sorted.

@steipete

Contributor Author

Verification update before merge:

  • Local: pnpm test src/agents/pi-embedded-runner/prompt-cache-retention.test.ts src/agents/pi-embedded-runner-extraparams.test.ts src/agents/openai-transport-stream.test.ts -- --reporter=verbose passed 327 focused tests.
  • Local: pnpm check:test-types passed.
  • Local: pnpm check:docs passed with broken_links=0.
  • Local: git diff --check passed.
  • Local autoreview: /Users/steipete/Projects/agent-scripts/skills/autoreview/scripts/autoreview --mode local reported no actionable findings.
  • CI: 26509714116 passed after rerunning checkout-timeout failures.
  • Dependency awareness: 26509712784 passed after GitHub's temporary diff 500 cleared.
  • Other relevant checks: CodeQL Critical Quality 26509714153, Workflow Sanity 26509714117, Sandbox Common Smoke 26509714155, Website Installer Sync 26509714054, OpenGrep PR Diff 26509714303, CodeQL 26509714051, and Blacksmith Build Artifacts Testbox 26509714156 passed.

Known gap: no live oMLX/llama.cpp backend payload trace; covered by transport/extra-param regression tests and docs proof.

steipete merged commit 3e351b7 into main on May 27, 2026
192 of 248 checks passed
steipete deleted the maint/pr-82973-cache-retention branch on May 27, 2026 12:21
@felipebridge

Really solid compatibility fix. I especially like that this doesn’t just blindly pass through cacheRetention, but properly gates the behavior behind compat.supportsPromptCacheKey. That keeps the transport layer clean and avoids leaking provider-specific semantics into backends that don’t actually support them.

The regression coverage is also very well thought out — testing both the positive and negative paths here is huge for long-term stability, especially in multi-provider systems where adapter behavior tends to drift over time.

Also appreciate that the docs were updated alongside the runtime behavior. Keeping compatibility docs aligned with actual transport semantics saves a surprising amount of future debugging pain.

Overall this feels like a very clean restoration of expected behavior rather than a narrow patch.

Labels

maintainer Maintainer-authored PR size: M

Development

Successfully merging this pull request may close these issues.

[Bug]: OpenAI-completions prompt_cache_key regression — caching worked in 2026.3.x, broken in 2026.5.x

3 participants