fix(agents): honor explicit cacheRetention for openai-completions providers by lonexreb · Pull Request #82973 · openclaw/openclaw

lonexreb · 2026-05-17T06:49:00Z

Summary

resolveCacheRetention in the embedded runner wrapper silently dropped an explicit user-provided cacheRetention for any provider that was neither in the anthropic family nor google-eligible. That meant openai-completions providers with prefix-caching backends (oMLX, llama.cpp, etc.) — the ones documented to opt in via compat.supportsPromptCacheKey: true — never received the user's chosen retention. The wrapper omitted it, so the transport could not propagate prompt_cache_retention: "24h" for long-retention setups, and the explicit user intent was lost on the way to buildOpenAICompletionsParams.

Now any explicit "none" | "short" | "long" is honored regardless of provider family. Legacy cacheControlTtl ("5m"/"1h") aliasing stays scoped to anthropic/google families where it was previously meaningful — that path was already gated by the family check.

Fixes #81281.

Verification

CI=true pnpm test src/agents/pi-embedded-runner/prompt-cache-retention.test.ts — 6 passed (added regression coverage for openai-completions explicit long/short/none and for the undefined fallback that keeps embedded-wrapper behavior unchanged).
Reviewed adjacent src/agents/openai-transport-stream.test.ts suite (165 passing); the single pre-existing flake on the unrelated yields to aborts during bursty Responses streams test reproduces on origin/main without these changes and is not touched by this PR.

Test plan

Maintainer live-trace a Docker/oMLX openai-completions run with compat.supportsPromptCacheKey: true and cacheRetention: "long" to confirm prompt_cache_key and prompt_cache_retention: "24h" reach the outgoing payload end-to-end.

Real behavior proof

Behavior addressed: resolveCacheRetention in src/agents/pi-embedded-runner/prompt-cache-retention.ts silently dropped an explicit cacheRetention: "long" | "short" | "none" whenever the provider was not in the anthropic family and not google-eligible. openai-completions providers that opt into prompt caching via compat.supportsPromptCacheKey: true therefore lost the user's intent before it reached buildOpenAICompletionsParams, so the outgoing payload never carried prompt_cache_retention: "24h" for long retention (issue #81281).

Real environment tested: branch fix/81281-openai-prompt-cache-key at HEAD cb0503e19e checked out against this OpenClaw working copy, plus a verbatim re-implementation of the origin/main resolver (HEAD aef93881af) to drive a before/after comparison through node directly against the on-branch module.

Exact steps or command run after this patch:

$ node node_modules/.bin/tsx demo-81281-compare.mjs

The demo imports the on-branch resolveCacheRetention and a pure-JS port of the origin/main body (preserving the original if (!family && !googleEligible) return undefined; early return that caused the bug), then exercises the openai-completions opt-in scenario plus regression guards.

Evidence after fix: live node stdout against the on-branch source (HEAD cb0503e19e):

== openai-completions provider, user sets cacheRetention=long (issue #81281) ==
origin/main behavior: {}
this branch behavior: {"resolved":"long"}

== regression check: undefined cacheRetention on openai-completions stays undefined ==
this branch: {}

== regression check: anthropic family unchanged ==
this branch: {"resolved":"long"}

== regression check: legacy cacheControlTtl='1h' on openai-completions no longer maps to 'long' ==
origin/main: {}
this branch: {}

({} is JSON.stringify({ resolved: undefined }) — the resolved key is omitted when the function returns undefined. The first block is the actual fix: a user-set cacheRetention: "long" against an openai-completions provider that previously resolved to undefined now resolves to "long", which is the value buildOpenAICompletionsParams needs in order to emit prompt_cache_retention: "24h" downstream.)

Observed result after fix: an openai-completions provider with compat.supportsPromptCacheKey: true and cacheRetention: "long" now receives the resolved "long" from the embedded-runner wrapper instead of undefined. The downstream transport at src/agents/openai-transport-stream.ts:1717 and :3009 already maps that value onto prompt_cache_retention in the outgoing OpenAI-completions payload (see commit cb0503e19e). Anthropic-family and google-eligible callers still receive the same value they did on origin/main; the legacy cacheControlTtl: "1h" shortcut remains gated to those families so callers without compat.supportsPromptCacheKey are unchanged.

What was not tested: end-to-end traffic against a live oMLX or llama.cpp openai-completions backend with prompt caching enabled. The transport-side payload emission is covered by the existing src/agents/openai-transport-stream.test.ts suite (165 passing on this branch). A maintainer-side oMLX/llama.cpp live trace would be the natural next step and is what the Test plan TODO above flags.

clawsweeper · 2026-05-17T06:49:58Z

Codex review: needs real behavior proof before merge. Reviewed May 27, 2026, 7:49 AM ET / 11:49 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +28, Tests +204, Docs 0. Total +232 across 7 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

No close action taken because the review did not complete.

Maintainer options:

Decide the mitigation before merge
Retry the Codex review after fixing the execution failure.
Pause or close
Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge
Review did not complete, so no work-lane recommendation was made.

Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 513a223c15f6.

Label changes

Label changes:

add rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
remove P2: Current review triage priority is none.
remove rating: 🦐 gold shrimp: Current PR rating is rating: 🌊 off-meta tidepool, so this older rating label is no longer current.
remove merge-risk: 🚨 compatibility: Current PR review selected no merge-risk labels.
remove merge-risk: 🚨 auth-provider: Current PR review selected no merge-risk labels.
remove status: 📣 needs proof: Current PR status no longer selects a status label.

Label justifications:

rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.

Evidence reviewed

PR surface:

Source +28, Tests +204, Docs 0. Total +232 across 7 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	3	31	3	+28
Tests	3	204	0	+204
Docs	1	2	2	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	7	237	5	+232

What I checked:

failure reason: timeout.
codex failure detail: Codex review failed for this PR: spawnSync codex ETIMEDOUT.
codex stdout: Per-item Codex failure; continuing with the rest of the shard.

Likely related people:

unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

…tions payload Codex P2 follow-up on PR openclaw#82973 / issue openclaw#81281. buildOpenAICompletionsParams previously only set prompt_cache_key when supportsPromptCacheKey was true and silently dropped the resolved cacheRetention=long preference. The wire payload reaching oMLX, llama.cpp, and other OpenAI-compatible completions backends therefore never carried the canonical 24h prompt_cache_retention value, even though the resolver returned long. Forward prompt_cache_retention=24h alongside prompt_cache_key whenever the caller opts into long retention on a compat-flagged backend. Short and unset retention continue to omit the field.

clawsweeper · 2026-05-27T06:20:58Z

ClawSweeper PR egg

🥚 Incubating: this PR egg is tucked into the review nest.

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

Merged PRs are hatchable.
Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

What is this egg doing here?

Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

Wrapper `resolveCacheRetention` returned undefined for non-anthropic-family, non-google providers, silently dropping the user's explicit `cacheRetention` value before it reached the openai-completions transport. This affected prefix-caching backends (oMLX, llama.cpp, etc.) that opt in via `compat.supportsPromptCacheKey: true`. Now honors any explicit "none" | "short" | "long" regardless of family while keeping legacy `cacheControlTtl` aliasing scoped to anthropic/google families (which is where it was previously meaningful). Fixes openclaw#81281.

…tions payload Codex P2 follow-up on PR openclaw#82973 / issue openclaw#81281. buildOpenAICompletionsParams previously only set prompt_cache_key when supportsPromptCacheKey was true and silently dropped the resolved cacheRetention=long preference. The wire payload reaching oMLX, llama.cpp, and other OpenAI-compatible completions backends therefore never carried the canonical 24h prompt_cache_retention value, even though the resolver returned long. Forward prompt_cache_retention=24h alongside prompt_cache_key whenever the caller opts into long retention on a compat-flagged backend. Short and unset retention continue to omit the field.

…or non-family providers Issue openclaw#82974 / regression in 81281 fix: the original removed the '!family && !googleEligible' early-return entirely so openai-completions backends with compat.supportsPromptCacheKey: true (oMLX, llama.cpp) could pass user-set cacheRetention through to the transport. But that also let providers proxying non-cacheable models via openai-completions (amazon-bedrock + amazon.* nova) leak the explicit value into payloads the backend cannot honor. Restore the family/google gate but extend it with a new supportsPromptCacheKey parameter. Callers in extra-params.ts read the flag from the model's compat.supportsPromptCacheKey === true. Without that flag, non-family/non-google providers still drop explicit cacheRetention as before — preserving the new regression test on main ('does not treat non-Anthropic Bedrock models as cache-retention eligible') while keeping the openclaw#81281 fix in place for actual prefix-caching backends. Also drop two unnecessary 'as Record<string, unknown>' casts in openai-transport-stream.test.ts that oxlint flagged as redundant in typescript-eslint/no-unnecessary-type-assertion.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9cedb4a818

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-27T11:35:54Z

  provider: string,
  modelApi?: string,
  modelId?: string,
+  supportsPromptCacheKey?: boolean,


Pass cache-key eligibility to attempt-level retention

When a cache-key OpenAI-compatible completions model sets cacheRetention: "long", the stream wrapper now resolves and forwards long, but the attempt-level resolver call in src/agents/pi-embedded-runner/run/attempt.ts:3020 still omits this new supportsPromptCacheKey argument. In that scenario effectivePromptCacheRetention remains undefined, so the prompt-cache runtime context and cache observability built from it omit the 24h retention even though the outgoing payload uses it; cache-aware context engines can then make decisions with stale/missing TTL information. Please plumb the model compat flag through the attempt-level call as well.

Useful? React with 👍 / 👎.

Carry over #82973 and fix #81281 by preserving explicit cacheRetention for OpenAI-compatible completions providers that opt into prompt-cache-key support. The change keeps explicit cacheRetention suppressed for OpenAI-compatible providers without compat.supportsPromptCacheKey, adds regression coverage for both paths, and updates prompt-caching docs for prompt_cache_key / prompt_cache_retention behavior. Fixes #81281. Supersedes #82973. Co-authored-by: lonexreb <reach2shubhankar@gmail.com>

steipete · 2026-05-27T12:21:47Z

Thanks again @lonexreb. I carried this over and landed it through #87274 because this fork PR stopped syncing after maintainer fixups: the fork branch advanced, but GitHub kept refs/pull/82973/head pinned to the earlier red SHA.

Landed as 3e351b7 with your original commits credited via co-author. The landed change includes the OpenAI-compatible cache-retention fix, additional regression coverage for the compat.supportsPromptCacheKey and non-opt-in paths, and the prompt-caching docs update.

Future PRs should be able to follow the normal maintainer-fixup path; this one just hit a GitHub PR-ref sync failure after the branch update.

Carry over openclaw#82973 and fix openclaw#81281 by preserving explicit cacheRetention for OpenAI-compatible completions providers that opt into prompt-cache-key support. The change keeps explicit cacheRetention suppressed for OpenAI-compatible providers without compat.supportsPromptCacheKey, adds regression coverage for both paths, and updates prompt-caching docs for prompt_cache_key / prompt_cache_retention behavior. Fixes openclaw#81281. Supersedes openclaw#82973. Co-authored-by: lonexreb <reach2shubhankar@gmail.com>

openclaw-barnacle Bot added agents Agent runtime and tooling size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 17, 2026

clawsweeper Bot added P2 Normal backlog priority with limited blast radius. impact:auth-provider Auth, provider routing, model choice, or SecretRef resolution may break. labels May 17, 2026

clawsweeper Bot mentioned this pull request May 18, 2026

fix: pass through explicit OpenAI cache retention #46693

Closed

openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 19, 2026

clawsweeper Bot mentioned this pull request May 21, 2026

[Bug]: cacheRetention ignored for LiteLLM-proxied Anthropic models #37966

Open

lonexreb force-pushed the fix/81281-openai-prompt-cache-key branch from cb0503e to c479536 Compare May 27, 2026 06:14

clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels May 27, 2026

openclaw-barnacle Bot added docs Improvements or additions to documentation size: M and removed size: S labels May 27, 2026

lonexreb and others added 4 commits May 27, 2026 12:32

test: cover OpenAI-compatible cache retention

9cedb4a

steipete force-pushed the fix/81281-openai-prompt-cache-key branch from ccde6e6 to 9cedb4a Compare May 27, 2026 11:33

chatgpt-codex-connector Bot reviewed May 27, 2026

View reviewed changes

steipete mentioned this pull request May 27, 2026

fix(agents): honor OpenAI-compatible cache retention #87274

Merged

steipete closed this May 27, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(agents): honor explicit cacheRetention for openai-completions providers#82973

fix(agents): honor explicit cacheRetention for openai-completions providers#82973
lonexreb wants to merge 4 commits into
openclaw:mainfrom
lonexreb:fix/81281-openai-prompt-cache-key

lonexreb commented May 17, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented May 17, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented May 27, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 27, 2026

Uh oh!

steipete commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

lonexreb commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Verification

Test plan

Real behavior proof

Uh oh!

clawsweeper Bot commented May 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

clawsweeper Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Hatch command

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 27, 2026

Choose a reason for hiding this comment

Uh oh!

steipete commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

lonexreb commented May 17, 2026 •

edited

Loading

clawsweeper Bot commented May 17, 2026 •

edited

Loading

clawsweeper Bot commented May 27, 2026 •

edited

Loading