Skip to content

fix(agents): honor explicit cacheRetention for openai-completions providers#82973

Closed
lonexreb wants to merge 4 commits into
openclaw:mainfrom
lonexreb:fix/81281-openai-prompt-cache-key
Closed

fix(agents): honor explicit cacheRetention for openai-completions providers#82973
lonexreb wants to merge 4 commits into
openclaw:mainfrom
lonexreb:fix/81281-openai-prompt-cache-key

Conversation

@lonexreb

@lonexreb lonexreb commented May 17, 2026

Copy link
Copy Markdown
Contributor

Summary

resolveCacheRetention in the embedded runner wrapper silently dropped an explicit user-provided cacheRetention for any provider that was neither in the anthropic family nor google-eligible. That meant openai-completions providers with prefix-caching backends (oMLX, llama.cpp, etc.) — the ones documented to opt in via compat.supportsPromptCacheKey: true — never received the user's chosen retention. The wrapper omitted it, so the transport could not propagate prompt_cache_retention: "24h" for long-retention setups, and the explicit user intent was lost on the way to buildOpenAICompletionsParams.

Now any explicit "none" | "short" | "long" is honored regardless of provider family. Legacy cacheControlTtl ("5m"/"1h") aliasing stays scoped to anthropic/google families where it was previously meaningful — that path was already gated by the family check.

Fixes #81281.

Verification

  • CI=true pnpm test src/agents/pi-embedded-runner/prompt-cache-retention.test.ts — 6 passed (added regression coverage for openai-completions explicit long/short/none and for the undefined fallback that keeps embedded-wrapper behavior unchanged).
  • Reviewed adjacent src/agents/openai-transport-stream.test.ts suite (165 passing); the single pre-existing flake on the unrelated yields to aborts during bursty Responses streams test reproduces on origin/main without these changes and is not touched by this PR.

Test plan

  • Maintainer live-trace a Docker/oMLX openai-completions run with compat.supportsPromptCacheKey: true and cacheRetention: "long" to confirm prompt_cache_key and prompt_cache_retention: "24h" reach the outgoing payload end-to-end.

Real behavior proof

Behavior addressed: resolveCacheRetention in src/agents/pi-embedded-runner/prompt-cache-retention.ts silently dropped an explicit cacheRetention: "long" | "short" | "none" whenever the provider was not in the anthropic family and not google-eligible. openai-completions providers that opt into prompt caching via compat.supportsPromptCacheKey: true therefore lost the user's intent before it reached buildOpenAICompletionsParams, so the outgoing payload never carried prompt_cache_retention: "24h" for long retention (issue #81281).

Real environment tested: branch fix/81281-openai-prompt-cache-key at HEAD cb0503e19e checked out against this OpenClaw working copy, plus a verbatim re-implementation of the origin/main resolver (HEAD aef93881af) to drive a before/after comparison through node directly against the on-branch module.

Exact steps or command run after this patch:

$ node node_modules/.bin/tsx demo-81281-compare.mjs

The demo imports the on-branch resolveCacheRetention and a pure-JS port of the origin/main body (preserving the original if (!family && !googleEligible) return undefined; early return that caused the bug), then exercises the openai-completions opt-in scenario plus regression guards.

Evidence after fix: live node stdout against the on-branch source (HEAD cb0503e19e):

== openai-completions provider, user sets cacheRetention=long (issue #81281) ==
origin/main behavior: {}
this branch behavior: {"resolved":"long"}

== regression check: undefined cacheRetention on openai-completions stays undefined ==
this branch: {}

== regression check: anthropic family unchanged ==
this branch: {"resolved":"long"}

== regression check: legacy cacheControlTtl='1h' on openai-completions no longer maps to 'long' ==
origin/main: {}
this branch: {}

({} is JSON.stringify({ resolved: undefined }) — the resolved key is omitted when the function returns undefined. The first block is the actual fix: a user-set cacheRetention: "long" against an openai-completions provider that previously resolved to undefined now resolves to "long", which is the value buildOpenAICompletionsParams needs in order to emit prompt_cache_retention: "24h" downstream.)

Observed result after fix: an openai-completions provider with compat.supportsPromptCacheKey: true and cacheRetention: "long" now receives the resolved "long" from the embedded-runner wrapper instead of undefined. The downstream transport at src/agents/openai-transport-stream.ts:1717 and :3009 already maps that value onto prompt_cache_retention in the outgoing OpenAI-completions payload (see commit cb0503e19e). Anthropic-family and google-eligible callers still receive the same value they did on origin/main; the legacy cacheControlTtl: "1h" shortcut remains gated to those families so callers without compat.supportsPromptCacheKey are unchanged.

What was not tested: end-to-end traffic against a live oMLX or llama.cpp openai-completions backend with prompt caching enabled. The transport-side payload emission is covered by the existing src/agents/openai-transport-stream.test.ts suite (165 passing on this branch). A maintainer-side oMLX/llama.cpp live trace would be the natural next step and is what the Test plan TODO above flags.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: S triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 17, 2026
@clawsweeper

clawsweeper Bot commented May 17, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge. Reviewed May 27, 2026, 7:49 AM ET / 11:49 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +28, Tests +204, Docs 0. Total +232 across 7 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • No close action taken because the review did not complete.

Maintainer options:

  1. Decide the mitigation before merge
    Retry the Codex review after fixing the execution failure.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge
Review did not complete, so no work-lane recommendation was made.

Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 513a223c15f6.

Label changes

Label changes:

  • add rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
  • remove P2: Current review triage priority is none.
  • remove rating: 🦐 gold shrimp: Current PR rating is rating: 🌊 off-meta tidepool, so this older rating label is no longer current.
  • remove merge-risk: 🚨 compatibility: Current PR review selected no merge-risk labels.
  • remove merge-risk: 🚨 auth-provider: Current PR review selected no merge-risk labels.
  • remove status: 📣 needs proof: Current PR status no longer selects a status label.

Label justifications:

  • rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.
Evidence reviewed

PR surface:

Source +28, Tests +204, Docs 0. Total +232 across 7 files.

View PR surface stats
Area Files Added Removed Net
Source 3 31 3 +28
Tests 3 204 0 +204
Docs 1 2 2 0
Config 0 0 0 0
Generated 0 0 0 0
Other 0 0 0 0
Total 7 237 5 +232

What I checked:

  • failure reason: timeout.
  • codex failure detail: Codex review failed for this PR: spawnSync codex ETIMEDOUT.
  • codex stdout: Per-item Codex failure; continuing with the rest of the shard.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

lonexreb added a commit to lonexreb/paloa-claw that referenced this pull request May 17, 2026
…tions payload

Codex P2 follow-up on PR openclaw#82973 / issue openclaw#81281. buildOpenAICompletionsParams
previously only set prompt_cache_key when supportsPromptCacheKey was true and
silently dropped the resolved cacheRetention=long preference. The wire payload
reaching oMLX, llama.cpp, and other OpenAI-compatible completions backends
therefore never carried the canonical 24h prompt_cache_retention value, even
though the resolver returned long.

Forward prompt_cache_retention=24h alongside prompt_cache_key whenever the
caller opts into long retention on a compat-flagged backend. Short and unset
retention continue to omit the field.
@clawsweeper clawsweeper Bot added P2 Normal backlog priority with limited blast radius. impact:auth-provider Auth, provider routing, model choice, or SecretRef resolution may break. labels May 17, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels May 19, 2026
@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. and removed impact:auth-provider Auth, provider routing, model choice, or SecretRef resolution may break. labels May 19, 2026
lonexreb added a commit to lonexreb/paloa-claw that referenced this pull request May 27, 2026
…tions payload

Codex P2 follow-up on PR openclaw#82973 / issue openclaw#81281. buildOpenAICompletionsParams
previously only set prompt_cache_key when supportsPromptCacheKey was true and
silently dropped the resolved cacheRetention=long preference. The wire payload
reaching oMLX, llama.cpp, and other OpenAI-compatible completions backends
therefore never carried the canonical 24h prompt_cache_retention value, even
though the resolver returned long.

Forward prompt_cache_retention=24h alongside prompt_cache_key whenever the
caller opts into long retention on a compat-flagged backend. Short and unset
retention continue to omit the field.
@lonexreb lonexreb force-pushed the fix/81281-openai-prompt-cache-key branch from cb0503e to c479536 Compare May 27, 2026 06:14
@clawsweeper

clawsweeper Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🥚 Incubating: this PR egg is tucked into the review nest.

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.
What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@clawsweeper clawsweeper Bot added rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. labels May 27, 2026
@openclaw-barnacle openclaw-barnacle Bot added docs Improvements or additions to documentation size: M and removed size: S labels May 27, 2026
lonexreb and others added 4 commits May 27, 2026 12:32
Wrapper `resolveCacheRetention` returned undefined for non-anthropic-family,
non-google providers, silently dropping the user's explicit `cacheRetention`
value before it reached the openai-completions transport. This affected
prefix-caching backends (oMLX, llama.cpp, etc.) that opt in via
`compat.supportsPromptCacheKey: true`.

Now honors any explicit "none" | "short" | "long" regardless of family while
keeping legacy `cacheControlTtl` aliasing scoped to anthropic/google families
(which is where it was previously meaningful).

Fixes openclaw#81281.
…tions payload

Codex P2 follow-up on PR openclaw#82973 / issue openclaw#81281. buildOpenAICompletionsParams
previously only set prompt_cache_key when supportsPromptCacheKey was true and
silently dropped the resolved cacheRetention=long preference. The wire payload
reaching oMLX, llama.cpp, and other OpenAI-compatible completions backends
therefore never carried the canonical 24h prompt_cache_retention value, even
though the resolver returned long.

Forward prompt_cache_retention=24h alongside prompt_cache_key whenever the
caller opts into long retention on a compat-flagged backend. Short and unset
retention continue to omit the field.
…or non-family providers

Issue openclaw#82974 / regression in 81281 fix: the original removed the
'!family && !googleEligible' early-return entirely so openai-completions
backends with compat.supportsPromptCacheKey: true (oMLX, llama.cpp)
could pass user-set cacheRetention through to the transport. But that
also let providers proxying non-cacheable models via openai-completions
(amazon-bedrock + amazon.* nova) leak the explicit value into payloads
the backend cannot honor.

Restore the family/google gate but extend it with a new
supportsPromptCacheKey parameter. Callers in extra-params.ts read the
flag from the model's compat.supportsPromptCacheKey === true. Without
that flag, non-family/non-google providers still drop explicit
cacheRetention as before — preserving the new regression test on main
('does not treat non-Anthropic Bedrock models as cache-retention
eligible') while keeping the openclaw#81281 fix in place for actual
prefix-caching backends.

Also drop two unnecessary 'as Record<string, unknown>' casts in
openai-transport-stream.test.ts that oxlint flagged as redundant in
typescript-eslint/no-unnecessary-type-assertion.
@steipete steipete force-pushed the fix/81281-openai-prompt-cache-key branch from ccde6e6 to 9cedb4a Compare May 27, 2026 11:33

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9cedb4a818

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

provider: string,
modelApi?: string,
modelId?: string,
supportsPromptCacheKey?: boolean,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Pass cache-key eligibility to attempt-level retention

When a cache-key OpenAI-compatible completions model sets cacheRetention: "long", the stream wrapper now resolves and forwards long, but the attempt-level resolver call in src/agents/pi-embedded-runner/run/attempt.ts:3020 still omits this new supportsPromptCacheKey argument. In that scenario effectivePromptCacheRetention remains undefined, so the prompt-cache runtime context and cache observability built from it omit the 24h retention even though the outgoing payload uses it; cache-aware context engines can then make decisions with stale/missing TTL information. Please plumb the model compat flag through the attempt-level call as well.

Useful? React with 👍 / 👎.

@clawsweeper clawsweeper Bot added rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 27, 2026
steipete added a commit that referenced this pull request May 27, 2026
Carry over #82973 and fix #81281 by preserving explicit cacheRetention for OpenAI-compatible completions providers that opt into prompt-cache-key support.

The change keeps explicit cacheRetention suppressed for OpenAI-compatible providers without compat.supportsPromptCacheKey, adds regression coverage for both paths, and updates prompt-caching docs for prompt_cache_key / prompt_cache_retention behavior.

Fixes #81281.
Supersedes #82973.

Co-authored-by: lonexreb <reach2shubhankar@gmail.com>
@steipete

Copy link
Copy Markdown
Contributor

Thanks again @lonexreb. I carried this over and landed it through #87274 because this fork PR stopped syncing after maintainer fixups: the fork branch advanced, but GitHub kept refs/pull/82973/head pinned to the earlier red SHA.

Landed as 3e351b7 with your original commits credited via co-author. The landed change includes the OpenAI-compatible cache-retention fix, additional regression coverage for the compat.supportsPromptCacheKey and non-opt-in paths, and the prompt-caching docs update.

Future PRs should be able to follow the normal maintainer-fixup path; this one just hit a GitHub PR-ref sync failure after the branch update.

@steipete steipete closed this May 27, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 28, 2026
Carry over openclaw#82973 and fix openclaw#81281 by preserving explicit cacheRetention for OpenAI-compatible completions providers that opt into prompt-cache-key support.

The change keeps explicit cacheRetention suppressed for OpenAI-compatible providers without compat.supportsPromptCacheKey, adds regression coverage for both paths, and updates prompt-caching docs for prompt_cache_key / prompt_cache_retention behavior.

Fixes openclaw#81281.
Supersedes openclaw#82973.

Co-authored-by: lonexreb <reach2shubhankar@gmail.com>
SYU8384 pushed a commit to SYU8384/openclaw that referenced this pull request Jun 3, 2026
Carry over openclaw#82973 and fix openclaw#81281 by preserving explicit cacheRetention for OpenAI-compatible completions providers that opt into prompt-cache-key support.

The change keeps explicit cacheRetention suppressed for OpenAI-compatible providers without compat.supportsPromptCacheKey, adds regression coverage for both paths, and updates prompt-caching docs for prompt_cache_key / prompt_cache_retention behavior.

Fixes openclaw#81281.
Supersedes openclaw#82973.

Co-authored-by: lonexreb <reach2shubhankar@gmail.com>
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Carry over openclaw#82973 and fix openclaw#81281 by preserving explicit cacheRetention for OpenAI-compatible completions providers that opt into prompt-cache-key support.

The change keeps explicit cacheRetention suppressed for OpenAI-compatible providers without compat.supportsPromptCacheKey, adds regression coverage for both paths, and updates prompt-caching docs for prompt_cache_key / prompt_cache_retention behavior.

Fixes openclaw#81281.
Supersedes openclaw#82973.

Co-authored-by: lonexreb <reach2shubhankar@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling docs Improvements or additions to documentation merge-risk: 🚨 auth-provider 🚨 May break OAuth, tokens, provider routing, model choice, or credentials. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. P2 Normal backlog priority with limited blast radius. proof: supplied External PR includes structured after-fix real behavior proof. rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: OpenAI-completions prompt_cache_key regression — caching worked in 2026.3.x, broken in 2026.5.x

2 participants