fix(agents): add compat.supportsPromptCacheKey to opt out of prompt_cache_key strip#67427
Conversation
Greptile SummaryThis PR adds Confidence Score: 5/5Safe to merge; the logic is correct, default behavior is preserved, and the fix is backed by a live A/B and three new unit tests. Only P2 findings remain. The sole issue is a type-contract gap in the private ResolveProviderRequestPolicyConfigParams that does not affect runtime behavior or TypeScript compilation under existing call patterns. The three-state ternary in resolveProviderRequestCapabilities is correct, schema and Zod are aligned, and the existing 27-test suite continues to pass with the 3 new cases added. src/agents/provider-request-config.ts — ResolveProviderRequestPolicyConfigParams.compat should include supportsPromptCacheKey to keep the type contract accurate.
|
eab80e2 to
cf93c0a
Compare
|
Thanks for the catch @greptile-apps — addressed in cf93c0a7. |
|
Confirmed — Tip: You can customize Greptile's behavior for this repo with |
cf93c0a to
4e71de2
Compare
8be8646 to
3129819
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3129819cfa
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| expect(dependencyPaths).toContain( | ||
| path.join(repoRoot, "node_modules", "@lit-labs/signals", "package.json"), | ||
| path.join(repoRoot, "ui", "node_modules", "@lit/context", "package.json"), | ||
| ); |
There was a problem hiding this comment.
Accept repo-level node_modules fallbacks in bundle hash test
getResolvedBundleDependencyPackageJsonPaths explicitly falls back from ui/node_modules to repo-root node_modules when a UI-local manifest is missing (scripts/bundle-a2ui.mjs candidate order + existsSync check), but this assertion now hard-requires the UI-local path. In mixed/hoisted installs (including this workspace, where @lit/context resolves from root), the function is correct and the test fails, causing avoidable red test lanes for valid dependency layouts.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1f2327e811
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ); | ||
|
|
||
| expect( | ||
| relativeDependencyPaths.map((relativePath) => relativePath.replace(/^ui\//u, "")), |
There was a problem hiding this comment.
Normalize path separators in bundle dependency assertions
This assertion pipeline assumes POSIX separators (replace(/^ui\//...) and the following node_modules/ checks), but path.relative returns backslash-separated paths on Windows. When the test runs there, values like ui\\node_modules\\lit\\package.json are not normalized, so the test fails even if getResolvedBundleDependencyPackageJsonPaths is behaving correctly. Normalize to a consistent separator before applying these regex/equality checks.
Useful? React with 👍 / 👎.
1f2327e to
00677a0
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 00677a0ef7
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ); | ||
|
|
||
| expect( | ||
| relativeDependencyPaths.map((relativePath) => relativePath.replace(/^ui\//u, "")), |
There was a problem hiding this comment.
Normalize dependency paths before stripping ui prefix
path.relative yields backslash-separated paths on Windows, so relativePath.replace(/^ui\//u, "") does not remove a ui\\ prefix. In that environment this assertion compares values like ui\\node_modules\\lit\\package.json against node_modules\\lit\\package.json, causing the test to fail even when getResolvedBundleDependencyPackageJsonPaths returns correct manifests. Normalize separators (or use path-segment logic) before applying this prefix strip.
Useful? React with 👍 / 👎.
…ache_key strip The strip introduced in openclaw#49877 (fixes openclaw#48155 — Volcano Engine DeepSeek rejects prompt_cache_key with HTTP 400) applies to every non-native baseUrl via the usesExplicitProxyLikeEndpoint gate. This catches OpenAI- compatible proxies that DO forward prompt_cache_key correctly — CLIProxy, LiteLLM, vLLM, etc. — and disables prompt caching for agents routed through them. Empirical impact on a CLIProxy-routed agent with a ~40K-token prefix: cached_tokens stays at 0 across consecutive turns on the same session, despite a stable prefix and a stable session-derived prompt_cache_key being constructed upstream in buildOpenAIResponsesParams. Add compat.supportsPromptCacheKey to let operators override the default: - true -> never strip (endpoint is known-compatible, e.g. CLIProxy) - false -> always strip for openai-responses APIs, even on native endpoints (endpoint is known-incompatible) - undefined (default) -> existing behavior preserved: strip only on proxy-like endpoints, so the Volcano Engine fix from openclaw#48155 continues to work without configuration. Validated end-to-end on the maintainer fleet via A/B test on a running agent: before the opt-out, cacheRead=0 across 3 consecutive turns (input_tokens 39856 / 39896 / 39937). After setting compat.supportsPromptCacheKey=true, cacheRead=32256 / 39808 / 39936 and visible input_tokens drops to 7723 / 213 / 127 (99.7% prefix cache hit by turn 3). Adds 3 unit tests to provider-attribution.test.ts covering the override, the forced-strip path, and the preserved default.
00677a0 to
182b18b
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 182b18bb05
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| input.compat?.supportsPromptCacheKey === true | ||
| ? false | ||
| : input.compat?.supportsPromptCacheKey === false | ||
| ? api !== undefined && OPENAI_RESPONSES_APIS.has(api) |
There was a problem hiding this comment.
Apply prompt-cache override to Codex responses API
compat.supportsPromptCacheKey === false currently forces stripping only when api is in OPENAI_RESPONSES_APIS, but that set excludes openai-codex-responses, so this new override is silently ignored for Codex responses models. In environments routing Codex responses through endpoints that reject prompt_cache_key, users can set the new config flag and still get the same 400-path behavior because shouldStripResponsesPromptCache remains false.
Useful? React with 👍 / 👎.
Summary
prompt_cache_keywith HTTP 400) applies to every non-native baseUrl via theusesExplicitProxyLikeEndpointgate, including OpenAI-compatible proxies that DO forward the field correctly (CLIProxy, LiteLLM, vLLM). Agents routed through those proxies get zero prompt caching as a side effect.cached_tokens: 0on every turn even with stable sessions and large repeating prefixes. On a ~40K-token prefix, that's ~40K uncached input tokens re-sent on each turn — wasted latency and wasted spend (or quota pressure on subscription-backed routes like Codex Plus).compat.supportsPromptCacheKey?: booleantoModelCompatConfigand extendresolveProviderRequestCapabilitiesto let operators override the strip per model.undefinedpreserves the existing default so [Bug]: prompt_cache_key not supported by Volcano Engine DeepSeek #48155 continues to work without any config change.trueopts out of the strip for proxies known to be compatible.falseforces the strip for endpoints known to reject the field, even on native baseUrls.cacheRetentionis in a different code path and unaffected; thecreateOpenAIResponsesContextManagementWrapperinopenai-stream-wrappers.tsis not modified — the new flag plugs into the already-computedshouldStripResponsesPromptCachecapability.Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause (if applicable)
api ∈ OPENAI_RESPONSES_APIS && policy.usesExplicitProxyLikeEndpoint) is an upper bound on "endpoints that might reject the field", not a proxy allowlist. It catches a much wider set of proxies than the specific ones known to be incompatible. A per-model opt-out is the minimum mechanism to let operators re-enable caching for known-good proxies without changing the conservative default.supportsStore,supportsDeveloperRole, etc.) but none forprompt_cache_key.cached_tokensclosely.Regression Test Plan (if applicable)
src/agents/provider-attribution.test.tsshouldStripResponsesPromptCacherespects the newcompat.supportsPromptCacheKeyoverride (true,false,undefined) independently of endpoint class.resolveProviderRequestCapabilitiesis the single computation site for the strip flag, and the existing tests in this file already cover the endpoint-class cases that feed it. Extending that file keeps the test surface tight and adjacent to the code change.shouldStripResponsesPromptCache: trueon acustom-proxyendpoint) only cover the default-strip path, not the override. The newit("respects compat.supportsPromptCacheKey override on prompt cache stripping", ...)block covers all three override states and the preserved default.User-visible / Behavior Changes
agents.defaults.models.<provider>/<model>.compat.supportsPromptCacheKey?: boolean.prompt_cache_keycorrectly can setsupportsPromptCacheKey: trueon their model compat block and recover prompt caching (non-zerocached_tokensin usage responses).supportsPromptCacheKey: falseto force the strip even on native baseUrls.Diagram (if applicable)
Security Impact (required)
NoNoNo(adds a conditional pass-through of an existing OpenAI Responses field; no new outbound requests, no new auth paths)NoNoRepro + Verification
Environment
ghcr.io/openclaw/openclaw:latestbase imageopenai-codexwithopenai-codex-responsesAPI, OAuth-authed ChatGPT Plus subscriptionopenclaw agent --messageCLI{ "agents": { "list": [{ "id": "<redacted>", "models": [{ "model": "openai-codex/gpt-5.4" }] }], "defaults": { "models": { "openai-codex/gpt-5.4": { "baseUrl": "http://<redacted-proxy-host>:8417", "compat": { "supportsPromptCacheKey": true } } } } } }Steps
baseUrlpointing at an OpenAI-compatible proxy (e.g. CLIProxy). Use any OpenAI Responses API model (openai-responses,openai-codex-responses,azure-openai-responses).usage.input_tokens_details.cached_tokenson each response (oragentMeta.lastCallUsage.cacheReadin theopenclaw agent --jsonoutput).Expected
supportsPromptCacheKeyunset:cached_tokens = 0on every turn.supportsPromptCacheKey: true:cached_tokens > 0from turn 2 onward (proportional to shared prefix length), subject to OpenAI's documented 1024-token minimum and cache TTL.Actual
Matches expected. Numbers below.
Evidence
Live A/B on a running agent with a ~40K-token shared prefix, 3 consecutive turns per phase, ~10s spacing, same session across both phases:
Turn 3 under the opt-out:
cached_tokens / effective_prefix ≈ 99.7%. Visibleinput_tokensdrops to 127 (from 39937 baseline). Warm-turn wall time is ~18% faster than baseline once the cache is primed. Turn 1 is slower than baseline because the cache is being populated from nothing — this is expected one-time warm-up cost.Payload instrumentation on the running agent (one log entry per
applyOpenAIResponsesPayloadPolicyinvocation) confirmed that the session-derivedprompt_cache_keywas being constructed upstream correctly (keyValue: "<stable-session-id>"on every turn) and deleted by the wrapper under the existing default (stripFires: true). The new opt-out prevents the delete.Full test lane result
pnpm build && pnpm check && pnpm testrun locally on this branch.buildandcheckexit 0 clean.pnpm testsurfaced 24 failing test files across the project matrix, all categorized and verified as unrelated to this change:markdown-it-task-listsui/workspace. The missing dep is imported fromui/src/ui/markdown.ts, untouched by this PR.main@upstream.pnpm build+pnpm testexecution on the same machine (plugin-auto-enable, model-fallback cooldown, gateway-server agent routes, gateway-server plugin HTTP auth, task-registry maintenance, qa-lab bootstrap).src/config/plugin-auto-enable.core.test.ts+src/agents/model-fallback.test.tson this branch in isolation → 93/93 pass in 28.58s (well under the 120s ceiling).main@upstreamor this branch. Consistent with load-/concurrency-induced flakes.src/agents/subagent-registry.persistence.resume.test.ts+src/gateway/server.auth.compat-baseline.test.tson cleanmain@upstream→ 11/11 pass in 12.29s. Same specs on this branch → 11/11 pass in 5.28s.The 5 files I changed are exercised by
provider-attribution.test.ts(27/27 green, including 3 new cases) andopenai-responses-payload-policy.test.ts(4/4 green).Human Verification (required)
shouldStripResponsesPromptCachestilltrueon proxy-like responses endpoints, stillfalseon native endpoints.supportsPromptCacheKey: trueon a proxy-like endpoint — strip disabled;cached_tokens > 0confirmed in live usage responses AND OpenClaw'sagentMeta.lastCallUsage.cacheReadon the next turn.supportsPromptCacheKey: falseon a native endpoint — strip forced; confirmed via the unit test (live verification was not needed for this branch — it is a defensive future-proofing path).compatobject entirely absent (type allows this).compat.supportsPromptCacheKeyexplicitlyundefinedvs omitted.compat.supportsStore(independent flag, no interference — verified by reading the now-adjacent branches inresolveProviderRequestCapabilities).shouldStripResponsesPromptCache, which only affects OpenAI Responses APIs, but I have not exhaustively grepped for side effects.codex review --base origin/main— no Codex access. The PR author will run this before submitting.Review Conversations
Compatibility / Migration
Yes— default behavior unchanged; existing configs keep working identically.Norequired,Yesoptional — operators MAY addcompat.supportsPromptCacheKeyto opt out of the strip.No. Upgrade steps: none. Operators on affected proxies can optionally set the new flag after upgrading.Risks and Mitigations
supportsPromptCacheKey: trueon a proxy that actually rejectsprompt_cache_key, resulting in the original [Bug]: prompt_cache_key not supported by Volcano Engine DeepSeek #48155-style HTTP 400s.supportsPromptCacheKey: falseon a native OpenAI endpoint that would have cached, losing the cache benefit.falsebranch is defensive future-proofing — no known operator needs it today.AI-assisted disclosure
yesfully tested— the new code path is covered by 3 new unit tests, the surroundingresolveProviderRequestCapabilitiessuite is unchanged (27/27 passing), and the change was additionally validated end-to-end against a live OpenAI-Codex agent via a live-patch A/B (see Evidence).