fix: strip prompt_cache_key for non-OpenAI openai-responses endpoints #49877

Merged
frankekn merged 2 commits into openclaw:main from ShaunTsai:fix/strip-prompt-cache-non-openai
Mar 19, 2026

Conversation

@ShaunTsai

Summary

  • Problem: Volcano Engine DeepSeek (and other non-OpenAI providers using the openai-responses API) returns HTTP 400 unknown field "prompt_cache_key" because pi-ai unconditionally injects prompt_cache_key and prompt_cache_retention into OpenAI Responses request bodies.
  • Why it matters: users configuring Volcano Engine models cannot use them at all — every request fails with a 400.
  • What changed: added prompt_cache_key/prompt_cache_retention stripping to the existing createOpenAIResponsesContextManagementWrapper in openai-stream-wrappers.ts, using the existing isDirectOpenAIBaseUrl() check to determine whether the endpoint actually supports these fields.
  • What did NOT change (scope boundary): prompt caching behavior for direct OpenAI, Azure OpenAI, and GitHub Copilot endpoints (they pass the isDirectOpenAIBaseUrl check). Anthropic caching uses a different mechanism (cacheRetention option) and is unaffected. The existing createBedrockNoCacheWrapper for non-Anthropic Bedrock models is also unchanged. No changes to extra-params.ts.
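
As a rough illustration of the stripping step described above, the following minimal sketch removes the two cache fields from a request body when a flag says the endpoint is not known to accept them. The `ResponsesPayload` type, the `stripPromptCacheFields` helper, and the sample values are hypothetical and are not taken from openai-stream-wrappers.ts.

```typescript
// Hypothetical payload shape; a real Responses request body has more fields.
type ResponsesPayload = Record<string, unknown>;

// Removes the OpenAI-only prompt cache fields in place when asked to,
// and returns the same object for convenience.
function stripPromptCacheFields(
  payload: ResponsesPayload,
  stripPromptCache: boolean,
): ResponsesPayload {
  if (stripPromptCache) {
    delete payload["prompt_cache_key"];
    delete payload["prompt_cache_retention"];
  }
  return payload;
}

// Illustrative body only; the cache values are placeholders.
const body: ResponsesPayload = {
  model: "deepseek-v3-2-251201",
  input: "hello",
  prompt_cache_key: "session-123",
  prompt_cache_retention: "24h",
};

// Copies are passed so the original body stays intact for comparison.
const stripped = stripPromptCacheFields({ ...body }, true);
const kept = stripPromptCacheFields({ ...body }, false);
```

With the flag set, `stripped` carries only `model` and `input`; with it cleared, `kept` still has both cache fields.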

Change Type (select all)

  • Bug fix

Scope (select all touched areas)

  • Gateway / orchestration

Linked Issue/PR

User-visible / Behavior Changes

Volcano Engine DeepSeek (and other non-OpenAI providers using the openai-responses API) will no longer fail with HTTP 400 on the unknown prompt_cache_key field.

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22+
  • Model/provider: volces/deepseek-v3-2-251201

Steps

  1. Configure a volces provider with deepseek-v3-2-251201 model
  2. Send a message to the agent

Expected

  • Agent responds normally

Actual

  • HTTP 400: unknown field "prompt_cache_key"

Evidence

  • Code inspection: traced prompt_cache_key injection from pi-ai openai-responses stream through to the HTTP request body; confirmed the field is only meaningful for OpenAI's own API
  • Verified the isDirectOpenAIBaseUrl() check correctly identifies OpenAI (api.openai.com), ChatGPT (chatgpt.com), and Azure OpenAI (*.openai.azure.com) endpoints — all other baseUrls (including Volcano Engine) return false
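
The host check described in the evidence could look roughly like the sketch below. This is a hypothetical re-implementation based only on the three hostnames listed above; the real isDirectOpenAIBaseUrl may be written differently, and the Volcano Engine URL in the usage lines is an assumed example value.

```typescript
// Hypothetical sketch of the host check; not the repository's implementation.
function isDirectOpenAIBaseUrlSketch(baseUrl: string | undefined): boolean {
  if (!baseUrl) return false; // missing or empty baseUrl is treated as non-direct
  let host: string;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable URL
  }
  return (
    host === "api.openai.com" ||
    host === "chatgpt.com" ||
    host.endsWith(".openai.azure.com")
  );
}

const openai = isDirectOpenAIBaseUrlSketch("https://api.openai.com/v1");
const azure = isDirectOpenAIBaseUrlSketch("https://myinstance.openai.azure.com/openai");
// Assumed example of a Volcano Engine base URL.
const volcano = isDirectOpenAIBaseUrlSketch("https://ark.cn-beijing.volces.com/api/v3");
```

Matching on the parsed hostname rather than on a substring of the URL means a lookalike host such as api.openai.com.example.org is not treated as direct.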

Human Verification (required)

  • Verified scenarios: traced the full code path from applyExtraParamsToAgent → createOpenAIResponsesContextManagementWrapper → applyOpenAIResponsesPayloadOverrides; confirmed stripPromptCache is true for non-direct-OpenAI endpoints and false for direct OpenAI
  • Edge cases checked: azure-openai provider with *.openai.azure.com baseUrl passes isDirectOpenAIBaseUrl and keeps prompt cache fields; providers with no baseUrl (empty string) get fields stripped (safe — no-op since pi-ai only injects for openai-responses)
  • What I did not verify: live Volcano Engine API call (no credentials available); full pnpm check currently has pre-existing typing debt failures on main unrelated to this PR

AI Disclosure

  • AI-assisted (Kiro CLI)
  • I understand what the code does

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No

Failure Recovery (if this breaks)

  • How to disable/revert: revert this commit
  • Files/config to restore: src/agents/pi-embedded-runner/openai-stream-wrappers.ts
  • Known bad symptoms: if isDirectOpenAIBaseUrl fails to recognize a legitimate OpenAI endpoint, prompt caching would silently stop working for that endpoint

Risks and Mitigations

  • Risk: a new OpenAI-compatible endpoint hostname (not api.openai.com, chatgpt.com, or *.openai.azure.com) would need isDirectOpenAIBaseUrl updated.
    • Mitigation: this is the same function already used for store field decisions — any such endpoint would already be broken for store: true forcing, so the fix would naturally cover both.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: XS labels Mar 18, 2026
@greptile-apps

greptile-apps Bot commented Mar 18, 2026

Greptile Summary

This PR fixes a real user-facing bug where third-party providers (e.g. Volcano Engine DeepSeek) using the openai-responses API format received HTTP 400 errors because pi-ai unconditionally injected prompt_cache_key and prompt_cache_retention fields that are only meaningful for OpenAI's own API.

What changed:

  • A new stripPromptCache boolean is computed in createOpenAIResponsesContextManagementWrapper using the already-established isDirectOpenAIBaseUrl() guard.
  • applyOpenAIResponsesPayloadOverrides now deletes prompt_cache_key and prompt_cache_retention from the payload when stripPromptCache is true.
  • The early-return optimization gate at line 313 is extended with !stripPromptCache so the wrapper is correctly engaged for affected non-OpenAI providers.

Assessment:

  • The implementation is consistent with the existing patterns (shouldForceResponsesStore, shouldStripResponsesStore) and correctly scopes the change to openai-responses streams only.
  • The fix correctly preserves prompt caching for api.openai.com, chatgpt.com, and *.openai.azure.com endpoints (all pass isDirectOpenAIBaseUrl).
  • The main gap is the absence of unit tests for the new behavior; createOpenAIResponsesContextManagementWrapper has no dedicated test file, so a future regression could silently break prompt caching for legitimate OpenAI users without any observable error.

Confidence Score: 4/5

  • Safe to merge — the fix is scoped, logic is sound, and the only concern is missing test coverage for the new behavior.
  • The change is small, targeted, and reuses an already-trusted helper (isDirectOpenAIBaseUrl). The fix correctly solves the reported HTTP 400 regression for Volcano Engine and similar providers without altering the path for direct OpenAI, Azure, or ChatGPT endpoints. Score is 4 rather than 5 solely because there are no unit tests covering the new stripPromptCache logic, making it harder to prevent future regressions to OpenAI prompt-caching behavior.
  • No files require special attention beyond the noted lack of test coverage.
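
The early-return gate mentioned in the summary can be pictured with the sketch below: when no override applies, the original function is returned untouched, and adding the strip flag to the condition is what makes the wrapper engage for non-OpenAI providers. The `SendFn` type, the `OverrideFlags` interface, and `wrapWithOverrides` are hypothetical stand-ins; the real wrapper operates on a stream function and may check more conditions.

```typescript
type Payload = Record<string, unknown>;
// Hypothetical stand-in for the underlying stream function.
type SendFn = (payload: Payload) => Payload;

interface OverrideFlags {
  forceStore: boolean;
  stripStore: boolean;
  stripPromptCache: boolean;
}

function wrapWithOverrides(send: SendFn, flags: OverrideFlags): SendFn {
  // Early return: nothing to rewrite, so skip the wrapper entirely.
  if (!flags.forceStore && !flags.stripStore && !flags.stripPromptCache) {
    return send;
  }
  return (payload: Payload): Payload => {
    const next: Payload = { ...payload }; // leave the caller's object untouched
    if (flags.forceStore) next["store"] = true;
    if (flags.stripStore) delete next["store"];
    if (flags.stripPromptCache) {
      delete next["prompt_cache_key"];
      delete next["prompt_cache_retention"];
    }
    return send(next);
  };
}

const echo: SendFn = (payload) => payload;
const untouched = wrapWithOverrides(echo, {
  forceStore: false,
  stripStore: false,
  stripPromptCache: false,
});
const stripping = wrapWithOverrides(echo, {
  forceStore: false,
  stripStore: false,
  stripPromptCache: true,
});
```

Without the strip flag in the gate condition, a provider that needs only prompt cache stripping would take the early return and the fields would still be sent.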

Last reviewed commit: "fix: strip prompt_ca..."

Comment on lines +309 to +312
const stripPromptCache =
  typeof model.api === "string" &&
  OPENAI_RESPONSES_APIS.has(model.api) &&
  !isDirectOpenAIBaseUrl(model.baseUrl);

P2 No test coverage for stripPromptCache

The new stripPromptCache behavior has no test coverage. Given that:

  1. A regression here would silently break prompt caching for all direct OpenAI users without any observable error.
  2. The existing createOpenAIResponsesContextManagementWrapper has no dedicated test file (confirmed by the absence of openai-stream-wrappers.test.ts).

It would be valuable to add unit tests covering at least:

  • Volcano Engine (or any non-isDirectOpenAIBaseUrl provider) — prompt_cache_key and prompt_cache_retention are stripped from the payload.
  • Native OpenAI (baseUrl: "https://api.openai.com/v1") — fields are not stripped.
  • Azure OpenAI (baseUrl: "https://myinstance.openai.azure.com/...") — fields are not stripped.
  • Model with api !== "openai-responses": stripPromptCache is false (no unnecessary wrapper overhead).
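
The four suggested cases can be sketched as a small table-driven check. To keep it self-contained, the decision is restated with the host check injected as a parameter, and a stub stands in for isDirectOpenAIBaseUrl. The `ModelLike` type, the contents of the set, the stub, and the example URLs are all assumptions made for illustration.

```typescript
// Hypothetical model shape; only the two fields the decision reads.
interface ModelLike {
  api?: unknown;
  baseUrl?: string;
}

// Assumed contents; the real OPENAI_RESPONSES_APIS set may list more APIs.
const OPENAI_RESPONSES_APIS = new Set<string>(["openai-responses"]);

// Same shape as the reviewed expression, with the host check injected so the
// decision can be exercised without the real isDirectOpenAIBaseUrl.
function computeStripPromptCache(
  model: ModelLike,
  isDirect: (baseUrl: string | undefined) => boolean,
): boolean {
  return (
    typeof model.api === "string" &&
    OPENAI_RESPONSES_APIS.has(model.api) &&
    !isDirect(model.baseUrl)
  );
}

// Stub standing in for isDirectOpenAIBaseUrl in these cases only.
const directUrls = new Set<string>([
  "https://api.openai.com/v1",
  "https://myinstance.openai.azure.com/openai",
]);
const isDirectStub = (baseUrl: string | undefined): boolean =>
  baseUrl !== undefined && directUrls.has(baseUrl);

const cases: Array<{ name: string; model: ModelLike; expected: boolean }> = [
  {
    name: "non-OpenAI responses endpoint strips",
    model: { api: "openai-responses", baseUrl: "https://provider.example/v1" },
    expected: true,
  },
  {
    name: "native OpenAI keeps fields",
    model: { api: "openai-responses", baseUrl: "https://api.openai.com/v1" },
    expected: false,
  },
  {
    name: "Azure OpenAI keeps fields",
    model: { api: "openai-responses", baseUrl: "https://myinstance.openai.azure.com/openai" },
    expected: false,
  },
  {
    name: "API outside the set is left alone",
    model: { api: "some-other-api", baseUrl: "https://provider.example/v1" },
    expected: false,
  },
];

// Names of the cases whose outcome differs from the expectation.
const failures = cases
  .filter((c) => computeStripPromptCache(c.model, isDirectStub) !== c.expected)
  .map((c) => c.name);
```

In the repository these cases would more likely be written against the real wrapper with the project's test runner; the table form here only pins down the expected outcome per endpoint type.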

@frankekn frankekn self-assigned this Mar 19, 2026
@frankekn

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Another round soon, please!

ShaunTsai and others added 2 commits March 19, 2026 14:51
pi-ai unconditionally injects prompt_cache_key and prompt_cache_retention
into openai-responses request bodies. Third-party providers using the
openai-responses API (e.g. Volcano Engine DeepSeek) reject these unknown
fields with HTTP 400.

Instead of maintaining a separate provider allowlist, fold the stripping
logic into the existing createOpenAIResponsesContextManagementWrapper
which already handles openai-responses payload overrides. Use the
existing isDirectOpenAIBaseUrl() check to determine whether the endpoint
actually supports these fields.

Fixes openclaw#48155
@frankekn frankekn force-pushed the fix/strip-prompt-cache-non-openai branch from 834cd1c to 7185eb5 March 19, 2026 06:52
@frankekn frankekn merged commit bcc725f into openclaw:main Mar 19, 2026
20 of 41 checks passed
@frankekn

Thanks @ShaunTsai. Landed in bcc725f from source head ShaunTsai@7185eb5.

fuller-stack-dev pushed a commit to fuller-stack-dev/openclaw that referenced this pull request Mar 20, 2026
…penclaw#49877) thanks @ShaunTsai

Fixes openclaw#48155

Co-authored-by: Shaun Tsai <13811075+ShaunTsai@users.noreply.github.com>
Co-authored-by: frankekn <4488090+frankekn@users.noreply.github.com>
lanyasheng added a commit to lanyasheng/openclaw that referenced this pull request Mar 22, 2026
…enAI proxies

PR openclaw#49877 introduced shouldStripResponsesPromptCache() which strips
prompt_cache_key and prompt_cache_retention from requests to non-direct
OpenAI endpoints. While this prevents 400 errors from endpoints that
do not support these fields, it also disables prompt caching for
third-party proxies that forward to real OpenAI APIs.

Add a compat.supportsPromptCache boolean opt-in that lets users declare
their proxy backend supports OpenAI prompt caching, preserving the
cache fields in the request payload.

Refs openclaw#48155
frankekn added a commit to artwalker/openclaw that referenced this pull request Mar 23, 2026
…penclaw#49877) thanks @ShaunTsai

Fixes openclaw#48155

Co-authored-by: Shaun Tsai <13811075+ShaunTsai@users.noreply.github.com>
Co-authored-by: frankekn <4488090+frankekn@users.noreply.github.com>
@yeahfeng

This is a rather “costly” fix. As noted in #52017, isn’t the real issue that Volcengine DeepSeek doesn’t fully support the OpenAI Responses API, and that OpenClaw was instead changed in a way that makes it no longer fully compatible?

steipete pushed a commit to damselem/openclaw that referenced this pull request Apr 16, 2026
…ache_key strip

The strip introduced in openclaw#49877 (fixes openclaw#48155 — Volcano Engine DeepSeek
rejects prompt_cache_key with HTTP 400) applies to every non-native
baseUrl via the usesExplicitProxyLikeEndpoint gate. This catches OpenAI-
compatible proxies that DO forward prompt_cache_key correctly — CLIProxy,
LiteLLM, vLLM, etc. — and disables prompt caching for agents routed
through them.

Empirical impact on a CLIProxy-routed agent with a ~40K-token prefix:
cached_tokens stays at 0 across consecutive turns on the same session,
despite a stable prefix and a stable session-derived prompt_cache_key
being constructed upstream in buildOpenAIResponsesParams.

Add compat.supportsPromptCacheKey to let operators override the default:
- true  -> never strip (endpoint is known-compatible, e.g. CLIProxy)
- false -> always strip for openai-responses APIs, even on native
           endpoints (endpoint is known-incompatible)
- undefined (default) -> existing behavior preserved: strip only on
           proxy-like endpoints, so the Volcano Engine fix from openclaw#48155
           continues to work without configuration.

Validated end-to-end on the maintainer fleet via A/B test on a running
agent: before the opt-out, cacheRead=0 across 3 consecutive turns
(input_tokens 39856 / 39896 / 39937). After setting
compat.supportsPromptCacheKey=true, cacheRead=32256 / 39808 / 39936 and
visible input_tokens drops to 7723 / 213 / 127 (99.7% prefix cache hit
by turn 3).

Adds 3 unit tests to provider-attribution.test.ts covering the override,
the forced-strip path, and the preserved default.
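
The three-way override described in this commit message can be sketched as follows, along with the arithmetic behind the quoted hit rate. The function names and parameters are hypothetical, and returning false for non-Responses APIs is an assumption; only the true / false / undefined semantics and the token counts come from the message above.

```typescript
// Hypothetical decision helper for the tri-state override.
function shouldStripPromptCacheKey(
  supportsPromptCacheKey: boolean | undefined,
  isOpenAIResponsesApi: boolean,
  isProxyLikeEndpoint: boolean,
): boolean {
  if (!isOpenAIResponsesApi) return false; // assumption: other APIs are untouched
  if (supportsPromptCacheKey === true) return false; // known-compatible: never strip
  if (supportsPromptCacheKey === false) return true; // known-incompatible: always strip
  return isProxyLikeEndpoint; // default: strip only on proxy-like endpoints
}

// Share of the prompt served from cache, assuming the reported input_tokens
// excludes the cached portion (as the "visible input_tokens" wording suggests).
function cacheHitRatio(cacheRead: number, visibleInputTokens: number): number {
  return cacheRead / (cacheRead + visibleInputTokens);
}

// Turn 3 figures from the A/B test above: cacheRead=39936, input_tokens=127.
const turn3HitRatio = cacheHitRatio(39936, 127);
```

For turn 3 this gives 39936 / (39936 + 127) = 39936 / 40063, which rounds to the 99.7% quoted in the message.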
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
…penclaw#49877) thanks @ShaunTsai

Fixes openclaw#48155

Co-authored-by: Shaun Tsai <13811075+ShaunTsai@users.noreply.github.com>
Co-authored-by: frankekn <4488090+frankekn@users.noreply.github.com>
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
…penclaw#49877) thanks @ShaunTsai

Fixes openclaw#48155

Co-authored-by: Shaun Tsai <13811075+ShaunTsai@users.noreply.github.com>
Co-authored-by: frankekn <4488090+frankekn@users.noreply.github.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
…penclaw#49877) thanks @ShaunTsai

Fixes openclaw#48155

Co-authored-by: Shaun Tsai <13811075+ShaunTsai@users.noreply.github.com>
Co-authored-by: frankekn <4488090+frankekn@users.noreply.github.com>
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
…penclaw#49877) thanks @ShaunTsai

Fixes openclaw#48155

Co-authored-by: Shaun Tsai <13811075+ShaunTsai@users.noreply.github.com>
Co-authored-by: frankekn <4488090+frankekn@users.noreply.github.com>
Labels

agents Agent runtime and tooling size: S

Development

Successfully merging this pull request may close these issues.

[Bug]: prompt_cache_key not supported by Volcano Engine DeepSeek

3 participants