fix(agents): split system prompt cache prefix by transport#59054
Conversation
|
Supersedes #53225. Refreshed onto current main, revalidated locally, and kept the change scoped to Anthropic/Bedrock system-prompt cache splitting with boundary stripping on all paths. |
martingarramon
left a comment
There was a problem hiding this comment.
Clean approach — splitting at a known boundary is much better than trying to diff system prompt sections across turns.
One question on the hot path: the wrapper is applied unconditionally to ctx.agent.streamFn for all providers, so every request (OpenAI, Ollama, etc.) runs through the onPayload callback to strip the marker even when splitAndCache is false. The per-request cost is low (one .includes() per system block), but this is the streaming hot path. Was there a reason not to gate the wrapper application behind something like system prompt actually contains the marker? Or is the simplicity of always-apply worth the negligible overhead?
Also noticed the test covers the array-system split path and the string-system strip path, but not the case where splitAndCache is true with a plain-string system prompt (the typeof system === "string" branch at line 59 of the new file). That path has a deliberate design choice — falling back to strip instead of split — which seems worth a test to document the intent.
acf2f84 to
f3be414
Compare
Greptile SummaryThis PR restores an Anthropic-family system prompt cache seam by introducing a new
` embedded in the system message. The PR description states OpenAI builders explicitly strip before emission, but the completions builder is the missed path. Confidence Score: 4/5Safe to review but has one active defect: the OpenAI Completions path leaks the internal cache boundary marker into provider requests, contradicting the PR's stated mitigation. The Anthropic split logic, Google strip, OpenAI Responses strip, and WebSocket strip are all correctly implemented and tested. One transport path — OpenAI Completions ( src/agents/openai-transport-stream.ts — specifically
|
| it("strips the internal cache boundary from OpenAI system prompts", () => { | ||
| const params = buildOpenAIResponsesParams( | ||
| { | ||
| id: "gpt-5.4", | ||
| name: "GPT-5.4", | ||
| api: "openai-responses", | ||
| provider: "openai", | ||
| baseUrl: "https://api.openai.com/v1", | ||
| reasoning: true, | ||
| input: ["text"], | ||
| cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, | ||
| contextWindow: 200000, | ||
| maxTokens: 8192, | ||
| } satisfies Model<"openai-responses">, | ||
| { | ||
| systemPrompt: `Stable prefix${SYSTEM_PROMPT_CACHE_BOUNDARY}Dynamic suffix`, | ||
| messages: [], | ||
| tools: [], | ||
| } as never, | ||
| undefined, | ||
| ) as { input?: Array<{ content?: string }> }; | ||
|
|
||
| expect(params.input?.[0]?.content).toBe("Stable prefix\nDynamic suffix"); | ||
| }); |
There was a problem hiding this comment.
No boundary-stripping test for the completions path
The existing test covers buildOpenAIResponsesParams stripping the cache boundary, but there is no equivalent test for buildOpenAICompletionsParams. This is the coverage gap that allows the leak described above to go undetected; adding a parallel test case for buildOpenAICompletionsParams with a boundary-containing systemPrompt would lock in the fix once the stripping logic is added.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/openai-transport-stream.test.ts
Line: 443-466
Comment:
**No boundary-stripping test for the completions path**
The existing test covers `buildOpenAIResponsesParams` stripping the cache boundary, but there is no equivalent test for `buildOpenAICompletionsParams`. This is the coverage gap that allows the leak described above to go undetected; adding a parallel test case for `buildOpenAICompletionsParams` with a boundary-containing `systemPrompt` would lock in the fix once the stripping logic is added.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b5dd1fee90
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c90fd94a64
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Keep large stable prompt context above this seam so Anthropic-family | ||
| // transports can reuse it across labs and turns. Dynamic group/session | ||
| // additions below it are the primary cache invalidators. | ||
| lines.push(SYSTEM_PROMPT_CACHE_BOUNDARY); |
There was a problem hiding this comment.
Route default streamSimple calls through boundary stripping
buildAgentSystemPrompt now inserts SYSTEM_PROMPT_CACHE_BOUNDARY unconditionally, but the marker is only removed in transport-specific builders. For standard sessions without request.proxy/request.tls, registerProviderStreamForModel does not install those transport wrappers (src/agents/provider-transport-stream.ts), and resolveEmbeddedAgentStreamFn falls back to currentStreamFn/streamSimple (src/agents/pi-embedded-runner/run/attempt.ts:243-257), so provider requests still receive the raw <!-- OPENCLAW_CACHE_BOUNDARY --> text. This is a prompt-content regression on default provider paths and also skips the Anthropic stable/dynamic cache split outside the custom transport flow.
Useful? React with 👍 / 👎.
93e191d to
2614d0f
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2614d0fc0c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Keep large stable prompt context above this seam so Anthropic-family | ||
| // transports can reuse it across labs and turns. Dynamic group/session | ||
| // additions below it are the primary cache invalidators. | ||
| lines.push(SYSTEM_PROMPT_CACHE_BOUNDARY); |
There was a problem hiding this comment.
Strip cache boundary before provider-owned stream calls
buildAgentSystemPrompt now unconditionally inserts SYSTEM_PROMPT_CACHE_BOUNDARY, but provider-owned createStreamFn paths are not routed through the new boundary-stripping transports. For example, registerProviderStreamForModel can return the Ollama plugin stream (extensions/ollama/index.ts), and that stream forwards context.systemPrompt directly into the outbound system message (extensions/ollama/src/stream.ts), so requests will now include the internal <!-- OPENCLAW_CACHE_BOUNDARY --> marker in provider-visible prompt text. This is a prompt-content regression for bundled/custom provider plugins that do not implement their own stripping.
Useful? React with 👍 / 👎.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7340201cc0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| context.systemPrompt | ||
| ? { | ||
| ...context, | ||
| systemPrompt: stripSystemPromptCacheBoundary(context.systemPrompt), | ||
| } |
There was a problem hiding this comment.
Preserve Anthropic cache seam in provider stream wrapper
resolveEmbeddedAgentStreamFn now strips SYSTEM_PROMPT_CACHE_BOUNDARY for every providerStreamFn call, but Anthropic proxy/TLS models reach this branch via registerProviderStreamForModel and rely on applyAnthropicPayloadPolicyToParams to split cached stable system text from dynamic suffix text. Stripping first removes that seam, so those Anthropic requests revert to cache-tagging the whole system block and lose prefix cache reuse whenever dynamic additions change.
Useful? React with 👍 / 👎.
| if (!isTransportAwareApiSupported(model.api)) { | ||
| return undefined; |
There was a problem hiding this comment.
Handle boundary stripping for non-transport-aware APIs
createBoundaryAwareStreamFnForModel returns undefined for APIs outside SUPPORTED_TRANSPORT_APIS, so those sessions fall back to streamSimple without boundary cleanup. Since the resolver still accepts APIs like bedrock-converse-stream, github-copilot, and openai-codex-responses (src/agents/pi-embedded-runner/model.ts:103-114), those providers now receive the internal <!-- OPENCLAW_CACHE_BOUNDARY --> marker in prompt text.
Useful? React with 👍 / 👎.
…59054) * fix(agents): restore Anthropic prompt cache seam * fix(agents): strip cache boundary for completions * fix(agents): strip cache boundary for cli backends * chore(changelog): note cross-transport cache boundary rollout * fix(agents): route default stream fallbacks through boundary shapers * fix(agents): strip cache boundary for provider streams
…59054) * fix(agents): restore Anthropic prompt cache seam * fix(agents): strip cache boundary for completions * fix(agents): strip cache boundary for cli backends * chore(changelog): note cross-transport cache boundary rollout * fix(agents): route default stream fallbacks through boundary shapers * fix(agents): strip cache boundary for provider streams
…59054) * fix(agents): restore Anthropic prompt cache seam * fix(agents): strip cache boundary for completions * fix(agents): strip cache boundary for cli backends * chore(changelog): note cross-transport cache boundary rollout * fix(agents): route default stream fallbacks through boundary shapers * fix(agents): strip cache boundary for provider streams
…59054) * fix(agents): restore Anthropic prompt cache seam * fix(agents): strip cache boundary for completions * fix(agents): strip cache boundary for cli backends * chore(changelog): note cross-transport cache boundary rollout * fix(agents): route default stream fallbacks through boundary shapers * fix(agents): strip cache boundary for provider streams
…59054) * fix(agents): restore Anthropic prompt cache seam * fix(agents): strip cache boundary for completions * fix(agents): strip cache boundary for cli backends * chore(changelog): note cross-transport cache boundary rollout * fix(agents): route default stream fallbacks through boundary shapers * fix(agents): strip cache boundary for provider streams
Summary
cacheRetention: "none".Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause / Regression History (if applicable)
systemPromptArg.Regression Test Plan (if applicable)
src/agents/system-prompt-cache-boundary.test.tssrc/agents/anthropic-payload-policy.test.tssrc/agents/openai-transport-stream.test.tssrc/agents/openai-ws-stream.test.tssrc/agents/cli-runner.helpers.test.tsUser-visible / Behavior Changes
Anthropic-family sessions can preserve the stable system prompt prefix across turns again even when lab/group/session additions change later in the prompt. Other request paths keep seeing the same prompt text, with the internal boundary stripped before emission.
Diagram (if applicable)
Security Impact (required)
No)No)No)No)No)Yes, explain risk + mitigation: N/ARepro + Verification
Environment
cacheRetentionlong vs noneSteps
Expected
Actual
systemPromptArgpath could still emit the raw internal marker.Evidence
Attach at least one:
Human Verification (required)
git diff --check; direct module import sanity for touched source files; direct sanity check that OpenAI Completions now strips the internal boundary marker.pnpm check,pnpm build, or live provider telemetry in this update path.Review Conversations
If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.
Compatibility / Migration
Yes)No)No)Risks and Mitigations
buildAgentSystemPrompt()so stable prompt context stays cacheable and only dynamic lab/session additions sit behind it.