fix(agents): split system prompt cache prefix by transport #59054

Merged
vincentkoc merged 7 commits into main from vk/pr-53225-cache-split on Apr 4, 2026
Conversation

@vincentkoc

@vincentkoc vincentkoc commented Apr 1, 2026

Member

Summary

  • Problem: Anthropic-family prompt caching still lost KV reuse when dynamic lab/session/system additions changed, because the current transport stack no longer had the old cache seam and the shared payload policy cache-tagged the whole system block.
  • Why it matters: with many labs and large prompt surfaces, per-turn dynamic suffix churn rewrites expensive stable prompt prefixes and multiplies cache misses.
  • What changed: restored an internal system-prompt cache boundary as a shared helper, moved context-engine prompt additions behind that boundary, split Anthropic system blocks into cached stable and uncached dynamic regions in the shared payload policy, and stripped the internal marker before emission across non-Anthropic transports plus CLI backend system-prompt args.
  • What did NOT change (scope boundary): provider-visible prompt text is unchanged apart from removing the internal marker; no config/env changes were introduced; no new cache writes happen when cacheRetention: "none".
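A minimal sketch of what the shared boundary helper could look like. The marker text and the names `SYSTEM_PROMPT_CACHE_BOUNDARY` and `stripSystemPromptCacheBoundary` appear in the review thread below; the function bodies and the `splitSystemPromptCacheBoundary` helper are assumptions made for illustration, not the merged implementation.

```typescript
// Marker text as quoted in the review thread; everything else here is illustrative.
const SYSTEM_PROMPT_CACHE_BOUNDARY = "\n<!-- OPENCLAW_CACHE_BOUNDARY -->\n";

// Remove every marker, leaving a single newline where the seam was.
function stripSystemPromptCacheBoundary(prompt: string): string {
  return prompt.split(SYSTEM_PROMPT_CACHE_BOUNDARY).join("\n");
}

// Hypothetical helper: split at the first marker into stable and dynamic parts.
// Returns undefined when the prompt has no seam.
function splitSystemPromptCacheBoundary(
  prompt: string,
): { stablePrefix: string; dynamicSuffix: string } | undefined {
  const index = prompt.indexOf(SYSTEM_PROMPT_CACHE_BOUNDARY);
  if (index === -1) {
    return undefined;
  }
  return {
    stablePrefix: prompt.slice(0, index),
    dynamicSuffix: prompt.slice(index + SYSTEM_PROMPT_CACHE_BOUNDARY.length),
  };
}
```

With this shape, stripping `Stable prefix${SYSTEM_PROMPT_CACHE_BOUNDARY}Dynamic suffix` yields `"Stable prefix\nDynamic suffix"`, which matches the expectation in the OpenAI Responses test quoted later in this thread.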

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

Root Cause / Regression History (if applicable)

  • Root cause: the original fix lived in an older wrapper path, but upstream provider transport/KV work centralized Anthropic request shaping in shared payload policy code and removed the effective cache seam from the current path.
  • Secondary regressions: the initial refresh restored the Anthropic seam but missed the OpenAI Completions serializer path, and the CLI backend path still forwarded the internal cache boundary marker unchanged via systemPromptArg.
  • Missing detection / guardrail: there was no current-main regression test ensuring dynamic system additions, including context-engine prepends, stay outside the cached Anthropic prefix while the internal marker is stripped for all other emission paths.
  • Contributing context (if known): upstream merged a large amount of KV/provider transport work after the original PR branch was cut, so the old patch shape no longer matched the live request path.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/agents/system-prompt-cache-boundary.test.ts
    • src/agents/anthropic-payload-policy.test.ts
    • src/agents/openai-transport-stream.test.ts
    • src/agents/openai-ws-stream.test.ts
    • src/agents/cli-runner.helpers.test.ts
  • Scenario the test should lock in: Anthropic-family requests cache only the stable system prefix; dynamic lab/session suffix content stays uncached; OpenAI Responses, OpenAI Completions, OpenAI WebSocket, Google transport paths, and CLI backend system-prompt args strip the internal boundary marker before emission.
  • Why this is the smallest reliable guardrail: the regression lives in prompt assembly plus transport/CLI payload shaping, not in end-user workflow code.
  • Existing test that already covers this (if any): prompt stability coverage existed, but it did not protect the transport-layer cache seam.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

Anthropic-family sessions can preserve the stable system prompt prefix across turns again even when lab/group/session additions change later in the prompt. Other request paths keep seeing the same prompt text, with the internal boundary stripped before emission.

Diagram (if applicable)

Before:
stable prompt prefix + dynamic lab/session suffix -> one cached Anthropic system block -> suffix churn rewrites the full prefix
non-Anthropic/CLI emission paths -> some paths still forwarded the raw internal boundary marker

After:
stable prompt prefix -> cached Anthropic block
dynamic lab/session suffix -> uncached Anthropic block
non-Anthropic/CLI emission paths -> same prompt text with boundary stripped before emission
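The "After" shape can be sketched as follows. This assumes the Anthropic Messages API convention of `system` as an array of text blocks with an optional `cache_control: { type: "ephemeral" }`; the function name `buildAnthropicSystemBlocks` and the two-value `CacheRetention` type are hypothetical, and the real payload policy may differ (for example in how retention maps to cache TTLs).

```typescript
// Assumption: only the two retention values mentioned in this PR are modeled.
type CacheRetention = "none" | "long";
type SystemTextBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

const SYSTEM_PROMPT_CACHE_BOUNDARY = "\n<!-- OPENCLAW_CACHE_BOUNDARY -->\n";

// Hypothetical shaper: cached stable block first, uncached dynamic block second.
function buildAnthropicSystemBlocks(
  systemPrompt: string,
  cacheRetention: CacheRetention,
): SystemTextBlock[] {
  const strip = (text: string) => text.split(SYSTEM_PROMPT_CACHE_BOUNDARY).join("\n");
  if (cacheRetention === "none") {
    // No cache writes: emit the stripped text without any cache tag.
    return [{ type: "text", text: strip(systemPrompt) }];
  }
  const index = systemPrompt.indexOf(SYSTEM_PROMPT_CACHE_BOUNDARY);
  if (index === -1) {
    // No seam present: fall back to tagging the whole block.
    return [{ type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } }];
  }
  const stablePrefix = systemPrompt.slice(0, index);
  const dynamicSuffix = strip(systemPrompt.slice(index + SYSTEM_PROMPT_CACHE_BOUNDARY.length));
  const blocks: SystemTextBlock[] = [];
  // Skip empty blocks, since an empty text block would not be a valid payload entry.
  if (stablePrefix) {
    blocks.push({ type: "text", text: stablePrefix, cache_control: { type: "ephemeral" } });
  }
  if (dynamicSuffix) {
    blocks.push({ type: "text", text: dynamicSuffix });
  }
  return blocks;
}
```

Because only the first block carries the cache tag, a change in the dynamic suffix leaves the cached prefix byte-identical from one turn to the next.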

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 25 / pnpm
  • Model/provider: Anthropic-family payload shaping plus OpenAI/Google/WebSocket/CLI strip paths
  • Integration/channel (if any): N/A
  • Relevant config (redacted): prompt boundary present vs absent; cacheRetention long vs none

Steps

  1. Build a system prompt containing the internal cache boundary.
  2. Apply Anthropic payload policy with cache retention enabled.
  3. Apply non-Anthropic transport request builders or CLI backend arg builders using the same prompt.

Expected

  • Anthropic-family payloads split cached stable system content from uncached dynamic suffix content.
  • Other emission paths strip the internal boundary marker and emit unchanged prompt text semantics.

Actual

  • Before this update, current-main shaping cache-tagged the whole Anthropic system block and had no equivalent seam for later dynamic additions.
  • Before the follow-up fixes in this PR, the OpenAI Completions path and the CLI backend systemPromptArg path could still emit the raw internal marker.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: git diff --check; direct module import sanity for touched source files; direct sanity check that OpenAI Completions now strips the internal boundary marker.
  • Edge cases checked: boundary stripped when cache retention is disabled; context-engine prepend path moves additions behind the cache seam; non-Anthropic request builders strip the marker before emission, including OpenAI Completions; CLI backend system-prompt args also strip the marker before emission.
  • What you did not verify: full Vitest lanes, pnpm check, pnpm build, or live provider telemetry in this update path.
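One way to picture the "context-engine prepend path moves additions behind the cache seam" edge case is the sketch below. The helper name `addBehindCacheBoundary` and the blank-line separator are assumptions for illustration; the PR does not show the actual prepend code.

```typescript
const SYSTEM_PROMPT_CACHE_BOUNDARY = "\n<!-- OPENCLAW_CACHE_BOUNDARY -->\n";

// Hypothetical helper: place a context-engine addition in the dynamic region
// (after the seam) instead of at the very start of the prompt.
function addBehindCacheBoundary(systemPrompt: string, addition: string): string {
  const index = systemPrompt.indexOf(SYSTEM_PROMPT_CACHE_BOUNDARY);
  if (index === -1) {
    // No seam: fall back to a plain prepend.
    return `${addition}\n\n${systemPrompt}`;
  }
  const seamEnd = index + SYSTEM_PROMPT_CACHE_BOUNDARY.length;
  const stableWithMarker = systemPrompt.slice(0, seamEnd);
  const dynamic = systemPrompt.slice(seamEnd);
  return dynamic
    ? `${stableWithMarker}${addition}\n\n${dynamic}`
    : `${stableWithMarker}${addition}`;
}
```

A plain prepend would change the first bytes of the prompt on every turn and invalidate the cached prefix; inserting after the seam keeps the stable text untouched.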

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: an internal boundary marker could leak into provider requests on an emission path that was missed.
    • Mitigation: Anthropic strips/splits in shared payload policy; OpenAI Responses, OpenAI Completions, OpenAI WebSocket, Google builders, and CLI backend system-prompt args explicitly strip before emission.
  • Risk: moving the seam too early would reduce cache leverage for large stable bootstrap context.
    • Mitigation: the boundary is placed late in buildAgentSystemPrompt() so stable prompt context stays cacheable and only dynamic lab/session additions sit behind it.
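The second mitigation can be illustrated with a small sketch: when the seam sits after all stable sections, two turns with different dynamic additions still share an identical cacheable prefix. `assemblePrompt` is a hypothetical stand-in and not the real `buildAgentSystemPrompt()`; the section strings are made up.

```typescript
const SYSTEM_PROMPT_CACHE_BOUNDARY = "\n<!-- OPENCLAW_CACHE_BOUNDARY -->\n";

// Hypothetical stand-in for prompt assembly: stable sections, then the seam,
// then dynamic lab/session additions.
function assemblePrompt(stableSections: string[], dynamicSections: string[]): string {
  return stableSections.join("\n") + SYSTEM_PROMPT_CACHE_BOUNDARY + dynamicSections.join("\n");
}

// Text above the seam, i.e. the part an Anthropic-family transport could cache.
function cacheablePrefix(prompt: string): string {
  const index = prompt.indexOf(SYSTEM_PROMPT_CACHE_BOUNDARY);
  return index === -1 ? prompt : prompt.slice(0, index);
}

const stableSections = ["# Tooling", "# Workspace bootstrap context"];
const turn1 = assemblePrompt(stableSections, ["Session: alpha"]);
const turn2 = assemblePrompt(stableSections, ["Session: beta", "Group: ops"]);

// The full prompts differ, but the cacheable prefix is the same on both turns.
const prefixReused = cacheablePrefix(turn1) === cacheablePrefix(turn2);
```

Moving the seam earlier would shrink `cacheablePrefix` and leave more stable text in the uncached region, which is the cache-leverage risk named above.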

@vincentkoc vincentkoc self-assigned this Apr 1, 2026
@vincentkoc

Member Author

Supersedes #53225. Refreshed onto current main, revalidated locally, and kept the change scoped to Anthropic/Bedrock system-prompt cache splitting with boundary stripping on all paths.

@martingarramon martingarramon left a comment

Contributor

Clean approach — splitting at a known boundary is much better than trying to diff system prompt sections across turns.

One question on the hot path: the wrapper is applied unconditionally to ctx.agent.streamFn for all providers, so every request (OpenAI, Ollama, etc.) runs through the onPayload callback to strip the marker even when splitAndCache is false. The per-request cost is low (one .includes() per system block), but this is the streaming hot path. Was there a reason not to gate the wrapper application behind a check such as whether the system prompt actually contains the marker? Or is the simplicity of always-apply worth the negligible overhead?

Also noticed the test covers the array-system split path and the string-system strip path, but not the case where splitAndCache is true with a plain-string system prompt (the typeof system === "string" branch at line 59 of the new file). That path has a deliberate design choice — falling back to strip instead of split — which seems worth a test to document the intent.
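The two branches described in this comment could be sketched like this. `shapeSystem` and its signature are hypothetical and only loosely modeled on the description above; the reason given in the string-branch comment is an assumption based on the Anthropic payload format, where a plain-string `system` has no per-block `cache_control`.

```typescript
type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

const BOUNDARY = "\n<!-- OPENCLAW_CACHE_BOUNDARY -->\n";
const strip = (text: string): string => text.split(BOUNDARY).join("\n");

// Hypothetical payload shaper covering the string and array forms of `system`.
function shapeSystem(
  system: string | SystemBlock[],
  splitAndCache: boolean,
): string | SystemBlock[] {
  if (typeof system === "string") {
    // Assumed rationale: a plain string cannot carry a per-block cache tag,
    // so this branch strips even when splitAndCache is true.
    return strip(system);
  }
  const out: SystemBlock[] = [];
  for (const block of system) {
    // Cheap guard: blocks without the marker pass through untouched.
    if (!block.text.includes(BOUNDARY)) {
      out.push(block);
      continue;
    }
    if (!splitAndCache) {
      out.push({ ...block, text: strip(block.text) });
      continue;
    }
    const index = block.text.indexOf(BOUNDARY);
    out.push({
      type: "text",
      text: block.text.slice(0, index),
      cache_control: { type: "ephemeral" },
    });
    out.push({ type: "text", text: strip(block.text.slice(index + BOUNDARY.length)) });
  }
  return out;
}
```

A test for the uncovered case would call `shapeSystem` with a plain string and `splitAndCache` set to true, then assert that the result is the stripped string rather than a block array.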

@greptile-apps

greptile-apps Bot commented Apr 4, 2026

Contributor

Greptile Summary

This PR restores an Anthropic-family system prompt cache seam by introducing a new SYSTEM_PROMPT_CACHE_BOUNDARY marker, inserting it in buildAgentSystemPrompt, splitting Anthropic payloads into a cached stable block and an uncached dynamic block, and stripping the marker before emission on Google, OpenAI Responses, and WebSocket paths.

  • P1 – boundary leaks into completions path: buildOpenAICompletionsParams passes the raw context to the external convertMessages from @mariozechner/pi-ai/openai-completions without first stripping the marker (compare line 226 in the same file where convertResponsesMessages explicitly calls stripSystemPromptCacheBoundary). Every Completions-API provider (Mistral, Moonshot, xAI-completions, etc.) will receive `\n<!-- OPENCLAW_CACHE_BOUNDARY -->\n` embedded in the system message. The PR description states OpenAI builders explicitly strip before emission, but the completions builder is the missed path.

Confidence Score: 4/5

Safe to review but has one active defect: the OpenAI Completions path leaks the internal cache boundary marker into provider requests, contradicting the PR's stated mitigation.

The Anthropic split logic, Google strip, OpenAI Responses strip, and WebSocket strip are all correctly implemented and tested. One transport path — OpenAI Completions (buildOpenAICompletionsParams) — passes the raw context including the boundary marker to an external library without pre-stripping, causing the marker to appear in system messages for all Completions-API providers. This is a present defect in the changed code, not a speculative future risk, so it warrants a P1 and reduces the score to 4.

src/agents/openai-transport-stream.ts — specifically buildOpenAICompletionsParams (around line 1232) needs a stripSystemPromptCacheBoundary call before passing context to the external convertMessages, and a matching test in openai-transport-stream.test.ts.

Comments Outside Diff (1)

  1. src/agents/openai-transport-stream.ts, line 1232 (link)

    P1 Boundary marker leaks into completions path

    convertMessages from @mariozechner/pi-ai/openai-completions receives context with context.systemPrompt still containing the raw <!-- OPENCLAW_CACHE_BOUNDARY --> marker. Unlike convertResponsesMessages (line 226) which calls stripSystemPromptCacheBoundary(context.systemPrompt) before building the messages, this external function has no awareness of the internal marker, so it will embed the literal \n<!-- OPENCLAW_CACHE_BOUNDARY -->\n string in the system role message sent to every OpenAI Completions provider (Mistral, Moonshot, xAI-completions, etc.). The PR description states OpenAI builders explicitly strip before emission, but the completions builder is the missed path.

    A straightforward fix is to strip before constructing params and pass the cleaned context, or strip inside buildOpenAICompletionsParams itself before calling convertMessages.
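    The first option (strip, then pass a cleaned context) might look like the sketch below. `convertMessagesStandIn` is a made-up stand-in for the external `convertMessages`, whose real signature is not shown in this thread, and `withStrippedBoundary` is a hypothetical helper name.

```typescript
type Context = { systemPrompt?: string; messages: unknown[] };
type ChatMessage = { role: string; content: string };

const SYSTEM_PROMPT_CACHE_BOUNDARY = "\n<!-- OPENCLAW_CACHE_BOUNDARY -->\n";

function stripSystemPromptCacheBoundary(prompt: string): string {
  return prompt.split(SYSTEM_PROMPT_CACHE_BOUNDARY).join("\n");
}

// Stand-in for the external converter, which knows nothing about the marker
// and copies the system prompt through verbatim.
function convertMessagesStandIn(context: Context): ChatMessage[] {
  return context.systemPrompt ? [{ role: "system", content: context.systemPrompt }] : [];
}

// Hypothetical helper: return a copy of the context with the marker removed,
// leaving the caller's original context untouched.
function withStrippedBoundary<T extends { systemPrompt?: string }>(context: T): T {
  return context.systemPrompt
    ? { ...context, systemPrompt: stripSystemPromptCacheBoundary(context.systemPrompt) }
    : context;
}

const context: Context = {
  systemPrompt: `Stable prefix${SYSTEM_PROMPT_CACHE_BOUNDARY}Dynamic suffix`,
  messages: [],
};
const cleanMessages = convertMessagesStandIn(withStrippedBoundary(context));
```

    Copying the context rather than mutating it keeps the seam available to any other code that still needs the original prompt.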

Reviews (1): Last reviewed commit: "Merge branch 'main' into vk/pr-53225-cac..."

Comment on lines +443 to +466
it("strips the internal cache boundary from OpenAI system prompts", () => {
  const params = buildOpenAIResponsesParams(
    {
      id: "gpt-5.4",
      name: "GPT-5.4",
      api: "openai-responses",
      provider: "openai",
      baseUrl: "https://api.openai.com/v1",
      reasoning: true,
      input: ["text"],
      cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
      contextWindow: 200000,
      maxTokens: 8192,
    } satisfies Model<"openai-responses">,
    {
      systemPrompt: `Stable prefix${SYSTEM_PROMPT_CACHE_BOUNDARY}Dynamic suffix`,
      messages: [],
      tools: [],
    } as never,
    undefined,
  ) as { input?: Array<{ content?: string }> };

  expect(params.input?.[0]?.content).toBe("Stable prefix\nDynamic suffix");
});

Contributor

P2 No boundary-stripping test for the completions path

The existing test covers buildOpenAIResponsesParams stripping the cache boundary, but there is no equivalent test for buildOpenAICompletionsParams. This is the coverage gap that allows the leak described above to go undetected; adding a parallel test case for buildOpenAICompletionsParams with a boundary-containing systemPrompt would lock in the fix once the stripping logic is added.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b5dd1fee90

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/agents/openai-transport-stream.ts
Comment thread src/agents/system-prompt.ts
@vincentkoc vincentkoc changed the title from "fix(agents): split Anthropic system prompt cache prefix" to "fix(agents): split system prompt cache prefix by transport" on Apr 4, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c90fd94a64

// Keep large stable prompt context above this seam so Anthropic-family
// transports can reuse it across labs and turns. Dynamic group/session
// additions below it are the primary cache invalidators.
lines.push(SYSTEM_PROMPT_CACHE_BOUNDARY);

P1 Route default streamSimple calls through boundary stripping

buildAgentSystemPrompt now inserts SYSTEM_PROMPT_CACHE_BOUNDARY unconditionally, but the marker is only removed in transport-specific builders. For standard sessions without request.proxy/request.tls, registerProviderStreamForModel does not install those transport wrappers (src/agents/provider-transport-stream.ts), and resolveEmbeddedAgentStreamFn falls back to currentStreamFn/streamSimple (src/agents/pi-embedded-runner/run/attempt.ts:243-257), so provider requests still receive the raw <!-- OPENCLAW_CACHE_BOUNDARY --> text. This is a prompt-content regression on default provider paths and also skips the Anthropic stable/dynamic cache split outside the custom transport flow.

@vincentkoc vincentkoc force-pushed the vk/pr-53225-cache-split branch from 93e191d to 2614d0f on April 4, 2026 03:39

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2614d0fc0c

// Keep large stable prompt context above this seam so Anthropic-family
// transports can reuse it across labs and turns. Dynamic group/session
// additions below it are the primary cache invalidators.
lines.push(SYSTEM_PROMPT_CACHE_BOUNDARY);

P1 Strip cache boundary before provider-owned stream calls

buildAgentSystemPrompt now unconditionally inserts SYSTEM_PROMPT_CACHE_BOUNDARY, but provider-owned createStreamFn paths are not routed through the new boundary-stripping transports. For example, registerProviderStreamForModel can return the Ollama plugin stream (extensions/ollama/index.ts), and that stream forwards context.systemPrompt directly into the outbound system message (extensions/ollama/src/stream.ts), so requests will now include the internal <!-- OPENCLAW_CACHE_BOUNDARY --> marker in provider-visible prompt text. This is a prompt-content regression for bundled/custom provider plugins that do not implement their own stripping.

@vincentkoc vincentkoc merged commit 64f2890 into main Apr 4, 2026
7 checks passed
@vincentkoc vincentkoc deleted the vk/pr-53225-cache-split branch April 4, 2026 04:32

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7340201cc0

Comment on lines +233 to +237
context.systemPrompt
  ? {
      ...context,
      systemPrompt: stripSystemPromptCacheBoundary(context.systemPrompt),
    }

P1 Preserve Anthropic cache seam in provider stream wrapper

resolveEmbeddedAgentStreamFn now strips SYSTEM_PROMPT_CACHE_BOUNDARY for every providerStreamFn call, but Anthropic proxy/TLS models reach this branch via registerProviderStreamForModel and rely on applyAnthropicPayloadPolicyToParams to split cached stable system text from dynamic suffix text. Stripping first removes that seam, so those Anthropic requests revert to cache-tagging the whole system block and lose prefix cache reuse whenever dynamic additions change.
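One possible shape for a wrapper that avoids this is sketched below: strip for providers that have no use for the seam, and pass the marker through for Anthropic-family APIs so the downstream payload policy can still split on it. The `StreamFn` signature is heavily simplified, and both `wrapProviderStream` and the contents of `SEAM_AWARE_APIS` are assumptions for illustration, not taken from the repository.

```typescript
type Context = { systemPrompt?: string };
// Simplified, hypothetical signature; the real stream function takes more arguments.
type StreamFn = (api: string, context: Context) => string;

const SYSTEM_PROMPT_CACHE_BOUNDARY = "\n<!-- OPENCLAW_CACHE_BOUNDARY -->\n";

// Assumption: APIs whose payload shaping consumes the seam itself.
const SEAM_AWARE_APIS = new Set<string>(["anthropic-messages"]);

function wrapProviderStream(inner: StreamFn): StreamFn {
  return (api, context) => {
    if (SEAM_AWARE_APIS.has(api) || !context.systemPrompt) {
      // Leave the marker in place; the payload policy splits and strips later.
      return inner(api, context);
    }
    // Every other provider gets the prompt with the marker removed.
    const systemPrompt = context.systemPrompt.split(SYSTEM_PROMPT_CACHE_BOUNDARY).join("\n");
    return inner(api, { ...context, systemPrompt });
  };
}
```

The trade-off is that the wrapper now needs to know which APIs are seam-aware, so that list has to stay in sync with the payload policy.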

Comment on lines +71 to +72
if (!isTransportAwareApiSupported(model.api)) {
  return undefined;

P2 Handle boundary stripping for non-transport-aware APIs

createBoundaryAwareStreamFnForModel returns undefined for APIs outside SUPPORTED_TRANSPORT_APIS, so those sessions fall back to streamSimple without boundary cleanup. Since the resolver still accepts APIs like bedrock-converse-stream, github-copilot, and openai-codex-responses (src/agents/pi-embedded-runner/model.ts:103-114), those providers now receive the internal <!-- OPENCLAW_CACHE_BOUNDARY --> marker in prompt text.

KimGLee pushed a commit to KimGLee/openclaw that referenced this pull request Apr 4, 2026
…59054)

* fix(agents): restore Anthropic prompt cache seam

* fix(agents): strip cache boundary for completions

* fix(agents): strip cache boundary for cli backends

* chore(changelog): note cross-transport cache boundary rollout

* fix(agents): route default stream fallbacks through boundary shapers

* fix(agents): strip cache boundary for provider streams
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
Labels

agents (Agent runtime and tooling), maintainer (Maintainer-authored PR), size: L

Development

Successfully merging this pull request may close these issues.

  • cache_control not applied to system prompt on direct Anthropic provider path (cacheRead=0)
  • [Feature]: Multi-Block Prompt Caching for Anthropic Models

2 participants