fix(anthropic-vertex): stop re-marking cache_control on transport-budgeted payloads#92387

Merged
clawsweeper[bot] merged 1 commit into
openclaw:mainfrom
openperf:fix/91982-vertex-cache-control-double-apply
Jun 12, 2026
Conversation

@openperf

Member

Summary

  • Problem: Issue #91982 reports that anthropic-vertex requests are intermittently rejected with "FailoverError: LLM request rejected: A maximum of 4 blocks with cache_control may be provided. Found 5." In the reporter's production deployment the rejection fires whenever active-memory recall injects context into the turn, forcing a model fallback on every hit; the only workaround is disabling active-memory.
  • Root cause: the request passes through two cache-control writers. The shared anthropic-messages transport (src/llm/providers/anthropic.ts) builds the payload with full marker budgeting: it splits the system prompt at the cache boundary into a stable prefix (with cache_control) and an intentionally uncached dynamic suffix, marks the tool array, and gives the message pass only the remaining budget (4 - system - tools), so legitimate payloads can carry exactly four markers. The vertex plugin then ran applyAnthropicPayloadPolicyToParams again on that finished payload inside its onPayload hook (createAnthropicVertexOnPayload in extensions/anthropic-vertex/stream-runtime.ts). That helper is written for raw payloads (the agents-side anthropic-transport-stream.ts builds its system blocks with no markers and boundary text intact): any system text block with no boundary marker and no cache_control gets a marker added unconditionally. Post-transport, the dynamic suffix is exactly such a block, so the shim adds a fifth marker and Anthropic rejects the request. Turns that don't fill the budget stay at four and pass, which is why the failure tracks active-memory hits rather than every request.
  • Provenance: the shim was added in 1a13c34f5b ("close cache boundary transport gaps"), when the then-external transport did not understand the OpenClaw cache boundary and the plugin-level policy pass was the only splitter. eef24d452f ("preserve provider prompt cache boundaries", shipped in v2026.6.2-beta.1) moved boundary splitting and marker budgeting natively into the shared transport, which turned the shim from a gap-closer into a double-application. The reporter's first failures on 2026.6.5 line up with that release.
  • Fix: delete the shim and forward the caller's onPayload hook unchanged. The shared transport is the single owner of cache-control budgeting for this API family, which is already how the plain anthropic provider behaves (its wrapper applies the payload policy only for service_tier, without enableCacheControl).
  • What changed: extensions/anthropic-vertex/stream-runtime.ts removes createAnthropicVertexOnPayload and its policy imports and passes options?.onPayload straight through (net −35 lines); the stale tests that encoded the pre-split payload shape are replaced with regression tests for the budgeted-payload invariant.
  • What did NOT change (scope boundary): config surface unchanged (no schema, defaults, doctor migrations, or docs/reference/config). Plugin surface unchanged (no exported signature, manifest, api.ts, or SDK changes; createAnthropicVertexStreamFn keeps its signature). The shared transport, the agents-side anthropic transport (which builds raw payloads and correctly applies the policy once), and the anthropic plugin's service-tier wrapper are untouched.
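
The marker arithmetic behind the root cause can be sketched in a few lines. This is a minimal illustration under simplified, hypothetical types: Payload, countMarkers, messageBudget, and remarkSystemBlocks are invented for this sketch and are not the OpenClaw implementation. remarkSystemBlocks only stands in for the described behavior of marking every system text block that has no cache_control.

```typescript
// Hypothetical, simplified payload shapes (not the real OpenClaw types).
type CacheControl = { type: "ephemeral" };
type Markable = { cache_control?: CacheControl };
type TextBlock = Markable & { type: "text"; text: string };
type ToolDef = Markable & { name: string };
type ContentBlock = Markable & { type: "text" | "tool_result" };
type Message = { role: "user" | "assistant"; content: ContentBlock[] };
type Payload = { system: TextBlock[]; tools: ToolDef[]; messages: Message[] };

const MAX_MARKERS = 4; // cap on blocks carrying cache_control, per the error text
const EPHEMERAL: CacheControl = { type: "ephemeral" };

const isMarked = (b: Markable): boolean => b.cache_control !== undefined;

// Count every block in the payload that carries a cache_control marker.
function countMarkers(p: Payload): number {
  const blocks: Markable[] = [...p.system, ...p.tools];
  for (const m of p.messages) blocks.push(...m.content);
  return blocks.filter(isMarked).length;
}

// Budget left for the message pass: 4 - system - tools.
function messageBudget(p: Payload): number {
  const used = p.system.filter(isMarked).length + p.tools.filter(isMarked).length;
  return MAX_MARKERS - used;
}

// Stand-in for a raw-payload policy pass: marks every unmarked system text block.
function remarkSystemBlocks(p: Payload): Payload {
  const system = p.system.map(
    (b): TextBlock => (isMarked(b) ? b : { ...b, cache_control: EPHEMERAL }),
  );
  return { ...p, system };
}

// A fully budgeted payload: stable prefix, tools, user text, trailing tool_result.
const budgeted: Payload = {
  system: [
    { type: "text", text: "stable prefix", cache_control: EPHEMERAL },
    { type: "text", text: "dynamic suffix" }, // intentionally uncached
  ],
  tools: [{ name: "example_tool", cache_control: EPHEMERAL }],
  messages: [
    {
      role: "user",
      content: [
        { type: "text", cache_control: EPHEMERAL },
        { type: "tool_result", cache_control: EPHEMERAL },
      ],
    },
  ],
};
```

In this sketch countMarkers(budgeted) is 4 and messageBudget(budgeted) is 2, while countMarkers(remarkSystemBlocks(budgeted)) is 5: the second pass marks the dynamic suffix and pushes an otherwise valid payload over the cap.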

Reproduction

  1. Configure anthropic-vertex as the primary provider with prompt caching at its defaults, and enable active-memory for the agent.
  2. Drive a turn whose payload legitimately uses the full marker budget — an agentic turn where the transport marks the stable system prefix, the tool array, the last user text block, and the trailing tool_result fallback (the reporter hits this whenever a memory recall injects context).
  3. The system prompt contains the cache boundary, so the transport emits a cached stable prefix plus an uncached dynamic suffix.
  4. Before this PR: the vertex onPayload shim re-runs the payload policy on the finished payload, adds cache_control to the dynamic suffix, and Vertex StreamRawPredict returns 400 A maximum of 4 blocks with cache_control may be provided. Found 5., triggering a model fallback.
  5. After this PR: the payload reaches the wire exactly as the transport budgeted it (at most four markers, dynamic suffix uncached) and the request is accepted.
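
Steps 4 and 5 differ only in what the vertex stream function hands to the transport as its payload hook. The sketch below contrasts the two shapes. All names (StreamOptions, PolicyPass, transportOptionsBefore, transportOptionsAfter) are hypothetical, the hook signature is simplified to a synchronous function, and the order in which the removed shim ran the caller's hook and the policy pass is an assumption.

```typescript
// Hypothetical, simplified hook and option types (assumptions for this sketch).
type OnPayload = (payload: unknown) => unknown;
type PolicyPass = (payload: unknown) => unknown;
interface StreamOptions {
  onPayload?: OnPayload | undefined;
}

// Before: a wrapper hook is always installed and re-runs a policy pass
// on the finished payload, even when the caller supplied no hook.
function transportOptionsBefore(
  options: StreamOptions | undefined,
  policy: PolicyPass,
): StreamOptions {
  return {
    onPayload: (payload) => {
      const callerHook = options?.onPayload;
      const next = callerHook ? (callerHook(payload) ?? payload) : payload;
      return policy(next); // second cache-control writer
    },
  };
}

// After: the caller's hook is forwarded unchanged; no hook in, no hook out.
function transportOptionsAfter(options?: StreamOptions): StreamOptions {
  return { onPayload: options?.onPayload };
}
```

With the pass-through shape, the transport stays the only writer of cache_control markers, and a caller that supplies no hook leaves the transport without one.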

Real behavior proof

Behavior addressed (#91982): an anthropic-vertex payload that already carries the transport's full cache-control budget is no longer re-marked on its way out, so requests that previously breached Anthropic's four-marker cap with Found 5 now keep exactly the budgeted markers and the dynamic system suffix stays uncached.

Real environment tested (Linux x64, Node v22.22.3 — Vitest against the production vertex stream runtime): createAnthropicVertexStreamFn from the shipped plugin module, exercised through its injectable transport seam; the regression payload mirrors the shared transport's real output shape (split system prefix/suffix, marked tools, marked user text, marked trailing tool_result).

Exact steps or command run after this patch: pnpm test extensions/anthropic-vertex/stream-runtime.test.ts; pnpm tsgo:extensions && pnpm tsgo:extensions:test; node scripts/run-oxlint.mjs and format check on the changed files; .agents/skills/autoreview/scripts/autoreview --engine claude --thinking claude=max.

Evidence after fix (Vitest output for the touched test file):

 Test Files  1 passed (1)
      Tests  25 passed (25)

Observed result after fix: the transport options forward the caller's payload hook unchanged; a payload carrying the transport's four budgeted markers keeps exactly four after the hook path runs, with the dynamic suffix block still uncached; when the caller supplies no hook, the transport sets none.

What was not tested: a live Vertex AI StreamRawPredict round-trip against GCP (no Vertex credentials in this environment). The reporter's GCP Cloud Logging captures in #91982 document the live failure mode this removes.

Repro confirmation: both new tests fail on the unpatched tree — the marker count comes back as expected 5 to be 4, the same five-marker breach the reporter logged, and a transport-injected hook appears where none was supplied — and pass with the fix, so the tests cover the production change.
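
The invariant those regression tests pin down can be approximated as a standalone check. The following is a rough sketch, not the test code from this PR: countCacheControl, runHookPath, and assertWithinBudget are hypothetical helpers, and the payload is treated as arbitrary JSON.

```typescript
// Hypothetical, simplified hook type (assumption for this sketch).
type OnPayload = (payload: unknown) => unknown;

// Recursively count non-null cache_control properties anywhere in a JSON-like value.
function countCacheControl(value: unknown): number {
  if (Array.isArray(value)) {
    return value.reduce<number>((n, item) => n + countCacheControl(item), 0);
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    let n = 0;
    for (const key of Object.keys(record)) {
      const child = record[key];
      n += key === "cache_control" && child != null ? 1 : countCacheControl(child);
    }
    return n;
  }
  return 0;
}

// Run the caller's hook if present; a hook that returns nothing keeps the payload.
function runHookPath(payload: unknown, onPayload?: OnPayload): unknown {
  if (!onPayload) return payload;
  const result = onPayload(payload);
  return result === undefined ? payload : result;
}

// Throw when a payload carries more markers than the cap allows.
function assertWithinBudget(payload: unknown, max = 4): void {
  const found = countCacheControl(payload);
  if (found > max) {
    throw new Error(`cache_control budget exceeded: found ${found}, max ${max}`);
  }
}
```

A check of this kind passes for a four-marker payload that goes through an inspect-only hook and throws as soon as any stage adds a fifth marker.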

Risk / Mitigation

  • Risk: vertex payloads could lose prompt caching if nothing re-applied markers. Mitigation: the shared anthropic-messages transport applies cache-control budgeting for every request on this API family before onPayload runs, and the vertex stream function routes all requests through it — the same single-owner arrangement the plain anthropic provider already ships with.
  • Risk: a downstream onPayload consumer relied on the shim re-shaping its returned payload. Mitigation: every runtime supplier was audited (anthropic payload logging, command delivery logging, diagnostics byte accounting, and the before_provider_request plugin hook); all of them inspect or patch the finished payload and none construct boundary-marked system arrays, and the plain anthropic path likewise sends hook results without re-policing.
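
For the second risk, the kind of consumer the audit describes looks roughly like the hook below: it inspects the finished payload and returns it untouched, so it does not depend on any re-policing afterwards. This is an illustrative sketch only. createSizeAccountingHook is a hypothetical name, the hook signature is simplified, and the serialized character count is used as a stand-in for real byte accounting.

```typescript
// Hypothetical, simplified hook type (assumption for this sketch).
type OnPayload = (payload: unknown) => unknown;

// Diagnostics-style hook: records the serialized size and returns the payload as-is.
function createSizeAccountingHook(record: (size: number) => void): OnPayload {
  return (payload) => {
    // Character count of the JSON form, a rough proxy for bytes on the wire.
    record(JSON.stringify(payload).length);
    return payload; // no re-shaping, no new cache_control markers
  };
}

// Usage: collect sizes for each finished payload.
const recordedSizes: number[] = [];
const sizeHook = createSizeAccountingHook((size) => recordedSizes.push(size));
```

Because such a hook never rebuilds the system array, forwarding it unchanged leaves the transport's marker budget intact.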

Change Type (select all)

  • Bug fix

Scope (select all touched areas)

  • Integrations

Linked Issue/PR

Fixes #91982

@clawsweeper

clawsweeper Bot commented Jun 12, 2026

Contributor

Codex review: passed. Reviewed June 12, 2026, 8:58 AM ET / 12:58 UTC.

Summary
The PR removes the Anthropic Vertex adapter’s redundant cache-control payload-policy pass, forwards caller payload hooks unchanged, and adds regressions for preserving transport-budgeted payloads.

PR surface: Source -35, Tests -11. Total -46 across 2 files.

Reproducibility: yes, at source level. Current main reapplies cache policy to a finalized, fully budgeted payload, and the linked production logs show the corresponding five-marker rejection; this review did not run a live post-fix GCP request.

Review metrics: none identified.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] A live post-fix Vertex AI request was not run, so the final provider acceptance is inferred from the removed extra mutation, focused runtime tests, and the reporter’s live pre-fix logs.

Maintainer options:

  1. Decide the mitigation before merge
    Keep the shared Anthropic request builder as the sole cache-control budget owner, with the Vertex adapter limited to client adaptation and transparent forwarding of finalized-payload hooks.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • No automated repair is needed; the clean exact-head patch can proceed through the existing automerge checks and normal owner review gates.

Security
Cleared: The patch removes local payload mutation and updates focused tests without changing dependencies, credentials, permissions, workflows, package resolution, or executable supply-chain inputs.

Review details

Best possible solution:

Keep the shared Anthropic request builder as the sole cache-control budget owner, with the Vertex adapter limited to client adaptation and transparent forwarding of finalized-payload hooks.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level. Current main reapplies cache policy to a finalized, fully budgeted payload, and the linked production logs show the corresponding five-marker rejection; this review did not run a live post-fix GCP request.

Is this the best way to solve the issue?

Yes. Removing the obsolete duplicate policy owner is narrower and less drift-prone than adding finalized-payload detection or another marker cap inside the Vertex plugin.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against 1bd04ac98389.

Label changes

Label changes:

  • add status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Not applicable: The author is a repository member, so the external-contributor proof gate does not apply; focused production-runtime tests and linked live pre-fix GCP logs provide strong supporting evidence.
  • remove status: 👀 ready for maintainer look: Current PR status label is status: 🚀 automerge armed.

Label justifications:

  • P1: The shipped Anthropic Vertex path can reject valid memory-hit requests and force model fallback whenever the legitimate four-marker cache budget is filled.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane. Not applicable: The author is a repository member, so the external-contributor proof gate does not apply; focused production-runtime tests and linked live pre-fix GCP logs provide strong supporting evidence.
Evidence reviewed

PR surface:

Source -35, Tests -11. Total -46 across 2 files.

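PR surface stats:

| Area      | Files | Added | Removed | Net |
|-----------|-------|-------|---------|-----|
| Source    | 1     | 5     | 40      | -35 |
| Tests     | 1     | 56    | 67      | -11 |
| Docs      | 0     | 0     | 0       | 0   |
| Config    | 0     | 0     | 0       | 0   |
| Generated | 0     | 0     | 0       | 0   |
| Other     | 0     | 0     | 0       | 0   |
| Total     | 2     | 61    | 107     | -46 |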

What I checked:

Likely related people:

  • vincentkoc: Authored the original Vertex-side cache-boundary gap closer and subsequent prompt-cache stabilization work. (role: introduced behavior; confidence: high; commits: 1a13c34f5bc8, 577983172307; files: src/agents/anthropic-vertex-stream.ts)
  • steipete: Authored the native Anthropic transport cache-boundary implementation in the merged PR #89460 (fix(models): preserve provider prompt cache boundaries) and has substantial history on the Vertex provider path. (role: shared transport contributor; confidence: high; commits: eef24d452fb5, 4ca07559abe4; files: src/llm/providers/anthropic.ts, src/agents/anthropic-vertex-stream.ts)
  • sallyom: Introduced the Anthropic Vertex provider in the original merged feature commit. (role: feature introducer; confidence: medium; commits: 6e20c4baa093; files: src/agents/anthropic-vertex-stream.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added the rating: 🌊 off-meta tidepool PR readiness rating does not apply to this item. label Jun 12, 2026
@openperf

Member Author

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 12, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added the labels rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected), status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR), P1 (High-priority user-facing bug, regression, or broken workflow), and rating: 🦞 diamond lobster (Very strong PR readiness with only minor maintainer review expected), and removed the labels rating: 🌊 off-meta tidepool (PR readiness rating does not apply to this item) and rating: 🐚 platinum hermit (Good normal PR readiness with ordinary maintainer review expected), Jun 12, 2026
@clawsweeper

clawsweeper Bot commented Jun 12, 2026

Contributor

🦞👀
ClawSweeper picked this up.

Command router queued. I will update this comment with the next step.

@Takhoffman

Contributor

@clawsweeper automerge

@clawsweeper

clawsweeper Bot commented Jun 12, 2026

Contributor

🦞✅
ClawSweeper merged this PR after the passing review.

Source: clawsweeper[bot]
Feedback: structured ClawSweeper verdict: pass (sha=6ef19602bfb48a819e096b23b66ece94a8072ed4)
Merge status: merged by ClawSweeper automerge
Merged at: 2026-06-12T12:59:03Z
Merge commit: 0fc5a57a3440

What merged:

  • The PR removes the Anthropic Vertex adapter’s redundant cache-control payload-policy pass, forwards caller payload hooks unchanged, and adds regressions for preserving transport-budgeted payloads.
  • PR surface: Source -35, Tests -11. Total -46 across 2 files.
  • Reproducibility: yes, at source level. Current main reapplies cache policy to a finalized, fully budgeted pa ... ion logs show the corresponding five-marker rejection; this review did not run a live post-fix GCP request.

Automerge notes:

  • No ClawSweeper repair was needed after automerge opt-in.

The automerge loop is complete.

Automerge progress:

  • 2026-06-12 12:52:41 UTC review queued 6ef19602bfb4 (queued)
  • 2026-06-12 12:58:48 UTC review passed 6ef19602bfb4 (structured ClawSweeper verdict: pass (sha=6ef19602bfb48a819e096b23b66ece94a8072...)
  • 2026-06-12 12:59:08 UTC merged 6ef19602bfb4 (merged by ClawSweeper automerge)

@clawsweeper clawsweeper Bot added the labels clawsweeper:automerge (Maintainer opted this PR into bounded ClawSweeper-reviewed automerge) and status: 🚀 automerge armed (This PR is in ClawSweeper's automerge lane), and removed the label status: 👀 ready for maintainer look (ClawSweeper has no concrete contributor-facing blocker left for this PR), Jun 12, 2026
@clawsweeper clawsweeper Bot merged commit 0fc5a57 into openclaw:main Jun 12, 2026
229 of 240 checks passed
Labels

  • clawsweeper:automerge: Maintainer opted this PR into bounded ClawSweeper-reviewed automerge
  • extensions: anthropic-vertex
  • P1: High-priority user-facing bug, regression, or broken workflow.
  • rating: 🦞 diamond lobster: Very strong PR readiness with only minor maintainer review expected.
  • size: S
  • status: 🚀 automerge armed: This PR is in ClawSweeper's automerge lane.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: anthropic-vertex-provider adds cache_control to active-memory system block — triggers "Found 5" error when active-memory is enabled

2 participants