[Bug]: anthropic-vertex-provider adds cache_control to active-memory system block — triggers "Found 5" error when active-memory is enabled #91982

@danieljimz

Environment

  • OpenClaw version: 2026.6.5
  • Plugin: @openclaw/anthropic-vertex-provider (same version)
  • Platform: GCE, Debian 12, Node 24
  • Provider: anthropic-vertex/claude-haiku-4-5 (primary), claude-sonnet-4-6 (fallback)

Summary

When active-memory is enabled for an agent using anthropic-vertex-provider,
the gateway intermittently rejects requests with:

FailoverError: LLM request rejected:
A maximum of 4 blocks with cache_control may be provided. Found 5.

The error is intermittent: it occurs only when active-memory finds relevant
memories and injects them. When no memories are found, the request succeeds.

Root cause (traced in source)

applyAnthropicCacheControlToSystem in src/agents/anthropic-payload-policy.ts
iterates over the system block array and adds cache_control: ephemeral to any
block that has no cache_control and no OPENCLAW_CACHE_BOUNDARY marker in its
text.
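The behavior described above can be modeled roughly as follows. This is a simplified sketch, not the actual contents of anthropic-payload-policy.ts: the SystemBlock shape, the function name applyCacheControlAsDescribed, and the exact marker literal are assumptions made for illustration.

```typescript
// Hypothetical block shape for illustration; the real OpenClaw types may differ.
type CacheControl = { type: "ephemeral" };
interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

// Marker name taken from the issue text; the exact literal is an assumption.
const CACHE_BOUNDARY_MARKER = "OPENCLAW_CACHE_BOUNDARY";

// Simplified model of the described behavior: tag every block that has
// neither cache_control nor the boundary marker in its text.
function applyCacheControlAsDescribed(blocks: SystemBlock[]): SystemBlock[] {
  return blocks.map((block): SystemBlock => {
    if (block.cache_control !== undefined) return block;
    if (block.text.indexOf(CACHE_BOUNDARY_MARKER) !== -1) return block;
    return { ...block, cache_control: { type: "ephemeral" } };
  });
}

// A system array that was already split upstream:
// a tagged stable prefix followed by an untagged dynamic suffix.
const alreadySplit: SystemBlock[] = [
  { type: "text", text: "stable prefix", cache_control: { type: "ephemeral" } },
  { type: "text", text: "dynamic suffix: date and context" },
];

const result = applyCacheControlAsDescribed(alreadySplit);
// Number of system blocks that now carry cache_control.
const taggedCount = result.filter((b) => b.cache_control !== undefined).length;
```

Under this model the untagged dynamic suffix also receives cache_control, so the system array alone contributes two tagged blocks.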

The native Anthropic provider (buildAnthropicSystemBlocks) correctly splits
the system prompt into a stable prefix (with cache_control, abbreviated CC
below) and a dynamic date/context suffix (without CC). active-memory then
injects recalled memories as prependContext into the user message, producing
a second user content block that also carries cache_control.

When the vertex plugin runs applyAnthropicCacheControlToSystem on the already-
split system array, it sees the dynamic suffix block (no boundary marker, no CC)
and adds CC to it. This produces:

#  Source    Block
1  system    stable prefix (CC, correct)
2  system    dynamic suffix (CC, added erroneously by vertex plugin)
3  tools     last tool definition (CC)
4  messages  prependContext block in user message (CC)
5  messages  actual user message block (CC)

Total: 5 → rejected by Anthropic API.

Without active-memory there is no prependContext, so the user message has a
single content block and messages contribute only 1 CC. The erroneous system
CC is still present, but the total stays at 4 and requests succeed.
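The arithmetic above can be checked with a small counting helper. This is a sketch under assumptions: the RequestPayload shape is a hypothetical, stripped-down stand-in for the real request body, and countCacheControlBlocks is not an existing OpenClaw function.

```typescript
// Hypothetical, simplified request shape; only what is needed for counting.
type CacheControl = { type: "ephemeral" };
interface Block {
  cache_control?: CacheControl;
}
interface Message {
  role: "user" | "assistant";
  content: Block[];
}
interface RequestPayload {
  system: Block[];
  tools: Block[];
  messages: Message[];
}

// Limit quoted in the API error message.
const MAX_CACHE_CONTROL_BLOCKS = 4;

// Count every block carrying cache_control across system, tools, and messages.
function countCacheControlBlocks(payload: RequestPayload): number {
  let count = 0;
  const tally = (blocks: Block[]): void => {
    for (const block of blocks) {
      if (block.cache_control !== undefined) count += 1;
    }
  };
  tally(payload.system);
  tally(payload.tools);
  for (const message of payload.messages) tally(message.content);
  return count;
}

const tagged: Block = { cache_control: { type: "ephemeral" } };
const plain: Block = {};

// Memory hit: prependContext block plus the actual user message block.
const withMemoryHit: RequestPayload = {
  system: [tagged, tagged],
  tools: [plain, tagged],
  messages: [{ role: "user", content: [tagged, tagged] }],
};

// Empty recall: a single user message block.
const withoutMemoryHit: RequestPayload = {
  system: [tagged, tagged],
  tools: [plain, tagged],
  messages: [{ role: "user", content: [tagged] }],
};

const hitCount = countCacheControlBlocks(withMemoryHit);
const missCount = countCacheControlBlocks(withoutMemoryHit);
const hitRejected = hitCount > MAX_CACHE_CONTROL_BLOCKS;
const missRejected = missCount > MAX_CACHE_CONTROL_BLOCKS;
```

A check like this could also run as a pre-flight assertion before the request is sent, so the overflow surfaces locally rather than as a 400 from the API.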

Steps to reproduce

  1. Configure anthropic-vertex-provider as primary model for an agent
  2. Enable active-memory for that agent in openclaw.json
  3. Ensure the agent has stored memories that match the user's query
  4. Send a message that triggers a memory recall hit (summaryChars > 0)
  5. The StreamRawPredict call to Vertex AI returns 400:
    {"type":"error","error":{"type":"invalid_request_error","message":"A maximum of 4 blocks with cache_control may be provided. Found 5."}}

Without a memory hit (empty recall), the same message succeeds.

GCP Cloud Logging evidence (real production errors)

2026-06-10T14:37:25Z  StreamRawPredict  claude-haiku-4-5
{"type":"error","error":{"type":"invalid_request_error","message":"A maximum of 4 blocks with cache_control may be provided. Found 5."},"request_id":"req_vrtx_011CbukPQbsjAGaKJVqrQhsv"}

2026-06-10T14:37:41Z  model_fallback_decision  candidate_succeeded
fallbackStepFromModel: anthropic-vertex/claude-haiku-4-5
fallbackStepToModel: anthropic-vertex/claude-sonnet-4-6

4 occurrences in a 3-hour window (13:07, 13:52, 14:05, 14:37 UTC 2026-06-10).

Suggested fix

applyAnthropicCacheControlToSystem should check whether the system array was
already processed by buildAnthropicSystemBlocks (i.e., whether any block
already carries cache_control) and skip adding CC to remaining blocks.
Alternatively, cap CC additions so the system array contributes at most 1
cache_control block total.
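A minimal sketch of the first option (skip when the array was already processed) might look like this. The SystemBlock shape, the marker literal, and the name applyCacheControlGuarded are assumptions for illustration and not the actual OpenClaw code.

```typescript
// Hypothetical block shape for illustration; the real OpenClaw types may differ.
type CacheControl = { type: "ephemeral" };
interface SystemBlock {
  type: "text";
  text: string;
  cache_control?: CacheControl;
}

// Marker name taken from the issue text; the exact literal is an assumption.
const CACHE_BOUNDARY_MARKER = "OPENCLAW_CACHE_BOUNDARY";

// Guarded variant: if any block already carries cache_control, treat the
// array as already split upstream and leave every block untouched.
function applyCacheControlGuarded(blocks: SystemBlock[]): SystemBlock[] {
  const alreadyProcessed = blocks.some((b) => b.cache_control !== undefined);
  if (alreadyProcessed) return blocks;
  return blocks.map((block): SystemBlock => {
    if (block.text.indexOf(CACHE_BOUNDARY_MARKER) !== -1) return block;
    return { ...block, cache_control: { type: "ephemeral" } };
  });
}

// Already split upstream: the dynamic suffix must stay untagged.
const splitUpstream: SystemBlock[] = [
  { type: "text", text: "stable prefix", cache_control: { type: "ephemeral" } },
  { type: "text", text: "dynamic suffix: date and context" },
];
const guarded = applyCacheControlGuarded(splitUpstream);
const guardedTagged = guarded.filter((b) => b.cache_control !== undefined).length;

// Not yet processed: a single untagged block still gets tagged.
const unsplit: SystemBlock[] = [{ type: "text", text: "whole system prompt" }];
const unsplitTagged = applyCacheControlGuarded(unsplit).filter(
  (b) => b.cache_control !== undefined,
).length;
```

With this guard the already-split array keeps a single tagged system block, which brings the memory-hit total back to 4. The capping alternative would need an additional decision about which block keeps the tag, so it is not sketched here.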

Workaround

None that preserves active-memory functionality. Disabling active-memory
eliminates the extra block and the error disappears.
