fix(anthropic): strip all thinking blocks from historical messages #88941

Closed
bladin wants to merge 1 commit into openclaw:main from bladin:fix/anthropic-thinking-expired-signatures-88932
Conversation

bladin commented Jun 1, 2026

Contributor

Summary

Fixes #88932 — Anthropic thinking blocks with expired signatures cause crash loop on replay.

Anthropic thinking signatures are time-limited and single-use. When historical thinking blocks with expired signatures are replayed to the Anthropic Messages API, the API rejects them with "Invalid signature in thinking block" errors, and the session stays bricked until it is manually reset.

Changes

  • src/llm/providers/transform-messages.ts: Strip ALL thinking blocks from historical assistant messages, regardless of signature presence. The current-turn thinking block is preserved by the stream handler, not by transcript replay.
  • src/llm/providers/anthropic.test.ts: Update test to reflect new behavior — expects empty content instead of preserved thinking blocks.
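
A minimal sketch of the behavior described in the two changes above, assuming a simplified transcript shape. The `Block` and `Message` types and the `stripHistoricalThinking` name are hypothetical and are not taken from `transform-messages.ts`; only the `thinkingSignature` field name appears elsewhere in this thread.

```typescript
// Hypothetical, simplified transcript types; the repository's real types differ.
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; thinkingSignature?: string };

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

// Drop every thinking block from assistant messages, signed or not.
// User messages pass through unchanged and the input array is not mutated.
function stripHistoricalThinking(history: Message[]): Message[] {
  return history.map((msg) =>
    msg.role === "assistant"
      ? { ...msg, content: msg.content.filter((b) => b.type !== "thinking") }
      : msg,
  );
}

const history: Message[] = [
  { role: "user", content: [{ type: "text", text: "hi" }] },
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "plan", thinkingSignature: "sig-1" },
      { type: "text", text: "hello" },
    ],
  },
];

const stripped = stripHistoricalThinking(history);
console.log(stripped.map((m) => m.content.map((b) => b.type)));
```

An assistant message that contains only thinking blocks ends up with an empty `content` array, which matches the "expects empty content" note for the updated test.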

Why this fix

The previous logic preserved thinking blocks with signatures for same-model replay. However, Anthropic signatures expire and become invalid, causing API errors. There is no valid reason to replay thinking blocks from earlier turns — they are intermediate reasoning state, not user-visible content.

Risk checklist

  • User-visible behavior changed: Yes — historical thinking blocks no longer replay. This prevents crash loops and session bricking.
  • Config/environment/migration changed: No.
  • Security/auth/network changed: No.
  • Highest-risk area: Over-stripping thinking blocks that should persist. Mitigated by preserving current-turn thinking via stream handler (unchanged).

Tests and validation

  • Code compiles successfully with bun build
  • Test updated to match new behavior
  • No other tests affected (other provider tests use different code paths)

Real behavior proof

This fix prevents the following error cascade described in #88932:

  1. Extended thinking blocks accumulate time-limited signatures
  2. Signatures expire after container restart or time passage
  3. Next LLM call fails: messages.X.content.Y: Invalid signature in thinking block
  4. All subsequent calls fail — session is bricked
  5. Only recovery: manual session transcript deletion

After this fix, thinking blocks are stripped before provider conversion, preventing the error entirely.

Strip ALL thinking blocks from historical assistant messages to avoid
"Invalid signature in thinking block" errors when Anthropic thinking
signatures expire.

Anthropic thinking signatures are time-limited and single-use. Replaying
them causes API errors that brick sessions until manual reset. The
current-turn thinking block is preserved by the stream handler, not by
transcript replay.

Fixes openclaw#88932
openclaw-barnacle (bot) added labels on Jun 1, 2026: size: XS; triage: needs-real-behavior-proof (Candidate: external PR needs after-fix proof from a real setup).

clawsweeper Bot commented Jun 1, 2026

Contributor

Codex review: needs real behavior proof before merge. Reviewed June 1, 2026, 2:06 AM ET / 06:06 UTC.

Summary
The branch changes the shared message transformer to drop non-redacted thinking blocks and updates the Anthropic provider test to expect an empty assistant content payload.

PR surface: Source -10, Tests -11. Total -21 across 2 files.

Reproducibility: no. No high-confidence live reproduction was established. Source inspection shows current main can replay same-model signed thinking blocks, and the linked report describes Anthropic invalid-signature failures, but this review did not run a live Anthropic replay and current recovery may already retry some failures.

Review metrics: 1 noteworthy metric.

  • Shared replay helper touched: 1 helper used by 5 provider converters. The Anthropic-specific bug fix currently changes a provider-agnostic path that feeds Anthropic, Google, OpenAI Responses, Mistral, and OpenAI Completions.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🧂 unranked krab
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Scope the code change to Anthropic replay/conversion or recovery instead of the shared transform helper.
  • [P1] Add or preserve regression coverage showing OpenAI Responses and Google same-model thinking replay still preserve signatures.
  • [P1] Add redacted after-fix Anthropic runtime output, logs, or captured payload proof; after updating the PR body, ClawSweeper should re-review automatically or a maintainer can comment @clawsweeper re-review.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The PR body gives a failure narrative and a build claim, but no redacted after-fix Anthropic run output, logs, terminal screenshot, or captured request payload. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • [P1] Merging this as written would intentionally remove signed/replayable thinking context for providers beyond Anthropic, including OpenAI Responses reasoning items and Google thought signatures.
  • [P1] The PR body describes the failure cascade but does not provide redacted after-fix output from a real Anthropic replay or captured request payload.

Maintainer options:

  1. Scope the fix to Anthropic replay (recommended)
    Move the stripping decision into Anthropic replay policy, recovery, or conversion so OpenAI/Google/Mistral signed reasoning replay remains intact.
  2. Accept global thinking loss explicitly
    Maintainers could choose to drop all historical thinking globally, but that needs explicit approval plus tests for every affected provider contract.
  3. Pause until proof is added
    Keep the PR open but unmerged until the branch includes a provider-scoped fix and redacted real Anthropic replay proof.
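
Option 1 could look roughly like the sketch below: the stripping decision depends on the target provider, so history sent to other providers keeps its signed reasoning. The `TargetApi` identifiers, the types, and `replayHistoryFor` are illustrative assumptions, not the repository's actual replay policy, and the sketch does not cover cases where Anthropic replay still needs signed thinking.

```typescript
// Hypothetical, simplified types; names are assumptions for illustration only.
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; thinkingSignature?: string };

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

// Hypothetical provider identifiers, not the repository's real ones.
type TargetApi = "anthropic" | "openai-responses" | "google";

// Strip thinking only when the history is about to be sent to Anthropic.
// Every other provider receives the history untouched.
function replayHistoryFor(api: TargetApi, history: Message[]): Message[] {
  if (api !== "anthropic") return history;
  return history.map((msg) =>
    msg.role === "assistant"
      ? { ...msg, content: msg.content.filter((b) => b.type !== "thinking") }
      : msg,
  );
}

const sample: Message[] = [
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "plan", thinkingSignature: "sig-1" },
      { type: "text", text: "done" },
    ],
  },
];

console.log(replayHistoryFor("anthropic", sample)[0].content.length);
console.log(replayHistoryFor("openai-responses", sample)[0].content.length);
```

Keeping the branch outside the shared helper means a regression test per provider can assert that signed blocks survive for every non-Anthropic target.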

Next step before merge

  • [P1] This needs contributor or maintainer revision because the code repair must be provider-scoped and the contributor must add real Anthropic proof; automation should not repair a proof-missing external PR.

Security
Cleared: The supplied diff only changes provider message transformation and a test; no concrete security or supply-chain concern was found.

Review findings

  • [P1] Keep thinking replay scoped to Anthropic — src/llm/providers/transform-messages.ts:115
Review details

Best possible solution:

Keep the shared transform replay semantics intact, then implement an Anthropic-scoped replay policy, recovery, or converter repair with regression tests proving other providers still preserve their signed reasoning context.

Do we have a high-confidence way to reproduce the issue?

No high-confidence live reproduction was established. Source inspection shows current main can replay same-model signed thinking blocks, and the linked report describes Anthropic invalid-signature failures, but this review did not run a live Anthropic replay and current recovery may already retry some failures.

Is this the best way to solve the issue?

No, this is not the best fix as written. The repair should stay Anthropic-scoped instead of changing transformMessages for every provider that depends on signed reasoning replay.

Full review comments:

  • [P1] Keep thinking replay scoped to Anthropic — src/llm/providers/transform-messages.ts:115
    This helper feeds more than Anthropic. Returning [] for every non-redacted thinking block removes OpenAI Responses reasoning replay items before openai-responses-shared.ts can parse their thinkingSignature, and it also breaks current Google tests that require same-model thought signatures to survive. Please move this to an Anthropic-specific replay/conversion path or policy so other providers keep their replay contracts.
    Confidence: 0.92

Overall correctness: patch is incorrect
Overall confidence: 0.9

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 44cad6f8a48e.

Label changes:

  • add P1: The linked failure mode can brick long-running Anthropic sessions, and the current PR also risks breaking cross-provider replay behavior.
  • add merge-risk: 🚨 compatibility: The diff changes a shared provider replay contract and would remove behavior that current OpenAI and Google tests expect.
  • add merge-risk: 🚨 session-state: Dropping stored reasoning/thought blocks changes how existing session history is replayed across turns.
  • add rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🧂 unranked krab.
  • add status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body gives a failure narrative and a build claim, but no redacted after-fix Anthropic run output, logs, terminal screenshot, or captured request payload. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Label justifications:

  • P1: The linked failure mode can brick long-running Anthropic sessions, and the current PR also risks breaking cross-provider replay behavior.
  • merge-risk: 🚨 compatibility: The diff changes a shared provider replay contract and would remove behavior that current OpenAI and Google tests expect.
  • merge-risk: 🚨 session-state: Dropping stored reasoning/thought blocks changes how existing session history is replayed across turns.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🧂 unranked krab.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body gives a failure narrative and a build claim, but no redacted after-fix Anthropic run output, logs, terminal screenshot, or captured request payload. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

PR surface:

Source -10, Tests -11. Total -21 across 2 files.

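| Area | Files | Added | Removed | Net |
| --- | --- | --- | --- | --- |
| Source | 1 | 6 | 16 | -10 |
| Tests | 1 | 2 | 13 | -11 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 2 | 8 | 29 | -21 |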

Acceptance criteria:

  • [P1] node scripts/run-vitest.mjs src/llm/providers/anthropic.test.ts src/llm/providers/openai-responses-shared.test.ts src/llm/providers/openai-compatible-auth.test.ts src/llm/providers/openai-chatgpt-responses.test.ts src/llm/providers/google-shared.convert.test.ts.

What I checked:

  • PR diff changes shared replay behavior: The supplied PR hunk replaces the shared non-redacted thinking-block handling with an unconditional empty result, so the change applies before provider-specific conversion, not only to Anthropic. (src/llm/providers/transform-messages.ts:115, df763b31101d)
  • Current shared helper preserves signed same-model thinking: Current main keeps same-model thinking blocks with signatures in transformMessages, including the comment that this is needed for replay and OpenAI encrypted reasoning. (src/llm/providers/transform-messages.ts:110, 44cad6f8a48e)
  • Shared call surface: transformMessages is used by Anthropic, Google, OpenAI Responses, Mistral, and OpenAI Completions provider converters, so a blanket change has cross-provider blast radius. (src/llm/providers/anthropic.ts:1054, 44cad6f8a48e)
  • OpenAI Responses depends on thinkingSignature replay: OpenAI Responses conversion parses a thinking block's thinkingSignature into a replayable reasoning item; stripping the block earlier removes that item from the request. (src/llm/providers/openai-responses-shared.ts:265, 44cad6f8a48e)
  • Adjacent tests protect non-Anthropic thinking replay: Current tests expect OpenAI Responses reasoning items and Google same-model thought signatures to survive replay; the proposed shared return [] would remove those blocks before the assertions can pass. (src/llm/providers/google-shared.convert.test.ts:181, 44cad6f8a48e)
  • Existing Anthropic recovery path: Current agent runtime already wraps Anthropic replay with one retry that strips all thinking blocks after matching invalid thinking/signature errors, which is a safer provider-scoped place to repair this behavior. (src/agents/embedded-agent-runner/thinking.ts:532, 44cad6f8a48e)
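
The last item describes a single retry that strips thinking blocks after an invalid thinking/signature error. A minimal sketch of that pattern follows, assuming a simplified transcript shape and a caller-supplied `send` function. All names here (`sendWithThinkingRecovery`, `withoutThinking`, `INVALID_THINKING`) are hypothetical and do not reflect the code in `thinking.ts`; the regex only matches the error text quoted in the PR body.

```typescript
// Hypothetical, simplified types; names are assumptions for illustration only.
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; thinkingSignature?: string };

interface Message {
  role: "user" | "assistant";
  content: Block[];
}

// Assumed matcher, based only on the error text quoted in the PR body.
const INVALID_THINKING = /invalid signature in thinking block/i;

function withoutThinking(history: Message[]): Message[] {
  return history.map((m) => ({
    ...m,
    content: m.content.filter((b) => b.type !== "thinking"),
  }));
}

// Try the request as-is; on a matching error, retry exactly once with
// thinking blocks removed. Any other error, or a second failure, propagates.
async function sendWithThinkingRecovery<T>(
  send: (history: Message[]) => Promise<T>,
  history: Message[],
): Promise<T> {
  try {
    return await send(history);
  } catch (err) {
    const text = err instanceof Error ? err.message : String(err);
    if (!INVALID_THINKING.test(text)) throw err;
    return send(withoutThinking(history));
  }
}
```

Because the stripping happens only after the provider has rejected the request, signed thinking that is still valid is replayed unchanged on the first attempt.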

Likely related people:

  • Peter Steinberger: git blame attributes the current shared transform and Anthropic replay test lines to Peter, and recent transcript/provider replay history also includes centralization work in this area. (role: current-line owner and recent area contributor; confidence: medium; commits: e2c9c06de1c3, ef2541ceb335; files: src/llm/providers/transform-messages.ts, src/llm/providers/anthropic.test.ts, src/agents/transcript-policy.ts)
  • Qinyao He: Commit 7a35146 added the model-version policy that preserves thinking blocks for newer Claude models to protect prompt-cache behavior. (role: introduced current Claude 4.5+/4.6 preservation policy; confidence: high; commits: 7a3514664ddb; files: src/plugins/provider-replay-helpers.ts, src/agents/transcript-policy.ts)
  • Frank Yang: Commit 5ca0233 changed transcript policy to drop Anthropic thinking blocks on replay for older/problematic cases. (role: introduced earlier Anthropic drop-thinking replay behavior; confidence: high; commits: 5ca0233db05f; files: src/agents/transcript-policy.ts)
  • Ayaan Zaidi: Commit c65e152 added preservation-focused Anthropic replay regressions and behavior in the predecessor runner paths. (role: earlier Anthropic replay contributor; confidence: medium; commits: c65e152b390a; files: src/agents/pi-embedded-runner/thinking.ts, src/agents/pi-embedded-runner/thinking.test.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

clawsweeper (bot) added labels on Jun 1, 2026:

  • rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.)
  • status: 📣 needs proof (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.)
  • P1 (High-priority user-facing bug, regression, or broken workflow.)
  • merge-risk: 🚨 compatibility (May break existing users, config, migrations, defaults, or upgrade paths.)
  • merge-risk: 🚨 session-state (May lose, corrupt, stale, or mis-associate session, agent, or context state.)

steipete commented Jun 2, 2026

Contributor

Closing as superseded by current main, not landing this PR.

The linked issue #88932 is already closed because current main has the narrower fix: it classifies invalid or expired Anthropic thinking signatures as replay_invalid, strips invalid thinking on retry/sanitization, and still preserves active signed thinking where Anthropic/Bedrock tool-turn replay requires it. This PR's blanket stripping in src/llm/providers/transform-messages.ts would be broader than the current fix and can regress supported signed-thinking replay cases.
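
As a rough illustration of the classification step mentioned above, the sketch below maps a provider error message to a `replay_invalid` class. Only the `replay_invalid` label and the sample error text come from this thread; the regex, the `ReplayErrorClass` type, and `classifyReplayError` are assumptions and are not the classifier on main.

```typescript
// "replay_invalid" is the label named in the comment above; everything else
// here is an illustrative assumption.
type ReplayErrorClass = "replay_invalid" | "other";

// No "g" flag, so RegExp.test stays stateless across calls.
const INVALID_THINKING_SIGNATURE = /invalid signature in thinking block/i;

function classifyReplayError(message: string): ReplayErrorClass {
  return INVALID_THINKING_SIGNATURE.test(message) ? "replay_invalid" : "other";
}

console.log(
  classifyReplayError("messages.3.content.0: Invalid signature in thinking block"),
);
console.log(classifyReplayError("rate limit exceeded"));
```

A caller can then strip thinking blocks only for `replay_invalid` failures and leave other errors to the normal retry or surfacing path.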

Verification I ran on current main plus maintainer branch commit 014ee3750d:

node scripts/run-vitest.mjs src/agents/embedded-agent-helpers.isbillingerrormessage.test.ts src/agents/embedded-agent-runner.sanitize-session-history.test.ts -- --reporter=verbose -t "invalid signature|replay invalid|thinking signatures"

Result: passed 2 Vitest shards; 1 classifier test and 9 sanitizer/replay tests passed.

Thanks @bladin for jumping on the report quickly; the current-main fix is the safer shape here.

Development

Successfully merging this pull request may close these issues.

Anthropic thinking blocks with expired signatures cause crash loop on replay

2 participants