fix(agents): classify expired thinking signatures #88072
Codex review: needs maintainer review before merge. Reviewed May 29, 2026, 12:38 PM ET / 16:38 UTC.
Summary
PR surface: Source 0, Tests +14. Total +14 across 2 files.
Reproducibility: yes. Current main lacks the signature/thinking-block replay-invalid match, and the linked issue provides the exact Anthropic payload that falls through the current classifier.
Review metrics: none identified.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security
Review details
Best possible solution: Land the focused classifier and regression-test patch once CI is acceptable; request live Anthropic retry proof only if maintainers need end-to-end expiry assurance.
Do we have a high-confidence way to reproduce the issue? Yes. Current main lacks the signature/thinking-block replay-invalid match, and the linked issue provides the exact Anthropic payload that falls through the current classifier.
Is this the best way to solve the issue? Yes. Extending the existing replay-invalid classifier with a narrow signature plus thinking-block match, while preserving the generic `Invalid signature` negative case, is the smallest maintainable fix path.
AGENTS.md: found and applied where relevant.
Codex review notes: model gpt-5.5, reasoning high; reviewed against dc7bd4abf556.
Label changes:
Label justifications:
Evidence reviewed
PR surface: Source 0, Tests +14. Total +14 across 2 files.
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
How this review workflow works
@clawsweeper visualize
🦞👀 I queued a read-only visual pass. It will create or update one marker-backed visual brief comment and will not trigger close, merge, repair, label, or branch changes. Lens:
Source: #88072 (comment)
Visual brief
Lens: flow
Legend: ✅ expected/proven; ❌ broken path;
Maintainer ruling
Benefit: Routes the reported Anthropic expired-thinking-signature rejection into the existing replay recovery path.
@clawsweeper automerge
🦞🔧 Source: I will update this PR branch, or open a safe credited replacement, if the repair worker finds a narrow CI fix.
Automerge progress:
ClawSweeper 🐠 reef update
Thanks for the work here. ClawSweeper could not write to the source branch, so it opened a replacement PR rather than letting the fix drift. Attribution still points back here.
Why replacement: ClawSweeper could not update the source PR branch directly; GitHub did not grant sufficient push rights to the bot for that branch.
fish notes: model gpt-5.5, reasoning high; reviewed against 57c80d9.
Summary
Classify Anthropic expired thinking-signature rejections as `replay_invalid` so the existing recovery path can strip stale thinking blocks and retry.
- Add Anthropic `Invalid signature in thinking block` patterns to the replay-invalid classifier.
- Keep generic `Invalid signature` errors out of replay recovery.
Linked context
Closes #88020
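For readers unfamiliar with the recovery path the summary refers to, the following is a minimal sketch of "strip stale thinking blocks and retry". It is an illustration under stated assumptions, not the repository's implementation: `ContentBlock`, `Message`, `stripThinkingBlocks`, and `sendWithReplayRecovery` are hypothetical names, and the block shape only approximates an Anthropic-style transcript.

```typescript
// Hypothetical transcript types, loosely modelled on Anthropic-style content blocks.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; signature: string };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// Remove every thinking block; drop messages that end up with no content,
// since an empty message would likely be rejected on replay.
function stripThinkingBlocks(history: Message[]): Message[] {
  return history
    .map((m) => ({ ...m, content: m.content.filter((b) => b.type !== "thinking") }))
    .filter((m) => m.content.length > 0);
}

// Try the request once; if the failure is classified as replay-invalid,
// retry a single time with thinking blocks stripped. Other errors propagate.
async function sendWithReplayRecovery<T>(
  history: Message[],
  send: (h: Message[]) => Promise<T>,
  isReplayInvalid: (err: unknown) => boolean,
): Promise<T> {
  try {
    return await send(history);
  } catch (err) {
    if (!isReplayInvalid(err)) throw err;
    return send(stripThinkingBlocks(history));
  }
}
```

The single retry is a deliberate choice in this sketch: if the stripped transcript is rejected as well, the error surfaces to the caller instead of looping.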
Real behavior proof (required for external PRs)
- `Invalid signature in thinking block` errors are now classified as `replay_invalid` instead of `unclassified`, allowing the existing stale-thinking replay recovery path to run.
- `origin/main` base, Node with the repo `tsx` loader, direct classifier import from `src/agents/embedded-agent-helpers/errors.ts`.
- `replay_invalid` path, while a generic invalid-signature message remains unclassified.
Tests and validation
- `node scripts/run-vitest.mjs src/agents/embedded-agent-helpers.isbillingerrormessage.test.ts`
- `pnpm exec oxfmt --check --threads=1 src/agents/embedded-agent-helpers/errors.ts src/agents/embedded-agent-helpers.isbillingerrormessage.test.ts`
- `node scripts/run-oxlint.mjs --tsconfig config/tsconfig/oxlint.core.json src/agents/embedded-agent-helpers/errors.ts src/agents/embedded-agent-helpers.isbillingerrormessage.test.ts`
- `pnpm changed:lanes --json`
- `pnpm check:changed`
- `git diff --check`
Regression coverage added:
- `invalid_request_error` with `Invalid signature in thinking block` classifies as `replay_invalid`.
- `ValidationException: invalid signature on thinking block` classifies as `replay_invalid`.
- `Invalid signature` does not classify as `replay_invalid`.
Risk checklist
Did user-visible behavior change?
Yes
Did config, environment, or migration behavior change?
No
Did security, auth, secrets, network, or tool execution behavior change?
No
Highest-risk area: Provider runtime failure classification.
Mitigation: The new match requires both signature and thinking-block language, and the test proves generic invalid-signature errors do not enter replay recovery.
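The two-condition match described in the mitigation can be sketched as follows. This is only an illustration of the idea, assuming simple regular expressions; `FailureKind`, `SIGNATURE_RE`, `THINKING_BLOCK_RE`, and `classifyProviderError` are hypothetical names and do not reflect the actual code in `src/agents/embedded-agent-helpers/errors.ts`.

```typescript
// Hypothetical names throughout; not the classifier shipped in errors.ts.
type FailureKind = "replay_invalid" | "unclassified";

// Both patterns are case-insensitive and carry no global flag,
// so RegExp.prototype.test stays stateless between calls.
const SIGNATURE_RE = /\binvalid signature\b/i;
const THINKING_BLOCK_RE = /\bthinking block\b/i;

// Classify as replay_invalid only when the message mentions BOTH an invalid
// signature and a thinking block; a bare "Invalid signature" stays unclassified.
function classifyProviderError(message: string): FailureKind {
  const mentionsSignature = SIGNATURE_RE.test(message);
  const mentionsThinkingBlock = THINKING_BLOCK_RE.test(message);
  return mentionsSignature && mentionsThinkingBlock ? "replay_invalid" : "unclassified";
}

// Mirrors the three regression cases listed in this PR description.
const samples = [
  "invalid_request_error: Invalid signature in thinking block",
  "ValidationException: invalid signature on thinking block",
  "Invalid signature",
];
for (const sample of samples) {
  console.log(`${classifyProviderError(sample)} <- ${sample}`);
}
```

Requiring both phrases keeps the match narrow: unrelated signature failures (for example, webhook or token verification errors) would not be routed into replay recovery by this sketch.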
Current review state
Next action: Maintainer review.
Waiting on: CI and any maintainer request for live Anthropic expiry proof.
Bot or reviewer comments addressed: None yet.