
Ungraceful connection termination (e.g. credit exhaustion) leaves thinking block with empty signature, poisoning the conversation #3936

@filmackay

Summary

When an Anthropic API stream is terminated by a silent TCP connection drop rather than a graceful SSE error event (e.g. when account credits run out mid-stream), Pi stores an assistant message with stopReason: "stop" (the default, which is never updated). The message contains a thinking block that has content but an empty thinkingSignature, because the signature_delta events that carry the signature never arrived. This message is then replayed on the next user prompt, and the Anthropic API rejects the request with a 400 error.

Steps to Reproduce

  1. Start a Pi session with a thinking-enabled Claude model (e.g. claude-sonnet-4-5).
  2. Send a prompt that produces a long thinking block.
  3. While the thinking block is streaming, have the API connection drop silently (e.g. account credits run out, causing Anthropic to drop the TCP connection without sending an SSE error event).
  4. Without restarting Pi, type any follow-up message (e.g. "continue").

Expected Behaviour

Pi detects the incomplete response and either:

  • Marks the message as stopReason: "error" so transformMessages skips it on replay, OR
  • Strips the unsignable thinking block before sending the follow-up request.

The user can continue the conversation normally.
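
The second option could look roughly like the sketch below. It is not Pi's code: stripUnsignedThinking and isUnsignedThinking are hypothetical helpers, and the message shape (role, content, type, thinkingSignature, redacted) is assumed from the snippets quoted in this issue. Whether the API accepts the replayed message with the block omitted has not been verified here.

```javascript
// Hypothetical helper, not part of Pi.
// True for a non-redacted thinking block whose signature is missing or blank.
function isUnsignedThinking(block) {
    if (block.type !== "thinking" || block.redacted) return false;
    const signature =
        typeof block.thinkingSignature === "string" ? block.thinkingSignature.trim() : "";
    return signature.length === 0;
}

// Hypothetical helper, not part of Pi.
// Returns a new message list in which assistant messages no longer carry
// unsigned thinking blocks. Assistant messages left with no content are
// dropped entirely. The input array and its messages are not mutated.
function stripUnsignedThinking(messages) {
    const result = [];
    for (const message of messages) {
        if (message.role !== "assistant" || !Array.isArray(message.content)) {
            result.push(message);
            continue;
        }
        const content = message.content.filter((block) => !isUnsignedThinking(block));
        if (content.length === message.content.length) {
            result.push(message); // nothing removed, keep the original object
        } else if (content.length > 0) {
            result.push({ ...message, content });
        }
        // Otherwise nothing is left to replay, so the message is omitted.
    }
    return result;
}
```

Calling stripUnsignedThinking on the history just before it is converted for the next request would leave signed and redacted thinking blocks untouched, which matters because the API requires those to be replayed unchanged.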

Actual Behaviour

The follow-up request fails immediately with a 400 error:

Error: 400 {"type":"error","error":{"type":"invalid_request_error",
"message":"messages.3.content.1: thinking or redacted_thinking blocks in the
latest assistant message cannot be modified. These blocks must remain as they
were in the original response."}}

The session is now stuck — every subsequent message triggers the same error.

Root Cause

Two bugs combine:

1. Silent TCP drop is not detected as an error

In anthropic.js, streamAnthropic initialises the output with stopReason: "stop". When the underlying SSE iterator ends cleanly (no exception thrown, no SSE error event), the stopReason is never updated. The code in agent-loop.js that skips errored messages only fires on "error" or "aborted":

// agent-loop.js
if (message.stopReason === "error" || message.stopReason === "aborted") {
    // ... skip message, stop loop
}

So a silently-dropped stream produces a message with stopReason: "stop" that looks like a successful (if empty) response.
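
One way to detect this case independently of thinking blocks is to check whether the stream reached its terminal event. In Anthropic's streaming protocol a complete response ends with a message_stop event, so an iterator that finishes without one was presumably cut short. The sketch below is an assumption about how such a check might look, not Pi's implementation: stopReasonForStream is a hypothetical function, and it takes a plain array of events for simplicity, where the real code would consume an async iterator.

```javascript
// Hypothetical helper, not part of Pi.
// Decides the final stopReason for a finished stream based on whether the
// terminal "message_stop" event was observed.
function stopReasonForStream(events) {
    const output = { stopReason: "stop", errorMessage: undefined };
    let sawMessageStop = false;

    for (const event of events) {
        if (event.type === "message_stop") {
            sawMessageStop = true;
        }
        // A real consumer would also accumulate content from the
        // content_block_* events here.
    }

    if (!sawMessageStop) {
        // The iterator ended cleanly but the protocol-level end marker never
        // arrived, so treat the response as truncated.
        output.stopReason = "error";
        output.errorMessage = "Stream ended without a message_stop event";
    }
    return output;
}

// A stream that is dropped while the thinking block is still being sent.
const truncated = [
    { type: "message_start" },
    { type: "content_block_start", index: 0, content_block: { type: "thinking", thinking: "" } },
    { type: "content_block_delta", index: 0, delta: { type: "thinking_delta", thinking: "Let me" } },
];
console.log(stopReasonForStream(truncated).stopReason); // "error"
```

This check would also catch drops that happen inside a text block, where there is no signature to go missing.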

2. Thinking blocks with content but no signature are silently converted to text

In anthropic.js convertMessages():

if (!block.thinkingSignature || block.thinkingSignature.trim().length === 0) {
    // Converts thinking → plain text block
    blocks.push({ type: "text", text: sanitizeSurrogates(block.thinking) });
}

Because stopReason was "stop", transformMessages does not skip this message. The thinking block (with content but no signature) is then converted to a text block. Anthropic's API sees that content[N] changed from a thinking block to a text block and rejects the request.

Potential Fix?

After the stream ends, validate thinking blocks before finalising the message. If any thinking block has content but no signature, the stream was cut prematurely — treat it as an error:

// In streamAnthropic, after the event loop ends:
for (const block of output.content) {
    if (
        block.type === "thinking" &&
        !block.redacted &&
        block.thinking.length > 0 &&
        (!block.thinkingSignature || block.thinkingSignature.trim().length === 0)
    ) {
        output.stopReason = "error";
        output.errorMessage = "Stream ended before thinking signature was received (likely a silent connection drop)";
        break;
    }
}

This makes the message eligible for the existing transformMessages skip logic, and the existing _handleRetryableError / state-cleanup paths take care of the rest.

Additional Notes

  • The only workarounds are to /tree back to the last clean user message or to manually edit the JSONL session file.
  • A related but separate bug: for non-retryable, non-overflow errors (like credit exhaustion), the errored assistant message is not removed from agent.state.messages after agent_end. Only retryable errors (_handleRetryableError) and overflow compaction remove it. This means the malformed message is included in the context snapshot passed to the next agent.prompt() call. While transformMessages should ultimately skip it, the state is misleading and could cause issues in other edge cases.
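
For the related state-cleanup bug, a fix might amount to removing a trailing errored assistant message once the run has ended. The following is only a sketch under assumptions: dropTrailingErroredAssistant is a hypothetical helper, the role field is assumed, and where Pi would call it after agent_end is not specified in this issue.

```javascript
// Hypothetical helper, not part of Pi.
// Returns the message list without its last entry when that entry is an
// assistant message that ended in an error. The input is not mutated.
function dropTrailingErroredAssistant(messages) {
    const last = messages[messages.length - 1];
    const errored =
        last !== undefined &&
        last.role === "assistant" &&
        last.stopReason === "error";
    return errored ? messages.slice(0, -1) : messages;
}

const history = [
    { role: "user", content: "continue" },
    { role: "assistant", stopReason: "error", content: [] },
];
console.log(dropTrailingErroredAssistant(history).length); // 1
```

Only the final message is considered here, so errored messages earlier in the history would still rely on the existing skip logic during replay.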

Environment

  • Pi version: 0.70.6
  • Provider: Anthropic (direct)
  • Model: claude-sonnet-4-5 (thinking enabled)

Metadata

  • Label: inprogress (issue is being worked on)
