[Bug]: Agent stalls indefinitely when model emits stopReason="stop" with no toolCall — only thinking block generated #89787

@ArthurusDent

Description

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

When the Qwen3.6-35B-A3B model ends generation with stopReason="stop" inside a thinking block, without emitting any toolCall, the embedded agent stalls silently: no tools execute, no JSONL entries are appended, and no gateway logs fire. The stall lasts until something external intervenes, or indefinitely.

Steps to reproduce

  1. Run an embedded agent via Matrix direct chat with context window ~72k and a tool-rich system prompt (file I/O, subprocess management).
  2. Let the agent accumulate tool output over multiple turns (~39 toolUse cycles) until context pressure increases toward the overflow threshold (~45k+ tokens).
  3. Trigger a turn where the model enters extended thinking (>250 chars of internal reasoning) while planning to execute tools.
  4. Observe: the model ends the turn with stopReason="stop" and no subsequent toolCall token stream. The response contains only a thinking block with <function=...> XML embedded as plain text. After this, no JSONL entries are appended, no gateway logs fire, and the session freezes indefinitely.

Observed in session aa55eb34-e588-439d-a780-359d5e0de27c (1 June 2026): 3 independent occurrences among 55 assistant entries over ~1h 24m.

Expected behavior

If an assistant response contains only a thinking block with no toolCall (regardless of stopReason), the gateway should detect this as an incomplete turn and either:

  • Auto-retry with a corrective prompt, or
  • Surface a clear error state with logging (livenessState=abandoned)
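
The two options above could be combined into a bounded-retry policy. The following is only a minimal sketch of that idea, not the gateway's actual code: the block type names ("thinking", "text", "toolCall"), the message shape, and the helpers `is_thinking_only` and `next_action` are all assumptions made for illustration.

```python
# Hypothetical block type names; the real gateway schema may differ.
THINKING, TEXT, TOOL_CALL = "thinking", "text", "toolCall"


def is_thinking_only(message: dict) -> bool:
    """True when a turn has reasoning but neither a tool call nor visible text."""
    blocks = message.get("content", [])
    has_thinking = any(b.get("type") == THINKING for b in blocks)
    has_tool_call = any(b.get("type") == TOOL_CALL for b in blocks)
    has_text = any(
        b.get("type") == TEXT and b.get("text", "").strip() for b in blocks
    )
    return has_thinking and not has_tool_call and not has_text


def next_action(message: dict, retries_so_far: int, max_retries: int = 2) -> str:
    """Return "continue", "retry", or "abandon" for one assistant turn.

    stopReason is deliberately not consulted, matching the
    "regardless of stopReason" wording of the expected behavior.
    """
    if not is_thinking_only(message):
        return "continue"
    if retries_so_far < max_retries:
        return "retry"    # re-prompt, e.g. with a corrective instruction
    return "abandon"      # surface an error, e.g. livenessState=abandoned
```

Capping the retries keeps a model that repeats the same thinking-only turn from looping forever; once the cap is reached the session is marked as failed instead of staying silently idle.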

Actual behavior

The session remains silently idle. In the observed cases it was resolved by user intervention (~64s, short thinking block), by auto-compaction after context overflow (~191s, longer thinking block), or never (permanent stall, no recovery path). No gateway log entries appear between the thinking-only JSONL entry and the resolution, or ever in a permanent stall. The session appears healthy in dashboards despite being completely frozen.

OpenClaw version

2026.5.28

Operating system

Ubuntu Server 24.04 LTS (kernel 6.8.0-124-generic x64)

Install method

git clone — updated via cd ~/openclaw && git pull && git fetch origin --tags && git checkout <tag> && pnpm build && openclaw gateway start/stop

Model

llama/Qwen3.6-35B-A3B-UD-MTP-Q4_K_M (Qwen3.6-35B-A3B)

Provider / routing chain

OpenClaw → local llama.cpp

Additional provider/model setup details

Context window: 72k, reserveTokensFloor: 20000. Channel: Matrix direct chat. The model runs locally via llama.cpp with Qwen3.6-35B-A3B in Q4_K_M quantization. It has a reproducible failure mode in which an extended thinking block (>250 chars) can trigger a premature EOS (stopReason="stop") without the toolUse token stream being emitted. This happens even when the reasoning text contains explicit plans to execute tools, with <function=...> XML syntax embedded as plain text rather than as structured tool calls.

This is distinct from truncated-response hangs: the model hits its own clean EOS, so truncation detectors do not trigger. The response appears structurally valid to the gateway (clean stopReason + text content → treated as completed turn).
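
Because the unexecuted tool plan shows up as <function=...> text inside the thinking block, a simple pattern match can serve as an extra signal that the model intended a tool call. The sketch below is only a heuristic; the helper name `embedded_tool_names` and the assumed tag grammar (a name directly after `<function=`, closed by `>`) are assumptions based on the tags listed in this report.

```python
import re

# Matches an opening tag such as <function=read> and captures the tool name.
# The allowed name characters are an assumption, not a documented grammar.
FUNCTION_TAG = re.compile(r"<function=([A-Za-z_][\w.-]*)>")


def embedded_tool_names(thinking_text: str) -> list[str]:
    """Return tool names that appear as plain-text <function=...> tags."""
    return FUNCTION_TAG.findall(thinking_text)


if __name__ == "__main__":
    sample = "I should read the file first. <function=read> <parameter=path>"
    print(embedded_tool_names(sample))
```

A non-empty result on a turn that carries no structured toolCall is a strong hint that the turn is incomplete rather than a genuine final answer.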

Logs, screenshots, and evidence

Stall entries in session aa55eb34 (55 assistant entries / 124 JSONL lines):

| Entry | JSONL line | Timestamp (UTC) | stopReason | Has toolCall? | Thinking length (embedded tag) | Resolution |
|-------|------------|-----------------|------------|---------------|-----------------|------------|
| [13]  | 14         | 2026-06-01T14:11:29.359Z | "stop" | ❌ No | 275 chars (<function=read>) | Resolved after ~64s by user message "setze fort" ("continue") |
| [53]  | 54         | 2026-06-01T15:20:10.806Z | "stop" | ❌ No | 586 chars (<function=exec>) | Resolved by auto-compaction (~191s after overflow precheck) |
| [123] | 124        | 2026-06-01T15:32:51.707Z | "stop" | ❌ No | 264 chars (<function=process>) | Permanent stall — no further JSONL entries, never resolved |

(Note: Entries [62] and [102] are gateway re-serialization artifacts from compaction of Entries [13] and [53], not independent inference runs.)
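
A table like the one above could be derived mechanically from a session trajectory. The sketch below assumes a record shape that this report does not confirm: one JSON object per line with an ISO-8601 `timestamp` and a `message` holding `role`, `stopReason`, and a `content` list of typed blocks. The function `stall_gaps` is hypothetical, and it would also count re-serialized copies such as entries [62] and [102] unless those are filtered out first.

```python
import json
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    # datetime.fromisoformat() accepts a trailing "Z" only from Python 3.11 on,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def stall_gaps(jsonl_lines):
    """Yield (record_index, seconds_to_next_record) for thinking-only stops.

    record_index is 1-based over non-blank lines; the gap is None when the
    thinking-only entry is the last record (a candidate permanent stall).
    """
    records = [json.loads(line) for line in jsonl_lines if line.strip()]
    for i, rec in enumerate(records):
        msg = rec.get("message", {})
        if msg.get("role") != "assistant" or msg.get("stopReason") != "stop":
            continue
        block_types = {b.get("type") for b in msg.get("content", [])}
        if block_types != {"thinking"}:
            continue
        if i + 1 < len(records):
            delta = parse_ts(records[i + 1]["timestamp"]) - parse_ts(rec["timestamp"])
            yield i + 1, delta.total_seconds()
        else:
            yield i + 1, None
```

Called with the lines of a session file, for example `list(stall_gaps(open(path, encoding="utf-8")))`, it returns one tuple per suspected stall, which makes long gaps and trailing entries easy to spot.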

Gateway log — context-overflow-precheck triggered at 17:20:40 CEST:
  compactionAttempts=0 (first ever in this session)
  estimatedPromptTokens=65345, promptBudgetBeforeReserve=51680, overflowTokens=13665

Gateway log — auto-compaction completed at 17:23:51 CEST (191s gap):
  auto-compaction succeeded; retrying prompt (truncated 9 tool result(s))
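
The figures in these two log excerpts are internally consistent, which the short check below reproduces. It assumes that overflowTokens is simply the estimated prompt size minus the prompt budget; the variable names mirror the log fields, and the timestamps are the CEST wall-clock times quoted above.

```python
from datetime import datetime

# Values copied from the context-overflow-precheck log line.
estimated_prompt_tokens = 65345
prompt_budget_before_reserve = 51680

# Assumed relationship: overflow = estimate - budget.
overflow_tokens = estimated_prompt_tokens - prompt_budget_before_reserve

# Wall-clock times (CEST) of the precheck and of the completed compaction.
precheck_at = datetime.fromisoformat("2026-06-01T17:20:40")
compaction_done_at = datetime.fromisoformat("2026-06-01T17:23:51")
gap_seconds = (compaction_done_at - precheck_at).total_seconds()

print(overflow_tokens)  # 13665, matching overflowTokens in the log
print(gap_seconds)      # 191.0, matching the reported 191s gap
```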

llama.cpp slot release at 17:20:10 CEST:
  slot release: id 0 | task 90849 | stop processing: n_tokens=45923
  ~1.254s inference duration — consistent with normal completion of a long thinking block, not an inference stall.

Entry [123] at 17:32:51 CEST: no gateway log activity after this point. Context had decreased after compaction, so no overflow was detected and no recovery was triggered.

Evidence available: session JSONL trajectory (aa55eb34-e588-439d-a780-359d5e0de27c.jsonl, 124 NDJSON lines), gateway journalctl logs with context-overflow-precheck entries, llama.cpp slot release logs.

Impact and severity

Affected: Embedded agents using models prone to the stopReason="stop" without toolCall failure mode (observed with Qwen3.6-35B-A3B)
Severity: Critical — sessions freeze permanently with no automatic recovery, zero error logging, no user-visible indication of failure
Frequency: Reproducible under context pressure (>~40k tokens accumulated in session); 3 independent occurrences in 55 assistant turns (~5.5%) in session aa55eb34
Consequence: Agent stops responding; manual re-triggering required. For automated/long-running workflows, this means silent data loss or state drift.

Additional information

This is a second failure pattern, distinct from the truncation bug reported in #89051 (and addressed by PR #89160). That bug covers truncated API responses where stopReason is missing or "length"; in that case resolveIncompleteTurnPayloadText silently returned success. The present bug covers a clean EOS (stopReason="stop") with no toolCall emitted. Both produce identical observable behavior but require different gateway-layer fixes.

Root cause hypothesis:

  • Model layer: Extended thinking blocks trigger premature EOS without emitting the toolUse token stream
  • Gateway layer: No validation for stopReason="stop" + no toolCall case; the response appears structurally valid (has clean stopReason + text content → treated as completed turn)

Related: #89051 (original silent stall report), #87692 (silent abort without error log), PR #89160 (truncation path fix)

Labels

  • P1 - High-priority user-facing bug, regression, or broken workflow.
  • bug - Something isn't working
  • bug:behavior - Incorrect behavior without a crash
  • clawsweeper:needs-live-repro - ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.
  • clawsweeper:needs-maintainer-review - ClawSweeper marked this issue as needing maintainer review before automation.
  • clawsweeper:no-new-fix-pr - ClawSweeper does not recommend queueing a new automated fix PR for this issue.
  • impact:message-loss - Channel message delivery can be lost, duplicated, or misrouted.
  • impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
  • issue-rating: 🐚 platinum hermit - Good issue quality with a plausible reproduction path needing some confirmation.
