[Bug]: Session stuck in "running" status persists in v2026.4.9 — phaseBeforeAbort fix no longer sufficient #63819

@irvinheard

Description

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

In v2026.4.9, sessions get stuck in status: running after any request timeout, even after the phaseBeforeAbort fix from #14228 is confirmed applied (phaseBeforeAbort:0, clearState:5 in runs-D-shWMaO.js).

Steps to reproduce

  1. Run local llama.cpp server on port 8110 via Vulkan/MoltenVK
  2. Connect Telegram through OpenClaw
  3. Send a message triggering a long task
  4. Task times out mid-execution
  5. Session stays status: running — all subsequent messages fail

Expected behavior

The session returns to idle after an abort, as it did in 2026.4.8 with the phaseBeforeAbort fix applied.
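To make the expected behavior concrete, here is a minimal sketch of the invariant being asked for: whatever happens to the request (success, timeout, abort), the session status ends up back at idle. This is not OpenClaw's code. `Session`, `run_request`, and `slow_task` are hypothetical names chosen for illustration, and the real run loop is certainly more involved.

```python
import asyncio


class Session:
    """Hypothetical stand-in for a session record with a status field."""

    def __init__(self):
        self.status = "idle"


async def run_request(session, work, timeout_s):
    """Run `work` under a timeout and always restore the idle status."""
    session.status = "running"
    try:
        return await asyncio.wait_for(work(), timeout=timeout_s)
    finally:
        # Runs on success, timeout, and cancellation alike, so no exit
        # path can leave the session marked as running.
        session.status = "idle"


async def slow_task():
    # Simulates a long task that outlives the request timeout.
    await asyncio.sleep(1.0)


async def demo():
    session = Session()
    try:
        await run_request(session, slow_task, timeout_s=0.01)
    except asyncio.TimeoutError:
        pass  # the timeout itself is expected here
    return session.status


print(asyncio.run(demo()))  # idle
```

The point of the sketch is only that the reset lives in a `finally` block rather than in one specific abort handler, which is the kind of guarantee a new code path in 2026.4.9 could be bypassing.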

Actual behavior

The session stays stuck in status: running. The watchdog clears it after roughly 2-3 minutes, but it gets stuck again on the next request.

OpenClaw version

2026.4.9 (0512059)

Operating system

macOS Darwin 25.4.0 x64

Install method

npm global

Model

qwen3-5-27b-8110/qwen3.5-27b (local llama.cpp Vulkan)

Provider / routing chain

openclaw → llama-server 127.0.0.1:8110/v1

Additional provider/model setup details

The phaseBeforeAbort fix is confirmed present in both runs-D-shWMaO.js and pi-embedded-Vw-lS5ti.js. The fix worked in 2026.4.8 but no longer works in 2026.4.9, which suggests the root cause has moved to a new code path. Related: #14228, #9405, #57617

Logs, screenshots, and evidence

[2026-04-09 10:39:24] Cleared stuck session: agent:main:telegram:direct:8317843287 (stuck 175s)
[2026-04-09 10:49:25] Cleared stuck session: agent:main:telegram:direct:8317843287 (stuck 158s)
[2026-04-09 10:53:26] Cleared stuck session: agent:main:telegram:direct:8317843287 (stuck 159s)
2026-04-09T14:32:36 embedded_run_agent_end isError:true error:"LLM request failed: network connection error" failoverReason:timeout
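For anyone triaging, the watchdog lines above can be summarized with a short script. This is a sketch that assumes the line format is exactly what these samples show; the regular expression and the `parse_cleared` helper are my own and not part of OpenClaw or the watchdog.

```python
import re
from datetime import datetime

# Pattern inferred from the three "Cleared stuck session" samples only.
CLEARED = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"Cleared stuck session: (?P<session>\S+) \(stuck (?P<secs>\d+)s\)$"
)


def parse_cleared(lines):
    """Return (timestamp, session key, stuck seconds) for each watchdog line."""
    events = []
    for line in lines:
        m = CLEARED.match(line.strip())
        if m:  # non-watchdog lines are skipped
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S")
            events.append((ts, m["session"], int(m["secs"])))
    return events


LOG = """\
[2026-04-09 10:39:24] Cleared stuck session: agent:main:telegram:direct:8317843287 (stuck 175s)
[2026-04-09 10:49:25] Cleared stuck session: agent:main:telegram:direct:8317843287 (stuck 158s)
[2026-04-09 10:53:26] Cleared stuck session: agent:main:telegram:direct:8317843287 (stuck 159s)
2026-04-09T14:32:36 embedded_run_agent_end isError:true error:"LLM request failed: network connection error" failoverReason:timeout
""".splitlines()

events = parse_cleared(LOG)
stuck = [secs for _, _, secs in events]
# Seconds between consecutive clears of the session.
gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]

print(stuck)  # [175, 158, 159]
print(gaps)   # [601.0, 241.0]
```

On these samples the session was stuck for 158 to 175 seconds each time, and the clears were about 10 and 4 minutes apart.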

Impact and severity

High — blocks all messages after any timeout. Frequency: every ~10 min. Workaround: launchd session watchdog clears stuck sessions every 60s.
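The selection logic of such a watchdog can be sketched as follows. Everything here is an assumption for illustration: the in-memory `sessions` mapping, the `status` and `updated_at` fields, and the 120-second threshold are hypothetical and do not reflect OpenClaw's actual session store or the real launchd job.

```python
from datetime import datetime, timedelta


def find_stuck(sessions, now, max_running=timedelta(seconds=120)):
    """Return keys of sessions that have been 'running' longer than max_running."""
    return [
        key
        for key, s in sessions.items()
        if s["status"] == "running" and now - s["updated_at"] > max_running
    ]


def clear_stuck(sessions, now, max_running=timedelta(seconds=120)):
    """Reset every stuck session to idle and return the keys that were cleared."""
    cleared = find_stuck(sessions, now, max_running)
    for key in cleared:
        sessions[key]["status"] = "idle"
        sessions[key]["updated_at"] = now
    return cleared


now = datetime(2026, 4, 9, 10, 39, 24)
sessions = {
    # Hypothetical records: one running for 175s, one running for only 30s.
    "agent:main:telegram:direct:8317843287": {
        "status": "running",
        "updated_at": now - timedelta(seconds=175),
    },
    "agent:main:example:recent": {
        "status": "running",
        "updated_at": now - timedelta(seconds=30),
    },
}

cleared = clear_stuck(sessions, now)
print(cleared)  # ['agent:main:telegram:direct:8317843287']
```

A watchdog like this only hides the symptom: the session is usable again until the next timeout, which matches the repeated clears in the logs.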

Additional information

No response

Metadata

Labels

  P1 - High-priority user-facing bug, regression, or broken workflow.
  bug - Something isn't working
  clawsweeper:fix-shape-clear - ClawSweeper found a clear likely implementation shape for this issue.
  clawsweeper:queueable-fix - ClawSweeper marked this issue as an existing queue_fix_pr work candidate.
  clawsweeper:source-repro - ClawSweeper found a high-confidence source-level issue reproduction.
  impact:message-loss - Channel message delivery can be lost, duplicated, or misrouted.
  impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
  issue-rating: 🦞 diamond lobster - Very strong issue quality with high-confidence source-level or clear reproduction.
