[Bug]: Restart recovery can finish with payloads=0 while Control UI only shows tool blocks and no visible error #77883

@qiaosang00

Summary

Gateway restart recovery can resume an interrupted main session, execute tool work, then finish with stopReason=stop payloads=0. The backend/runner detects the incomplete turn, but the Control UI can still leave the user with only collapsed Tool call / Tool output rows and no visible assistant text or clear error in the chat body.

This makes the session look like it is still mid-turn or silently missing a final answer, even though the task has already terminally completed.

Environment

  • OpenClaw: 2026.5.3 (2484f37)
  • OS: Windows 10.0.26200
  • Node: 22.22.0
  • Gateway: local loopback, ws://127.0.0.1:18789
  • Surface: Control UI / webchat
  • Provider: custom OpenAI-compatible Chat Completions backend through cockpit-codex/gpt-5.5
  • Relevant configured channels: Feishu + openclaw-weixin

Repro shape

  1. Run a main Control UI session with an OpenAI-compatible provider.
  2. Start a turn that performs several tool calls.
  3. Apply a config change requiring Gateway restart while work is active. In this case it was a channels.feishu config update.
  4. Gateway logs restart deferral while active work exists, then eventually forces restart.
  5. On startup, main-session restart recovery resumes the interrupted session.
  6. The recovery run reaches terminal state but produces no user-visible assistant payload.
  7. Control UI shows a stack of collapsed Tool call / Tool output blocks, but no visible final answer/error text below them.

Observed logs

config change requires gateway restart (channels.feishu) — deferring until 2 operation(s), 1 embedded run(s) complete
restart timeout after 300290ms with 2 operation(s), 1 embedded run(s) still active; forcing restart
received SIGUSR1; restarting
draining 2 active task(s) and 1 active embedded run(s) before restart with timeout 300000ms

After manual service recovery:

resumed interrupted main session: agent:main:main
main-session restart recovery complete: recovered=1 failed=0 skipped=0
incomplete turn detected: runId=62f1897c-011b-4760-8823-4924f54918ea sessionId=009119ee-4c88-468a-8a71-0c6e57e2907f stopReason=stop payloads=0 — surfacing error to user

The durable task for the recovery run was marked succeeded, not running or queued, so the turn was no longer in progress. A separate smoke run on the same provider immediately afterward returned visible text, so the Gateway and provider were not globally down.
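For triage, the combination described above (terminal task status plus payloads=0) can be checked mechanically. The following is a minimal sketch, not OpenClaw code: `parseIncompleteTurn`, `isSilentTerminalTurn`, and the `TaskStatus` values are hypothetical names, and the field layout is assumed from the single log line quoted in this report.

```typescript
// Hypothetical shape; field names mirror the log line quoted in this report.
interface IncompleteTurn {
  runId: string;
  sessionId: string;
  stopReason: string;
  payloads: number;
}

// Assumed task states, based on the "succeeded / running / queued" wording above.
type TaskStatus = "queued" | "running" | "succeeded" | "failed";

// Extract the key=value fields from an "incomplete turn detected" log line.
// Returns null when the line is a different message or a field is missing.
function parseIncompleteTurn(line: string): IncompleteTurn | null {
  if (line.indexOf("incomplete turn detected") === -1) return null;
  const fields: Record<string, string> = {};
  const pattern = /(\w+)=(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(line)) !== null) {
    fields[match[1]] = match[2];
  }
  const rawPayloads = fields["payloads"];
  if (!fields["runId"] || !fields["sessionId"] || !fields["stopReason"]) return null;
  if (rawPayloads === undefined || !/^\d+$/.test(rawPayloads)) return null;
  return {
    runId: fields["runId"],
    sessionId: fields["sessionId"],
    stopReason: fields["stopReason"],
    payloads: parseInt(rawPayloads, 10),
  };
}

// True when the task is already over but the turn produced nothing visible,
// which is the state this report says the Control UI fails to surface.
function isSilentTerminalTurn(turn: IncompleteTurn, status: TaskStatus): boolean {
  const terminal = status === "succeeded" || status === "failed";
  return terminal && turn.payloads === 0;
}
```

Feeding the quoted log line and a `succeeded` task status into these two functions flags the run from this report, while a `running` task with the same line would not be flagged.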

Expected behavior

When restart recovery resumes an interrupted session and the resumed turn terminally produces no visible assistant text, the Control UI should make the terminal state obvious. For example:

  • show the existing incomplete-turn error payload visibly in the chat body, or
  • render a clear terminal error row tied to the run, or
  • mark the resumed turn as failed instead of looking like tool-only progress with no conclusion.

The user should not have to inspect logs/tasks to discover that the resumed turn is already over and failed to produce a visible answer.
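One way to express the expected behavior is as a rule over the rows the chat body renders. This is a minimal sketch under stated assumptions: `ChatRow`, `RunSnapshot`, and `withVisibleConclusion` are hypothetical names, and the real Control UI row model is not shown in this report.

```typescript
// Hypothetical row model for the chat body.
type ChatRow =
  | { kind: "tool_call"; name: string }
  | { kind: "tool_output"; name: string }
  | { kind: "assistant_text"; text: string }
  | { kind: "terminal_error"; runId: string; message: string };

// Hypothetical snapshot of a run as the UI sees it.
interface RunSnapshot {
  runId: string;
  terminal: boolean; // true once the run has reached a terminal state
  rows: ChatRow[];
}

// For a terminal run with no non-empty assistant text and no error row,
// append an explicit error row so the turn does not end on tool blocks alone.
// Runs that are still active, or that already have a conclusion, are unchanged.
function withVisibleConclusion(run: RunSnapshot): ChatRow[] {
  if (!run.terminal) return run.rows;
  const hasConclusion = run.rows.some((row) =>
    row.kind === "assistant_text"
      ? row.text.trim().length > 0
      : row.kind === "terminal_error"
  );
  if (hasConclusion) return run.rows;
  const errorRow: ChatRow = {
    kind: "terminal_error",
    runId: run.runId,
    message: "This turn ended without a visible reply.",
  };
  return run.rows.concat([errorRow]);
}
```

The rule is deliberately tied to the run's terminal flag rather than to the presence of tool rows, so an in-progress turn that has only produced tool output so far is not mislabeled as failed.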

Actual behavior

The Control UI showed only a vertical sequence of collapsed Tool call / Tool output rows. There was no visible final assistant reply and no obvious error text after the tools. From the user's perspective, the session looked like it had simply stopped with no explanation.

Why this seems separate from the provider-level payloads=0 bugs

There are existing issues for providers returning empty user-visible content, for example:

This report is narrower: even if the provider/backend returns an empty terminal turn, restart recovery + Control UI should still surface an understandable terminal state. The current behavior makes a recovered interrupted session look like a silent UI stall.

Related restart/session work:

Code paths likely involved

  • src/agents/main-session-restart-recovery.ts builds and dispatches the resume turn.
  • src/agents/pi-embedded-runner/run.ts detects the incomplete turn (logging incomplete turn detected ... payloads=0) and creates an error payload.
  • Control UI/webchat rendering or lifecycle finalization may be dropping, hiding, or failing to attach that error payload after the tool rows.
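To make the runner-side half of this hand-off concrete, here is a rough sketch of what "creates an error payload" could look like. It is an illustration only and is not taken from run.ts: `Payload`, `TurnResult`, and `ensureTerminalPayload` are hypothetical names, and the actual payload shape may differ.

```typescript
// Hypothetical payload shape; the real runner types are not shown in this report.
interface Payload {
  text: string;
  isError: boolean;
}

// Hypothetical result of a finished turn.
interface TurnResult {
  runId: string;
  stopReason: string;
  payloads: Payload[];
}

// If a finished turn carries no payload with visible text, return a copy with
// one error payload appended; otherwise return the result untouched.
function ensureTerminalPayload(result: TurnResult): TurnResult {
  const visible = result.payloads.filter((p) => p.text.trim().length > 0);
  if (visible.length > 0) return result;
  const errorPayload: Payload = {
    text:
      "Run " + result.runId + " ended (stopReason=" + result.stopReason +
      ") without a visible reply.",
    isError: true,
  };
  return {
    runId: result.runId,
    stopReason: result.stopReason,
    payloads: result.payloads.concat([errorPayload]),
  };
}
```

If the runner already behaves roughly like this, as the log line suggests, the open question is whether the resulting payload reaches the Control UI and is attached to the resumed run after the tool rows.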

Impact

Medium. The Gateway can be healthy and the provider can work on subsequent turns, but the recovered session gives no clear user-visible conclusion. This is especially confusing after automatic restart recovery because users reasonably expect OpenClaw to either continue the original task or explicitly say recovery failed.
