[Bug]: Empty assistant delivery (footer-only, "— out" usage) on first turn after /new on Telegram-routed topic lane — 2026.5.27 #87711

@BryceMurray

Environment

  • OpenClaw: 2026.5.27
  • Backend: gateway with api: "openai-responses" transport (per session record api field)
  • Channel: Telegram (group + forum topic routing)
  • Agent: single agents.list[main], model claude-opus-4-7
  • Host: single-VPS Linux deployment, no container

TL;DR

After running /new on a Telegram group-topic lane, the first follow-up user message produces an empty assistant delivery — only the channel footer renders in the topic, with a — out em-dash where the output token count should be. The session jsonl contains exactly one record for the turn: an assistant message with empty text, usage zeros across the board, and stopReason: "stop". End-user latency from user message to footer-only delivery was ~12 minutes (single observation).

The /new itself completed normally (prior session archived to .jsonl.reset.<iso>, fresh session bound under the same lane key, ✅ New session started. posted to the channel). It is the next turn that silently truncates.

Symptom

Channel (Telegram topic) view, in order:

  1. User: /new@<bot>
  2. Agent: ✅ New session started. (immediate)
  3. User: <message referencing prior conversation context> (e.g. "Continue with TASK-009")
  4. ~12-minute gap with no visible activity
  5. Agent delivery: empty body + standard footer line:
    ──────────
    🧠 opus-4-7 · 2% new · 99% cache_hit · 21k in / — out
    

The — out em-dash (vs 0 out) is the channel renderer's signal that the output token count was never finalized — distinct from an intentional zero-token response.

Session-jsonl evidence

The post-reset session file for the affected lane contains exactly two records: the session header, then one assistant record. Full assistant payload (UUIDs redacted):

{
  "type": "message",
  "id": "<uuid>",
  "parentId": null,
  "timestamp": "<ts>",
  "message": {
    "role": "assistant",
    "content": [
      {
        "type": "text",
        "text": "──────────\n🧠 opus-4-7 · 2% new · 99% cache_hit · 21k in / — out"
      }
    ],
    "api": "openai-responses",
    "provider": "openclaw",
    "model": "delivery-mirror",
    "usage": {
      "input": 0, "output": 0, "total": 0,
      "prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0,
      "cache": { "read": 0, "write": 0, "cacheRead": 0, "cacheWrite": 0, "total": 0 }
    },
    "stopReason": "stop",
    "timestamp": <ms>
  }
}

Notable:

  • model: "delivery-mirror" — this is the channel-side stub the gateway writes when mirroring a turn to the topic. The model that actually ran was claude-opus-4-7 (per footer text). Whatever the controlling agent's own jsonl recorded for this turn never made it to channel delivery as visible text.
  • usage is all zeros and stopReason: "stop" — from the gateway's perspective the turn ended cleanly, with no usage attribution. The footer's 21k in / — out was generated separately by the channel renderer from cached prompt-assembly metadata, not from a finalized usage record.
  • The user's preceding inbound message does not appear in the post-reset jsonl. The only record for the entire turn is the empty assistant stub.
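The fingerprint described above can be checked mechanically. The following is a minimal sketch, assuming the session file is one JSON object per line with the field names shown in the payload above; the function names and the footer regex are illustrative and not part of OpenClaw.

```python
import json
import re
from typing import Iterable, Iterator

# Assumption: a footer-only body is a divider line followed by one footer line
# that ends in "in / — out" (output count never finalized).
FOOTER_ONLY = re.compile(r"\A\s*─+\n[^\n]*in / — out\s*\Z")


def all_zero(value) -> bool:
    """True if every leaf of a (possibly nested) usage dict equals 0.

    An empty or missing usage dict also counts as all-zero.
    """
    if isinstance(value, dict):
        return all(all_zero(v) for v in value.values())
    return value == 0


def footer_only_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield assistant records matching the footer-only fingerprint."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(record, dict):
            continue
        msg = record.get("message") or {}
        if msg.get("role") != "assistant" or msg.get("model") != "delivery-mirror":
            continue
        # Concatenate all text blocks of the mirrored message.
        text = "".join(
            block.get("text", "")
            for block in msg.get("content", [])
            if block.get("type") == "text"
        )
        if FOOTER_ONLY.search(text) and all_zero(msg.get("usage", {})):
            yield record
```

Passing an open session file to `footer_only_records` would list every mirrored turn whose body is nothing but the unfinalized footer, which is one way to count further occurrences across lanes.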

Repro shape (1 observation; not yet replayed)

  1. Drive a Telegram group-topic lane through enough activity to fill conversation context (multiple ticket-style follow-ups).
  2. Send /new from the topic. Confirm ✅ New session started. lands.
  3. Immediately (within 30–60s) send a follow-up that references prior conversation state ("Continue with X", where X has no plain-text anchor in the workspace file tree).
  4. Wait. Single observed case: ~12 minutes later, channel receives footer-only delivery.

I have not yet attempted to reproduce deterministically. Filing now so the fingerprint (footer-only, — out, stopReason: "stop", ~12-min latency, post-/new on Telegram topic lane) is on record while it's fresh — happy to gather more repros and append once seen again.

Hypothesis (low-confidence, leave for upstream)

Two non-exclusive possibilities:

  1. Stream-level abort after long tool-call / thinking loop. The fresh session has no in-context history of "TASK-009". The model may have entered a tool-use / thinking loop while attempting to reconstruct context from workspace files; the loop then hit a gateway-side or transport-side timeout, and the stream was truncated without finalizing usage. The — out em-dash is consistent with stream truncation rather than empty completion.
  2. Model produced no text content blocks. If the controlling-agent run finished with only tool_use / thinking content and no terminal text block, the channel delivery path would render an empty body. In that case the issue is partly cosmetic (the run did produce work, just not text the channel could surface).

The two are operator-indistinguishable from outside without the controlling agent's own jsonl.

What's wrong

1. Empty delivery has no fallback text

If a turn produces no text content (whether by abort or by model behavior), the channel should render something, such as (no text response; <N> tool calls, <M> thinking tokens) or (stream aborted), rather than a footer hanging off an invisible message body. The current behavior reads to the operator as "the agent is dead" when the underlying state may be "the agent worked silently".
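A minimal sketch of such a fallback, assuming the delivery path knows the turn's text blocks, its tool-call count, its thinking-token count, and whether the stream was aborted. The function name and parameters are hypothetical and do not reflect the gateway's actual interface.

```python
from typing import Sequence


def render_body(
    text_blocks: Sequence[str],
    tool_calls: int,
    thinking_tokens: int,
    stream_aborted: bool,
) -> str:
    """Return the message body, substituting a one-line fallback when empty."""
    body = "".join(text_blocks).strip()
    if body:
        return body
    if stream_aborted:
        # The stream ended before any text could be finalized.
        return "(stream aborted)"
    # The turn completed but produced only tool_use / thinking content.
    return f"(no text response; {tool_calls} tool calls, {thinking_tokens} thinking tokens)"
```

With this shape, an aborted stream and a text-less but completed run produce different one-liners, so the operator can tell the two cases apart from the channel alone.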

2. No in-flight progress indicator across a 12-min gap

From user-message-sent at T to footer-rendered at T+12min, the topic shows no activity. If the agent is executing tool calls or running embedded subagents, surfacing even a coarse [agent is working: <N> tool calls so far] heartbeat would prevent the operator from concluding the lane is wedged and /new-ing again (which kicks off the same cycle).
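One way such a heartbeat could be structured, sketched with `asyncio` under stated assumptions: `post` stands in for whatever sends a message to the topic, and `progress` for a counter of tool calls so far. Both are hypothetical callables, not OpenClaw APIs.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def run_with_heartbeat(
    turn: Awaitable[T],
    post: Callable[[str], Awaitable[None]],
    progress: Callable[[], int],
    interval_s: float = 60.0,
) -> T:
    """Await `turn`, posting a coarse status line every `interval_s` seconds."""
    task = asyncio.ensure_future(turn)
    try:
        while True:
            # Wait for the turn to finish, but wake up after interval_s regardless.
            done, _pending = await asyncio.wait({task}, timeout=interval_s)
            if done:
                return task.result()  # re-raises if the turn failed
            await post(f"[agent is working: {progress()} tool calls so far]")
    finally:
        # If the wrapper itself is cancelled or post() fails, stop the turn too.
        if not task.done():
            task.cancel()
```

The interval is a parameter because the right value is a product decision; the sketch only shows that the heartbeat can wrap the turn without changing its result.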

3. — out vs 0 out distinction is invisible upstream

The em-dash carries useful diagnostic information (count never finalized → truncation) but the gateway does not log this distinction explicitly. Recording a streamComplete: false or usageFinalized: false field on the session record would make the difference machine-detectable rather than dependent on cosmetic footer rendering.
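As an illustration of how such a flag would make the outcome countable, here is a minimal sketch. It assumes the proposed `usageFinalized` field lives on the message object; both helper names are hypothetical.

```python
from typing import Iterable, Optional


def with_usage_flag(message: dict, final_usage: Optional[dict]) -> dict:
    """Return a copy of `message` carrying the proposed usageFinalized flag.

    `final_usage` is None when the stream ended before usage was reported.
    """
    out = dict(message)
    if final_usage is None:
        out["usageFinalized"] = False
    else:
        out["usage"] = dict(final_usage)
        out["usageFinalized"] = True
    return out


def count_unfinalized(messages: Iterable[dict]) -> int:
    """Count messages explicitly marked as never finalized."""
    return sum(1 for m in messages if m.get("usageFinalized") is False)
```

Records written before the field existed carry no flag and are deliberately not counted, so older session files do not inflate the tally.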

4. Fresh-session bootstrap loads a 21k-token prompt with no conversation history

Not a bug per se, but worth noting: a fresh post-reset session immediately loads a 21k-token system prompt plus skills loadout (99% cached, as expected on a warm runtime). When the operator's next message references prior conversation context, the model receives 21k tokens of system context plus one short user turn pointing at state that was just archived. This is a difficult bootstrap that may amplify the abort/silent-loop pathology above. A documented "warm-restart" pattern, or a /reset --warm "<short re-orientation>" form that injects a re-orientation system note, would smooth the operator UX.

Asks

  1. Render a non-empty fallback when a turn produces no text content blocks — distinguish "stream aborted mid-flight" from "model emitted no text" in the channel output, even if just with a one-liner.
  2. Emit an in-flight heartbeat to the channel when a turn spends >Ns without progress. The current gap-of-silence is operator-hostile when something is happening underneath.
  3. Record usageFinalized: bool (or equivalent) on the session jsonl message record so a tail of "— out" outcomes can be counted machine-side.
  4. Investigate the root cause of the silent abort: possibly a streaming hang on claude-opus-4-7 via the openai-responses transport when invoked immediately after the session_start lifecycle hook completes.
  5. Document the fresh-session re-orientation pattern in the /reset / /new help text. Operators reaching for these commands typically want to unstick a lane, not blow away conversation context — clarifying the semantics in-band would reduce the "I reset and now everything is broken" path.

Labels

  • P1 - High-priority user-facing bug, regression, or broken workflow.
  • clawsweeper:needs-info - ClawSweeper needs more reporter information before it can verify this issue.
  • clawsweeper:needs-maintainer-review - ClawSweeper marked this issue as needing maintainer review before automation.
  • clawsweeper:needs-product-decision - ClawSweeper marked this issue as needing a product or behavior decision.
  • clawsweeper:no-new-fix-pr - ClawSweeper does not recommend queueing a new automated fix PR for this issue.
  • impact:crash-loop - Crash, hang, restart loop, or process-level availability failure.
  • impact:message-loss - Channel message delivery can be lost, duplicated, or misrouted.
  • impact:session-state - Session, memory, transcript, context, or agent state can drift or corrupt.
  • issue-rating: 🦪 silver shellfish - Thin issue quality; more reproduction proof or environment detail is needed.
