[Bug]: Slack channel final replies silently dropped on 2026.5.3 when agent emits [thinking,text]-only turn under visibleReplies="message_tool"

## Summary

On OpenClaw `2026.5.3` with `messages.groupChat.visibleReplies = "message_tool"` (the schema-migration default applied by `doctor --fix`) and a Slack-backed agent (`claude-cli/claude-opus-4-7`), an assistant turn that produces `thinking → text → end_turn` with **zero `mcp__openclaw__message` tool calls** completes cleanly, persists the final text to the claude-cli transcript, and is silently dropped by the Slack delivery layer. From the user's perspective the agent goes silent in the Slack thread — indistinguishable from "still thinking" or "stuck".

This behavior is consistent with `visibleReplies = "message_tool"` as documented, but:

1. There is **no log line** in `gateway.log` indicating that delivery was suppressed. The failure is invisible without grepping the claude-cli per-session JSONL.
2. There is **no runtime safety net** (auto-promote, hard diagnostic, or fallback) when a group-chat turn closes without the required tool call.
3. This is the **Slack analog** of the Telegram-side bugs in #76554 / #76828 / #66459, which were closed (or remain open) for Telegram-specific code paths.

The agent calls `mcp__openclaw__message` correctly on most turns (~5/6 on the affected agent during the observation window). The failure mode is concentrated on `[thinking,text]`-only turn shapes — typically meta-discussion or short conversational replies where the model doesn't reach for tools. This is the same turn shape called out as still-broken in **open** issue #66459.

## Environment

- OpenClaw: `2026.5.3` (verified via `openclaw status` → `app 2026.5.3`)
- Runtime: Node `v24.15.0`, Linux `6.8.0-87-generic`, Hetzner CPX32
- Channel: Slack workspace `T9FGQDB9D`, group channel `C0B0XQ03ZN3` (`is_group_chat: true`), thread reply
- Provider/model: `claude-cli/claude-opus-4-7` (claude-cli `v2.1.116`; claude-cli session metadata reports `version 2.1.126` for the inner SDK)
- Agent: `podavach-shopify`
- claude-cli SDK session via OpenClaw bundled backend
- Active config keys (relevant):
  ```json
  {
    "messages": { "groupChat": { "visibleReplies": "message_tool" } },
    "channels": {
      "slack": {
        "accounts": {
          "podavach-shopify": {
            "dm": { "enabled": true, "groupEnabled": true },
            "replyToMode": "off",
            "replyToModeByChatType": { "direct": "...", "group": "...", "channel": "..." },
            "requireMention": true,
            "streaming": null
          }
        }
      }
    }
  }
  ```

## Reproduction

Pre-existing thread in a Slack group channel where the agent has already replied successfully via `mcp__openclaw__message` earlier in the day.

1. User posts a meta question in the thread that doesn't naturally invite tool use (e.g. asking the agent how its scheduling/triggers work, or a short conversational follow-up).
2. Agent's claude-cli session enqueues the inbound message normally:
   ```
   queue-operation enqueue → dequeue → user message → assistant turn
   ```
3. Assistant emits the turn as: one `thinking` content block (encrypted signature, ~1700 output tokens), then one `text` content block, then `stop_reason: end_turn`. No `tool_use` blocks.
4. The `text` content is persisted to `~/.claude/projects/<cwd-mangled>/<sessionId>.jsonl`.
5. **No corresponding outbound entry** appears in the OpenClaw delivery-mirror file `~/.openclaw/agents/<agent>/sessions/<sessionId>-topic-<thread_ts>.jsonl`.
6. **No `chat.postMessage` / Slack API send** appears in `journalctl --user -u openclaw-gateway.service` for that channel ID in the relevant window.
7. **No warning, error, or diagnostic** log line is emitted indicating that delivery was suppressed.
8. Slack thread shows the user's message and nothing after.

## Evidence — observed turn shape

Successful prior turn in the same session, same thread (`mcp__openclaw__message` was called):

```
12:16:50.568  assistant  thinking
12:17:02.535  assistant  tool_use:  mcp__openclaw__message      ← delivers to Slack
12:17:03.310  user       tool_result                            ← delivery confirmed
12:17:07.388  assistant  text:      "[…cropped agent reply…]"   ✅ DELIVERED
```

Failed turn ~3 minutes later in the same session, same thread (no tool call):

```
12:20:09       inbound:  Slack message in #website, was_mentioned=true,
                         reply_to_id=1777885573.897639, sender=Oleksii Nikitin (U9FAR6MHC),
                         message_id=1777897207.975269
12:20:10.008   queue-operation enqueue   sessionId=75dbc854-5235-4ede-8d17-9d205f557ea5
12:20:10.009   queue-operation dequeue
12:20:33.602   assistant  thinking            (1767 output tokens, encrypted signature)
                                              requestId=req_011CahWzSDh5DUAQLfksZeju
                                              msg_id=msg_01K2j7n2pm8u7mgnB99GZhq7
12:20:42.169   assistant  text:      "[…cropped agent reply…]"
                                              same requestId, same msg_id
12:20:42.169   end_turn   (stop_reason: end_turn, no tool_use blocks emitted)
```

Same `requestId` and `msg_id` on both `thinking` and `text` entries — this is one Anthropic API response, split into two transcript records as streaming chunks landed. The turn is structurally complete and clean.
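
The two turn shapes above can be told apart mechanically. The following is a minimal sketch of that check, not OpenClaw code. It assumes each transcript line is a JSON object with a top-level `type` (`user` or `assistant`), a `requestId`, and a `message.content` list of typed blocks; that matches what we saw in the claude-cli JSONL but is not a documented schema. The names `find_undelivered_turns` and `load_jsonl` are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List, Optional

MESSAGE_TOOL = "mcp__openclaw__message"  # tool name as it appears in the transcript


def _blocks(record: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the typed content blocks of a record (assumed layout)."""
    content = (record.get("message") or {}).get("content") or []
    if isinstance(content, str):  # plain-string user messages
        return [{"type": "text", "text": content}]
    return [b for b in content if isinstance(b, dict)]


def find_undelivered_turns(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Flag turns that produced text but never called the message tool.

    A turn starts at a user record that is not a tool_result and runs
    until the next such record (or the end of the transcript).
    """
    flagged = []
    turn: Optional[Dict[str, Any]] = None

    def close(t: Optional[Dict[str, Any]]) -> None:
        if t is not None and t["text_blocks"] > 0 and not t["tool_called"]:
            flagged.append(t)

    for rec in records:
        blocks = _blocks(rec)
        if rec.get("type") == "user":
            if any(b.get("type") == "tool_result" for b in blocks):
                continue  # tool results belong to the turn in progress
            close(turn)
            turn = {"text_blocks": 0, "tool_called": False, "request_ids": []}
        elif rec.get("type") == "assistant" and turn is not None:
            req = rec.get("requestId")
            if req and req not in turn["request_ids"]:
                turn["request_ids"].append(req)
            for b in blocks:
                if b.get("type") == "text":
                    turn["text_blocks"] += 1
                elif b.get("type") == "tool_use" and b.get("name") == MESSAGE_TOOL:
                    turn["tool_called"] = True
    close(turn)  # note: the last turn may still be in progress on a live session
    return flagged


def load_jsonl(path: str) -> List[Dict[str, Any]]:
    """Parse a JSONL transcript, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Running `find_undelivered_turns(load_jsonl(path))` against a session transcript lists candidate dropped turns with their request IDs. Records of other types (for example the queue-operation entries) are ignored.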

Delivery-mirror file for the affected thread (last entry far predates the failed turn):

```
~/.openclaw/agents/podavach-shopify/sessions/aa828b52-167b-41e8-b6de-ff2e8970ee75-topic-1777885573.897639.jsonl
last entry: 2026-05-04T09:47:03.096Z   (~2.5h before failed turn)
```

Gateway journal for the channel ID during the 12:19:00 – 12:35:00 window:

```
$ journalctl --user -u openclaw-gateway.service \
    --since "2026-05-04 12:19:00" --until "2026-05-04 12:35:00" \
    | grep -iE "C0B0XQ03ZN3|c0b0xq03zn3|deliver|chat\.post|sendMessage"
(no output)
```

Recurring stuck-session diagnostics earlier in the day on the same thread (different sessions / earlier turns) suggest this is not a one-off:

```
[diagnostic] stuck session: sessionId=podavach-shopify
sessionKey=agent:podavach-shopify:slack:channel:c0b0xq03zn3:thread:1777885573.897639
state=processing age=136s queueDepth=1
reason=queued_work_without_active_run classification=stale_session_state
```

Each of these is followed by `[diagnostic] stuck session recovery skipped: reason=active_reply_work action=keep_lane`: the gateway sees that work *is* in progress and stays out of the way, which is the right thing to do. But when the work finishes with no tool call, nothing fills the gap.

## What it is NOT

- **Not a stale-binary trap (#76554's correction).** `openclaw status` self-reports `app 2026.5.3`, MainPID 116231 was started post-upgrade at `2026-05-04 10:06:53 UTC`, the npm package on disk is `openclaw@2026.5.3`. Verified with `ps -o etime,start,cmd` against the systemd MainPID. The systemd unit `Description=OpenClaw Gateway (v2026.5.2)` is a stale cosmetic label in the unit file; the running binary is `~/.nvm/versions/node/v24.15.0/lib/node_modules/openclaw/dist/index.js` and is 5.3.
- **Not a transport-layer SSE drop (#68412).** The model output completed normally with `stop_reason: end_turn`, no streaming abort, no proxy in the path (claude-cli direct).
- **Not an embedded ACP turn failure (#65484).** Turn closed with `isError=false`, no `ACP_TURN_FAILED` / `AcpRuntimeError`.
- **Not a `final_only` deliveryMode issue (#76828).** `acp.stream.deliveryMode` is unset; the relevant gate is `messages.groupChat.visibleReplies`, not `acp.stream.deliveryMode`.
- **Not an auth / billing / model fallback issue.** Other turns in the same session before and after the failed turn delivered correctly via `mcp__openclaw__message`. Provider/auth healthy.
- **Not a `streaming.mode` regression.** Slack `streaming` is null on the affected account (no streaming configured); failure is in the final delivery path, not partial-stream coalescing.

## Expected behavior

When a group-chat turn closes under `messages.groupChat.visibleReplies = "message_tool"` with **zero `mcp__openclaw__message` tool calls** AND a non-empty final text content block, the runtime should do **at least one** of:

(a) **Auto-promote on turn-end.** Detect this turn shape and dispatch the final assistant text block to the originating chat using the channel's natural delivery path (same path DMs use). Preserves operator intent (model is still encouraged to call the tool) without dropping the reply.

(b) **Hard diagnostic.** Emit a gateway-log line at warn or error level:

```
[stream] turn closed with 0 message-tool emissions in group chat;
visibleReplies=message_tool suppressed N text chunks
sessionId=<id> sessionKey=<key> requestId=<req>
```

This lets operators grep for the failure rather than discover it days later in a user complaint. Currently the failure is completely silent in the gateway log; we had to cross-check `mcp__openclaw__message` `tool_use` entries against `assistant text` entries in the per-session claude-cli JSONL to find it.

(c) **Per-channel/runtime hint surfaced to the agent.** For sessions destined to group-chat threads under `visibleReplies = "message_tool"`, attach a system-prompt hint like `requiredDeliverySurface: "tool:mcp__openclaw__message"` so the agent does not have to relearn the contract on every turn.

Today the runtime appears to do none of these.
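
To make option (b) concrete, here is a rough sketch of the turn-end check that would produce the proposed log line. It is illustrative only: `TurnSummary`, its fields, the logger name, and `on_turn_end` are assumptions made for this example and do not correspond to known OpenClaw internals.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("gateway.delivery")  # hypothetical logger name


@dataclass
class TurnSummary:
    """Hypothetical summary of a finished turn; field names are assumptions."""
    session_id: str
    session_key: str
    request_id: str
    is_group_chat: bool
    visible_replies: str      # value of messages.groupChat.visibleReplies
    message_tool_calls: int   # count of mcp__openclaw__message tool_use blocks
    text_chunks: int          # count of non-empty text blocks in the turn


def suppressed_delivery_diagnostic(turn: TurnSummary) -> Optional[str]:
    """Return the diagnostic line for a silently suppressed turn, else None."""
    if not turn.is_group_chat or turn.visible_replies != "message_tool":
        return None  # contract does not apply to this turn
    if turn.message_tool_calls > 0 or turn.text_chunks == 0:
        return None  # delivered explicitly, or nothing to deliver
    return (
        "[stream] turn closed with 0 message-tool emissions in group chat; "
        f"visibleReplies=message_tool suppressed {turn.text_chunks} text chunks "
        f"sessionId={turn.session_id} sessionKey={turn.session_key} "
        f"requestId={turn.request_id}"
    )


def on_turn_end(turn: TurnSummary) -> None:
    """Emit the diagnostic at warn level when a reply was suppressed."""
    line = suppressed_delivery_diagnostic(turn)
    if line is not None:
        log.warning("%s", line)
```

Keeping the predicate separate from the logging call means the same condition could later gate option (a) without changing what gets logged.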

## Suggested fix direction

In order of operator-utility-per-engineering-cost:

1. **Hard diagnostic** (cheapest, highest debugging-value). One log line whenever a group-chat turn closes with zero `mcp__openclaw__message` tool calls under `visibleReplies = "message_tool"`. No behavior change, just observability. Solves the "silently lost reply" UX class even before any auto-promote work.
2. **Auto-promote-last-text on turn-end** for group chats under `visibleReplies = "message_tool"`. Mirrors the suggested fix in #76828 (auto-promote-last-commentary on `final_only`). Preserves the explicit-tool ergonomic for normal flows; covers the long tail of `[thinking,text]`-only meta turns.
3. **Per-account opt-in fallback mode** `visibleReplies = "message_tool_or_last_text"` so operators can opt into the safety net without giving up the explicit-tool default fleet-wide.
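
As a sketch of how items 2 and 3 could fit together, the function below decides whether a turn's final text should be promoted to a normal channel delivery. It is a proposal under stated assumptions, not existing behavior: the mode value `message_tool_or_last_text` does not exist today, and `resolve_fallback_text` and its parameters are hypothetical names.

```python
from typing import Optional, Sequence

STRICT = "message_tool"                 # current mode described in this report
FALLBACK = "message_tool_or_last_text"  # proposed in item 3; does not exist today


def resolve_fallback_text(
    mode: str,
    message_tool_calls: int,
    text_blocks: Sequence[str],
) -> Optional[str]:
    """Return text to deliver via the channel's normal path, or None.

    Only the proposed fallback mode promotes text; strict mode and any
    other mode are deliberately left untouched by this sketch.
    """
    if mode != FALLBACK:
        return None
    if message_tool_calls > 0:
        return None  # the agent already delivered explicitly; avoid a double post
    non_empty = [t for t in text_blocks if t.strip()]
    # Promote only the last non-empty block rather than concatenating them.
    return non_empty[-1] if non_empty else None
```

Promoting a single block rather than joining all of them is a design choice made here to stay clear of the concatenation question raised in #74674; a real implementation may want a different policy.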

## Related issues

- **#76554** — Telegram forum topic final replies generated but not delivered on 2026.5.2. *Closed* (after the 5.2-vs-5.3 stale-binary correction). Same family on Telegram. Workaround D ("agent calls `message(action=send)` directly") is the explicit Telegram analog of the dance Slack agents must do under `visibleReplies = "message_tool"`.
- **#76828** — `deliveryMode="final_only"` silently drops long tool-loop replies when no `phase=final` chunk emitted. *Closed*. Adjacent code path; suggested fix direction (auto-promote-last-commentary, hard diagnostic) generalizes cleanly to this issue.
- **#66459** — Telegram transcript contains final reply but no outbound send for `[thinking,text]` turn on 2026.4.11. **Open**. Same turn shape as this report. The Telegram-side observation is the precise structural match.
- **#65484** — Gateway silently drops Slack replies when embedded ACP turn fails. *Closed*. Different mechanism, same user-visible "agent went silent on Slack" symptom.
- **#73619** — Default `message` tool `accountId` to the calling agent's own Slack account when omitted. **Open**. Reduces the cost of relying on explicit-tool delivery, complementary to (a) above.
- **#74674** — Delivery layer concatenates multiple text content items; pick-one policy needed. **Open**. Adjacent on the final-delivery code path.

## Notes / scope

- Reproduced on a real workload, not a synthetic test. So far observed on a single agent across a Slack channel migration cohort; the other 6 agents on the same host have not yet been checked, but the schema migration was applied identically to all 7.
- Reproduces on `[thinking,text]`-only turn shapes; cannot trivially force the model into this shape, so a deterministic minimal repro is not yet attached. Happy to capture more transcripts if useful.
- Workspaces use claude-cli SDK sessions; an OpenAI-Codex / native-Anthropic provider was not tested for this surface in 5.3.

