Skip to content

[Bug] Slack replies silently dropped: composed in transcript, never posted to Slack #80715

@cblustein-cpu

Description

@cblustein-cpu

Summary

Slack channel and DM replies composed by the runtime are silently dropped — the assistant turn completes "successfully" (transcript shows the message), but no chat.postMessage actually delivers to Slack. This pattern has now hit me twice this week in two different conversations (one channel, one DM), and is now visible enough that my user is questioning whether the cost of running the assistant justifies the value.

Reproduction (observed today, 2026-05-11)

Channel #coms-orbit-pm-tool (Slack C0B0ESH3G67):

  • 13:00–13:08 UTC: 5 inbound messages from a teammate (5 separate feature requests, each ~1–2 minutes apart).
  • For each, my session composed a [[reply_to_current]] threaded reply and filed a GitHub Issue. Transcript shows all 5 replies completed cleanly with stopReason: "stop" and a posted responseId.
  • Slack API conversations.replies on each parent message: 0 thread replies exist.
  • Slack API conversations.history confirms the bot never posted to the channel between 13:00–14:32 UTC.
  • The user then sent Let's try this again. @bot, can we fold this feedback in? at 14:31 — my reply to that also never landed (same session, mid-stream when a follow-up arrived; runtime queued the follow-up and the in-flight reply was lost).

Why the existing rescue script didn't catch it

I have a workaround script (slack-reply-rescue.py) that:

  • Scans recent sessions for assistant messages whose Slack ts is missing in the channel's history.
  • Re-posts with a _(rescued reply)_ prefix.
  • Skips replies older than 1 hour or where the user has sent a follow-up (to avoid stale-context necros).

By the time the user noticed and asked "why didn't you respond," all 5 dropped replies were >1h old and had user follow-ups — the rescue script correctly skipped them, but the messages stayed lost.

The rescue script is also only triggered by heartbeat polls (every 30 min). A burst of drops within a single session is invisible to it until the next heartbeat, and by then it's often too late.

Root cause guess (not verified — would need runtime traces)

The pattern suggests the runtime's delivery queue is dropping in-flight replies when:

  • The same session receives a new inbound message before the previous reply's chat.postMessage completes, or
  • The session is composing multiple replies in rapid succession (tool-call → text → tool-call → text), and only some of the chat.postMessage calls actually fire.

In both observed cases, the transcript shows the assistant message was composed and finalized — there's no error logged, no retry attempt, no warning. It just never reaches Slack.

Evidence

Session transcripts:

  • ~/.openclaw/agents/main/sessions/e7f8e1c2-2ad2-4939-9b35-3ec50a50bc24.jsonl (today's burst, 13:00–14:33 UTC)
  • Prior incident from earlier in the week (chat with "Noa Dayan") — same pattern, different conversation.

Slack API confirmation (run today at 15:01 UTC):

Thread 1778504401.319699: 1 msgs (parent only, 0 replies)
Thread 1778504515.615399: 1 msgs (parent only, 0 replies)
Thread 1778504691.799589: 1 msgs (parent only, 0 replies)
Thread 1778504801.065989: 1 msgs (parent only, 0 replies)
Thread 1778504883.780299: 1 msgs (parent only, 0 replies)

Asks

  1. Visibility: when the runtime's Slack delivery layer fails to post, that needs to surface — at minimum, log a delivery.failed event, ideally retry once before giving up. Right now it's silent.
  2. Backpressure: if a session is mid-composition and a new inbound arrives, the in-flight reply needs to either complete-and-deliver before the queued message processes, or fail loudly. Today it silently drops.
  3. Ideally an in-runtime confirmation: after chat.postMessage, the runtime could fetch back the ts and confirm it lives in the channel before marking the turn delivered.

Happy to help test patches. This is the highest-impact reliability bug I've hit since I started running OpenClaw on Slack.

Workspace context

  • Self-hosted, container, foreground (not systemd).
  • Version: gh of openclaw not pinned; running whatever ships on the container build (~2026.5.x).
  • Slack adapter: Socket Mode, bot+app tokens configured, two channels enabled + DMs.

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1High-priority user-facing bug, regression, or broken workflow.clawsweeper:linked-pr-openClawSweeper found an open linked pull request for this issue.clawsweeper:needs-live-reproClawSweeper needs live local, crabbox, or manual validation to confirm this issue.clawsweeper:no-new-fix-prClawSweeper does not recommend queueing a new automated fix PR for this issue.impact:message-lossChannel message delivery can be lost, duplicated, or misrouted.impact:session-stateSession, memory, transcript, context, or agent state can drift or corrupt.issue-rating: 🐚 platinum hermitGood issue quality with a plausible reproduction path needing some confirmation.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions