Problem
When an agent uses sessions_send with timeoutSeconds > 0 to query another agent, and the announce step fails (e.g. after a gateway restart when channels are temporarily unavailable), the announce retry loop blocks the entire agent session for several minutes.
Observed behavior
- Agent A calls sessions_send(sessionKey, message, timeoutSeconds: 60) targeting Agent B
- Agent B responds successfully
- OpenClaw attempts the "announce" step (posting the reply to Agent A's chat channel)
- Announce fails with "Unknown channel: telegram" (channel not yet re-registered after gateway restart)
- Gateway makes up to 4 attempts (3 retries) with exponential backoff (5s → 10s → 20s delays between attempts), and each attempt has a 90s gateway timeout
- During this entire retry window (~6 minutes), Agent A's session is blocked — no inbound messages are processed
- From the user's perspective, the agent appears dead/unresponsive
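The length of the blocked window follows from the numbers above. A minimal sketch of the arithmetic, assuming every attempt runs to its full timeout (an upper bound; the function and constant names are illustrative, not OpenClaw identifiers):

```python
# Values taken from the observed behavior and the log lines in this report.
ATTEMPT_TIMEOUT_S = 90          # gateway timeout per announce attempt
BACKOFF_DELAYS_S = [5, 10, 20]  # delays before attempts 2, 3 and 4


def worst_case_block_seconds(attempt_timeout_s, backoff_delays_s):
    """Upper bound on how long the retry loop can hold the session."""
    attempts = len(backoff_delays_s) + 1  # the first attempt has no delay
    return attempts * attempt_timeout_s + sum(backoff_delays_s)


current = worst_case_block_seconds(ATTEMPT_TIMEOUT_S, BACKOFF_DELAYS_S)
print(current, round(current / 60, 1))  # 395 6.6

# Same retry schedule with a 15s per-attempt timeout, for comparison.
shorter = worst_case_block_seconds(15, BACKOFF_DELAYS_S)
print(shorter, round(shorter / 60, 1))  # 95 1.6
```

The per-attempt timeout dominates the total, so shortening it reduces the window far more than changing the backoff delays would.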
Log evidence
22:36:29 [warn] Subagent announce completion direct announce agent call transient failure, retrying 2/4 in 5s: gateway timeout after 90000ms
22:38:04 [warn] Subagent announce completion direct announce agent call transient failure, retrying 3/4 in 10s: gateway timeout after 90000ms
22:39:44 [warn] Subagent announce completion direct announce agent call transient failure, retrying 4/4 in 20s: gateway timeout after 90000ms
22:41:23 [ws] ⇄ res ✗ agent errorCode=UNAVAILABLE errorMessage=Error: Unknown channel: telegram
Impact
- Agent becomes completely unresponsive for ~6 minutes
- User messages during this window are either dropped or queued silently
- In a multi-agent setup with frequent config changes (22 gateway restarts in one day via config.patch), this happens regularly
Environment
- OpenClaw 2026.3.22 (4dcc39c)
- Multi-agent setup (5 agents, Telegram channel with 4 bot accounts)
- Gateway restarts triggered by config.patch from Control UI
Suggested improvements
- Non-blocking announce: Run the announce step asynchronously so it does not block the agent session from processing new inbound messages
- Shorter announce timeout: The 90s gateway timeout per retry attempt is very long; a configurable timeout (e.g. 10-15s) would limit the blast radius
- Circuit breaker: If the channel is known-unavailable (e.g. post-restart), skip or defer the announce rather than retrying against a dead channel
- Agent-side control: Allow agents to opt into ANNOUNCE_SKIP behavior via sessions_send parameters (e.g. announce: false) rather than requiring the target agent to reply with the magic string
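The first three suggestions could be combined roughly as follows. This is a sketch of the idea in Python asyncio, not OpenClaw's actual implementation: `announce`, `channel_ready`, `ChannelUnavailable` and both function names are hypothetical stand-ins, and the 15s default is only the example value proposed above.

```python
import asyncio
from typing import Awaitable, Callable, Sequence


class ChannelUnavailable(Exception):
    """Hypothetical error standing in for 'Unknown channel: telegram'."""


async def announce_with_budget(
    announce: Callable[[], Awaitable[None]],
    channel_ready: Callable[[], bool],
    timeout_s: float = 15.0,
    delays_s: Sequence[float] = (5.0, 10.0, 20.0),
) -> str:
    # Circuit breaker: do not retry against a channel known to be down.
    if not channel_ready():
        return "deferred"
    # First attempt runs immediately, later attempts after each backoff delay.
    for delay in (0.0, *delays_s):
        await asyncio.sleep(delay)
        try:
            # Short per-attempt timeout limits how long one attempt can hang.
            await asyncio.wait_for(announce(), timeout=timeout_s)
            return "delivered"
        except (asyncio.TimeoutError, ChannelUnavailable):
            continue
    return "failed"


def announce_in_background(announce, channel_ready, **kwargs) -> "asyncio.Task[str]":
    # Non-blocking: schedule the announce and return at once, so the caller
    # (the agent session loop) can keep processing inbound messages.
    return asyncio.create_task(announce_with_budget(announce, channel_ready, **kwargs))
```

`announce_in_background` must be called from inside a running event loop. The returned task can be awaited or inspected later to log a "deferred" or "failed" outcome without holding up the session.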
Current workaround
Using timeoutSeconds: 0 (fire-and-forget) + sessions_history to read the response avoids the announce step entirely, but loses the synchronous request-reply convenience.
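The workaround amounts to a send-then-poll loop. The sketch below shows only the control flow: `send` and `fetch_history` are hypothetical callables standing in for the sessions_send (with timeoutSeconds: 0) and sessions_history tool calls, and the assumption that history comes back as a list whose last entry is the newest message is mine, not something documented here.

```python
import time
from typing import Callable, Optional, Sequence


def send_and_poll(
    send: Callable[[], None],
    fetch_history: Callable[[], Sequence[str]],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Fire-and-forget send, then poll history until a new entry appears."""
    baseline = len(fetch_history())  # history length before sending
    send()                           # returns immediately, no announce step
    deadline = clock() + timeout_s
    while clock() < deadline:
        history = fetch_history()
        if len(history) > baseline:
            return history[-1]       # assumed: newest message is last
        sleep(interval_s)
    return None                      # no reply within the time budget
```

This keeps the caller in control of the wait budget, at the cost of extra history reads.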