Summary
Single-account Discord bot hangs indefinitely at "discord client initialized as <id> (<name>); awaiting gateway readiness" on every start, for every tested version >= v2026.3.22 (v2026.3.22, v2026.3.24, v2026.3.28). The gateway WebSocket never completes the IDENTIFY → READY handshake. v2026.3.13 works reliably with identical configuration.
Related to #53132 (multi-account variant), but this reproduces with a single bot account and is not fixed by v2026.3.24 or v2026.3.28, contrary to the multi-account resolution reported there.
Environment
- OpenClaw versions tested: v2026.3.22, v2026.3.24, v2026.3.28 (all hang); v2026.3.13 (works)
- OS: Linux Mint (6.17.0-19-generic, x86_64)
- Node: 25.8.0 (linuxbrew)
- Service: systemd user unit
- Discord accounts: 1 bot ("Kit"), 2 guilds, 63 native slash commands
- Gateway config: single-account, loopback bind, voice enabled
Reproduction
- Install openclaw >= v2026.3.22 (tested .22, .24, .28)
- Configure a single Discord bot account with native commands enabled
- Start the gateway:
  systemctl --user start openclaw-gateway
- Observe logs:
  [discord] native commands using Carbon reconcile path
  [discord] client initialized as 1479251097339166872 (Kit); awaiting gateway readiness
- Bot never transitions to "logged in to discord as <id> (<name>)". No timeout error is logged. Hangs indefinitely (tested 60+ seconds).
- Downgrade to v2026.3.13: bot logs in immediately with gatewayConnected=true.
100% reproducible across 6+ clean starts on each version (stop → install → start with no rapid restarts).
Expected
Bot should reach "logged in to discord" within ~15 seconds, as it does on v2026.3.13.
Actual
Bot hangs at "awaiting gateway readiness" forever. The 15-second DISCORD_GATEWAY_READY_TIMEOUT_MS poll at provider.lifecycle.ts:6778 never fires its timeout branch (no "gateway was not ready after 15000ms" error is ever logged).
Diagnostic Evidence
Discord API is healthy
REST API works — bot identity, command deployment, and gateway info all succeed:
$ curl -s https://discord.com/api/v10/gateway/bot -H "Authorization: Bot <token>"
{"url":"wss://gateway.discord.gg","session_start_limit":{"remaining":727,"total":1000,...},"shards":1}
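The same REST check can be scripted from Node. This is a minimal sketch, assuming Node 18+ (global fetch). The function name checkGatewayBot and the injectable fetchImpl parameter are illustrative and not part of OpenClaw.

```javascript
// Hypothetical helper (not OpenClaw code): query GET /gateway/bot and
// summarize the fields that matter before a connect attempt.
async function checkGatewayBot(token, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl("https://discord.com/api/v10/gateway/bot", {
    headers: { Authorization: `Bot ${token}` },
  });
  if (!res.ok) {
    throw new Error(`gateway/bot returned HTTP ${res.status}`);
  }
  const body = await res.json();
  const remaining = body.session_start_limit.remaining;
  return {
    url: body.url,
    shards: body.shards,
    remaining,
    // With zero remaining session starts, IDENTIFY would be rejected.
    canIdentify: remaining > 0,
  };
}
```

A call such as checkGatewayBot(process.env.DISCORD_TOKEN).then(console.log) would print the summary; the DISCORD_TOKEN variable name is an assumption.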
Raw WebSocket test succeeds instantly
Using the same token and Node.js ws module from OpenClaw's own node_modules:
WS OPEN
OP: 10 t: null d: {"heartbeat_interval":41250,...}
IDENTIFY sent
OP: 0 t: READY d: {"v":10,"user":{"username":"Kit",...},"session_type":"normal",...}
OP: 0 t: GUILD_CREATE ...
HELLO → IDENTIFY → READY completes in under 2 seconds. The token, intents, and network path are all valid.
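The test script itself was not included above. The following is a sketch of how such a probe could look, not the script that was actually run. The helper names (buildIdentify, attachProbe), the properties values, and the intents argument are assumptions; the opcodes (10 HELLO, 2 IDENTIFY, 0 DISPATCH) come from the Discord gateway protocol.

```javascript
// Sketch of a HELLO -> IDENTIFY -> READY probe (illustrative only).
function buildIdentify(token, intents) {
  return {
    op: 2, // IDENTIFY
    d: {
      token,
      intents, // assumption: caller passes the same intents the bot uses
      properties: { os: "linux", browser: "probe", device: "probe" },
    },
  };
}

// `socket` is anything with on/send/close, e.g. a WebSocket from the `ws` module.
function attachProbe(socket, token, intents, log = console.log) {
  socket.on("open", () => log("WS OPEN"));
  socket.on("message", (raw) => {
    const msg = JSON.parse(raw.toString());
    log("OP:", msg.op, "t:", msg.t);
    if (msg.op === 10) {
      // HELLO received: answer with IDENTIFY.
      socket.send(JSON.stringify(buildIdentify(token, intents)));
      log("IDENTIFY sent");
    } else if (msg.op === 0 && msg.t === "READY") {
      // Handshake confirmed; a short probe can close before the first heartbeat is due.
      socket.close();
    }
  });
}

// Possible usage with `ws` (token and intents supplied by the caller):
//   const WebSocket = require("ws");
//   attachProbe(new WebSocket("wss://gateway.discord.gg/?v=10&encoding=json"), token, intents);
```

Taking the socket as a parameter keeps the handshake logic independent of the network, so it can be exercised against a stub as well as against the real gateway.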
Socket buffer shows unread data
During the hang, ss -tp showed 106 bytes sitting unread in the gateway process's receive buffer — Discord sent data but the Node.js event loop never consumed it:
ESTAB 106 0 10.2.0.2:55566 162.159.137.232:https users:(("openclaw-gatewa",...))
The connection was later silently dropped without processing.
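For repeated captures, the relevant fields can be pulled out of the ss output programmatically. A small sketch follows; parseSsLine is a hypothetical helper, and it assumes the usual ss column order (State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port, Process).

```javascript
// Parse one connection line of `ss -tp` output into named fields.
function parseSsLine(line) {
  const [state, recvQ, sendQ, local, peer, ...rest] = line.trim().split(/\s+/);
  return {
    state,
    recvQ: Number(recvQ), // bytes received by the kernel but not yet read by the process
    sendQ: Number(sendQ),
    local,
    peer,
    process: rest.join(" "),
  };
}

const sample =
  'ESTAB 106 0 10.2.0.2:55566 162.159.137.232:https users:(("openclaw-gatewa",...))';
const conn = parseSsLine(sample);

// A non-zero Recv-Q on an ESTAB socket is the signal of interest here.
console.log(conn.state, conn.recvQ, conn.peer); // prints: ESTAB 106 162.159.137.232:https
```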
Root Cause Analysis
Traced through the minified source in provider-CmA0Hwes.js.
The race condition (same as #53132 comment 3, but single-account)
- Client constructor (line 151127 in pi-embedded-CzQCqSlH.js) calls plugin.registerClient?.(this) without awaiting the returned promise.
- SafeGatewayPlugin.registerClient (line 6247) is async: it awaits fetchDiscordGatewayInfoWithTimeout() before calling super.registerClient(client) (which calls this.connect()).
- The constructor returns immediately. The gateway's registerClient is a fire-and-forget async call.
- OpenClaw proceeds through command deployment, identity fetch, and enters runDiscordGatewayLifecycle().
- The lifecycle polls gateway.isConnected every 250ms for 15 seconds.
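The sequence above can be reproduced in isolation. The classes below are illustrative stand-ins, not the real Carbon or OpenClaw classes: SketchGatewayPlugin and SketchClient are hypothetical names, and fetchGatewayInfo stands in for fetchDiscordGatewayInfoWithTimeout().

```javascript
// Illustrative stand-ins only; these are not the bundled classes.
class SketchGatewayPlugin {
  constructor(fetchGatewayInfo) {
    this.fetchGatewayInfo = fetchGatewayInfo;
    this.client = undefined;
    this.isConnected = false;
  }

  async registerClient(client) {
    await this.fetchGatewayInfo(); // suspends here, so the caller continues first
    this.client = client;
    this.connect();
  }

  connect() {
    if (!this.client) return; // silent no-op during the race window
    this.isConnected = true; // the real plugin would open the WebSocket here
  }
}

class SketchClient {
  constructor(plugins) {
    for (const plugin of plugins) {
      plugin.registerClient?.(this); // returned promise is dropped
    }
  }
}

const plugin = new SketchGatewayPlugin(() => Promise.resolve());
new SketchClient([plugin]);

// Synchronously after construction, registration has not finished yet.
console.log(plugin.client === undefined, plugin.isConnected); // prints: true false
plugin.connect(); // a connect attempt inside the window does nothing
console.log(plugin.isConnected); // prints: false
```

In this sketch registration completes one microtask later, so it only demonstrates that the window exists. It does not explain why the real gateway never recovers.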
Why the timeout never fires
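The 15-second timeout at line 6778 (waitForDiscordGatewayReady) should fire and trigger a forced reconnect at lines 6796–6797. But in practice, no timeout error is ever logged. Two possible explanations:
- this.client is undefined during the race window: per #53132 comment 3, if the timeout handler calls gateway.connect() while registerClient hasn't finished, identify() finds this.client === undefined and silently returns. No IDENTIFY is sent, no error is thrown, isConnected stays false, and the next timeout also fails the same way.
- If drainPendingGatewayErrors() encounters a fatal error, or abortSignal is already aborted, the lifecycle returns without logging.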
Key code paths
// provider-CmA0Hwes.js:6777-6797
if (gateway && !gateway.isConnected && !lifecycleStopping) {
  const initialReady = await waitForDiscordGatewayReady({
    gateway,
    timeoutMs: 15000, // DISCORD_GATEWAY_READY_TIMEOUT_MS
    beforePoll: drainPendingGatewayErrors
  });
  if (initialReady === "timeout" && !lifecycleStopping) {
    // This branch is NEVER reached in practice
    runtime.error?.("discord: gateway was not ready after 15000ms; forcing a fresh reconnect");
    gateway?.disconnect();
    gateway?.connect(false); // connect() silently no-ops if this.client is undefined
  }
}

// pi-embedded-CzQCqSlH.js:151126-151128 (Client constructor)
for (const plugin of plugins) {
  plugin.registerClient?.(this); // NOT awaited — async registerClient is fire-and-forget
  plugin.registerRoutes?.(this);
}
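To make the two paths above easier to reason about, here is a sketch of a poll helper and a reconnect wrapper that reports the no-op case instead of swallowing it. Both functions are hypothetical: the real waitForDiscordGatewayReady may be implemented differently, and a publicly readable gateway.client property is an assumption.

```javascript
// Hypothetical poll helper: resolves "ready" or "timeout".
async function waitForGatewayReady({ gateway, timeoutMs, pollMs = 250, beforePoll }) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    beforePoll?.(); // e.g. drain queued gateway errors before each check
    if (gateway.isConnected) return "ready";
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  return "timeout";
}

// Hypothetical reconnect wrapper: logs when connect() would silently no-op.
function forceReconnect(gateway, logError = console.error) {
  if (gateway.client === undefined) {
    logError("discord: reconnect skipped, gateway has no registered client yet");
    return false;
  }
  gateway.disconnect();
  gateway.connect(false);
  return true;
}
```

Note that a loop like this depends on timers firing. If the event loop itself were stalled, neither the "ready" nor the "timeout" result would ever be produced, which would match the missing timeout log and the unread bytes seen in ss. That is a conjecture, not something the traces above confirm.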
Why v2026.3.13 works
v2026.3.13 uses the older deploy-commands flow (REST PUT to /applications/{id}/commands) instead of the "Carbon reconcile path". Its GatewayPlugin.registerClient appears to complete synchronously or fast enough that the gateway connects before the lifecycle check. The gatewayConnected=true consistently appears 1–2 seconds after "WebSocket connection opened".
Difference from #53132
| Aspect | #53132 | This issue |
| --- | --- | --- |
| Accounts | 4 bots, non-deterministic subset hangs | 1 bot, always hangs |
| v2026.3.24 | Fixed | Still broken |
| v2026.3.28 | Not tested | Still broken |
| Reproduction | Non-deterministic (0–2 of 4 succeed) | 100% deterministic |
The single-account reproduction suggests the race condition is more fundamental than concurrent IDENTIFY contention. It may be environment-dependent (Node 25.8.0, Linux, specific gateway latency).
Workaround
Pinned to v2026.3.13. Gateway connects immediately and reliably on every start.