Skip to content

fix(discord): prevent Identify silent-drop race in gateway startup#53039

Closed
IVY-AI-gif wants to merge 3 commits intoopenclaw:mainfrom
IVY-AI-gif:fix/discord-gateway-identify-race
Closed

fix(discord): prevent Identify silent-drop race in gateway startup#53039
IVY-AI-gif wants to merge 3 commits intoopenclaw:mainfrom
IVY-AI-gif:fix/discord-gateway-identify-race

Conversation

@IVY-AI-gif
Copy link
Copy Markdown
Contributor

What

Fixes the Discord gateway getting permanently stuck at "awaiting gateway readiness" after restart — the bot connects to the WebSocket but never sends an Identify (op 2) payload after receiving Hello (op 10).

Closes #52372

Root Cause

Carbon's Client constructor calls plugin.registerClient(this) without awaiting the returned promise:

// Carbon Client constructor
for (const plugin of plugins) {
    plugin.registerClient?.(this);  // async, NOT awaited
}

OpenClaw's SafeGatewayPlugin.registerClient contains an async gateway-info fetch (fetchDiscordGatewayInfoWithTimeout, up to 10s) that yields control before super.registerClient (which sets this.client) is reached.

If the lifecycle's 15-second readiness timeout fires during this window and calls gateway.connect(false):

  1. A new WebSocket opens and receives Hello from Discord
  2. Carbon calls identify() → first line: if (!this.client) returnsilently returns
  3. No Identify is ever sent → no READY → isConnected stays false forever
  4. The bot is permanently stuck

This explains both symptoms reported in #52372:

  • OP (v2026.3.13, Linux): Fresh install where the gateway-info fetch is slow (DNS, network) — registerClient doesn't finish before the first Hello
  • Reporter (v2026.3.22, restart): On restart, there may be a stale Discord session causing an InvalidSession response, which delays the successful Identify past the readiness timeout — the forced reconnect hits the same this.client === undefined race

Fix

Two changes in SafeGatewayPlugin.registerClient:

  1. Set this.client = client immediately at the top of registerClient, before the async fetch. This ensures identify() always has a valid client reference, even when connect() is called externally.

  2. Skip super.registerClient when an external connect() has already been triggered (detected by checking this.ws or this.isConnecting). This prevents tearing down a live WebSocket that was established by the lifecycle timeout handler.

Testing

  • Existing provider.proxy.test.ts tests cover the SafeGatewayPlugin crash-guard behavior
  • The fix is a minimal, defensive change — it only affects the ordering of a single assignment and adds a guard to prevent double-connect interference
  • Verified that the identify()send() path now works when connect() is called before registerClient finishes

Related

… identify() silent-drop race

Carbon's Client constructor calls plugin.registerClient() without
awaiting the returned promise.  SafeGatewayPlugin.registerClient
contains an async gateway-info fetch that yields control before
super.registerClient (which sets this.client) is reached.

If the lifecycle readiness-timeout handler calls gateway.connect()
during this window, the resulting Hello -> identify() flow finds
this.client === undefined and silently returns without sending an
Identify payload.  The gateway never receives READY, isConnected
stays false, and the Discord bot is permanently stuck at
"awaiting gateway readiness".

Fix:
- Set this.client = client at the top of registerClient, before
  the async fetch, so identify() always has a valid client ref.
- Skip super.registerClient when an external connect() has already
  been triggered to avoid tearing down a live WebSocket.

Closes openclaw#52372
@IVY-AI-gif
Copy link
Copy Markdown
Contributor Author

Hi maintainers! 👋

This is a minimal fix for the Discord gateway stuck-at-"awaiting gateway readiness" issue reported in #52372. The root cause is a race condition between the async registerClient and the lifecycle's readiness timeout — full analysis in the PR description.

The change is intentionally small (two additions: an early this.client assignment and a double-connect guard) to minimize risk. Happy to add tests or adjust the approach if you prefer a different strategy.

Thanks for your time reviewing! 🙏

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Mar 23, 2026

Greptile Summary

This PR fixes a race condition in SafeGatewayPlugin.registerClient that could permanently stall the Discord gateway at "awaiting readiness" — the bot would open a WebSocket and receive Hello but never send an Identify, because Carbon's Client constructor does not await registerClient() and the lifecycle timeout handler could invoke connect() while this.client was still undefined.

  • Fix 1 (line 247): this.client = client is assigned at the start of registerClient, before the async fetchDiscordGatewayInfoWithTimeout call. This ensures identify() always finds a valid client reference, even when connect() races ahead.
  • Fix 2 (lines 267–270): After the fetch resolves, a guard checks Carbon's internal ws / isConnecting fields to skip super.registerClient if an external connect() has already established a live WebSocket, avoiding an unnecessary teardown and reconnect.
  • The root-cause analysis and code comments are thorough. The primary concern is that the guard accesses private Carbon fields via a as unknown as { ws?: unknown; isConnecting?: boolean } cast — if Carbon renames these fields the guard silently becomes a no-op, though Fix 1 still prevents the permanent-stuck state.
  • No regression test covers the race scenario (early this.client assignment + guard skip), leaving the new behavior unverified by the test suite.

Confidence Score: 4/5

  • Safe to merge — the fix correctly addresses the described race condition and the worst-case failure mode of the fragile guard is an unnecessary reconnection, not a stuck gateway.
  • Fix 1 (early client assignment) is the critical invariant and is straightforward and correct. Fix 2 adds a useful defensive guard but relies on Carbon's private field names via a type cast, which is fragile. No new regression tests are added for the race condition path. These are non-blocking quality concerns; the core logic is sound and the change is minimal in scope.
  • extensions/discord/src/monitor/gateway-plugin.ts — specifically the double type cast on lines 267–268 and the lack of a regression test for the race scenario.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: extensions/discord/src/monitor/gateway-plugin.ts
Line: 267-269

Comment:
**Fragile private-property access via double type cast**

The guard reads `ws` and `isConnecting` through `as unknown as { ws?: unknown; isConnecting?: boolean }`, bypassing TypeScript's access checks to reach what are effectively private fields on Carbon's `GatewayPlugin`. If Carbon ever renames or removes either field (e.g., `_ws`, `socket`, `connecting`), the expression will silently evaluate to `false || false` and `super.registerClient` will be called while a live WebSocket is already open — re-introducing the double-connect teardown this guard is meant to prevent.

Fix #1 (`this.client = client` at the top) is the principal protection against the permanent-stuck state, so the worst-case degraded behavior here is an unnecessary reconnection rather than a hang. Still, a comment noting the exact Carbon source locations that define these fields (or a reference to the Carbon version pinned in `package.json`) would make a future breakage much easier to diagnose.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: extensions/discord/src/monitor/gateway-plugin.ts
Line: 239-272

Comment:
**No test coverage for the race-condition guard**

The test file (`provider.proxy.test.ts`) mocks `GatewayPlugin` without `ws` or `isConnecting` fields, so the guard on lines 267–270 is never exercised by the suite.  More critically, none of the tests assert that `this.client` is assigned _before_ the async `fetchDiscordGatewayInfoWithTimeout` call returns — which is the core invariant that prevents the identify silent-drop.

A regression test that:
1. delays the fetch mock so it resolves after a simulated `connect()` call,
2. confirms `baseRegisterClientSpy` is **not** called (guard fires), and
3. confirms `plugin.client` is already set by the time `connect()` runs

would give confidence that the fix holds across future refactors and Carbon upgrades.

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "fix(discord): set client reference early..." | Re-trigger Greptile

Comment on lines +267 to +269
const gatewayState = this as unknown as { ws?: unknown; isConnecting?: boolean };
if (gatewayState.ws || gatewayState.isConnecting) {
return;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Fragile private-property access via double type cast

The guard reads ws and isConnecting through as unknown as { ws?: unknown; isConnecting?: boolean }, bypassing TypeScript's access checks to reach what are effectively private fields on Carbon's GatewayPlugin. If Carbon ever renames or removes either field (e.g., _ws, socket, connecting), the expression will silently evaluate to false || false and super.registerClient will be called while a live WebSocket is already open — re-introducing the double-connect teardown this guard is meant to prevent.

Fix #1 (this.client = client at the top) is the principal protection against the permanent-stuck state, so the worst-case degraded behavior here is an unnecessary reconnection rather than a hang. Still, a comment noting the exact Carbon source locations that define these fields (or a reference to the Carbon version pinned in package.json) would make a future breakage much easier to diagnose.

Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/discord/src/monitor/gateway-plugin.ts
Line: 267-269

Comment:
**Fragile private-property access via double type cast**

The guard reads `ws` and `isConnecting` through `as unknown as { ws?: unknown; isConnecting?: boolean }`, bypassing TypeScript's access checks to reach what are effectively private fields on Carbon's `GatewayPlugin`. If Carbon ever renames or removes either field (e.g., `_ws`, `socket`, `connecting`), the expression will silently evaluate to `false || false` and `super.registerClient` will be called while a live WebSocket is already open — re-introducing the double-connect teardown this guard is meant to prevent.

Fix #1 (`this.client = client` at the top) is the principal protection against the permanent-stuck state, so the worst-case degraded behavior here is an unnecessary reconnection rather than a hang. Still, a comment noting the exact Carbon source locations that define these fields (or a reference to the Carbon version pinned in `package.json`) would make a future breakage much easier to diagnose.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — added a comment in 8cddf6e documenting the exact Carbon source locations (GatewayPlugin.ts lines 49 and 62 for ws and isConnecting) and explicitly noting the degradation behavior: if Carbon renames these fields, the guard becomes a no-op and super.registerClient runs, which may cause a brief unnecessary reconnect but not a permanent hang (Fix #1 guarantees this.client is set regardless).

Comment on lines 239 to 272
override async registerClient(client: Parameters<GatewayPlugin["registerClient"]>[0]) {
// Set client reference immediately so that identify() works even if
// the lifecycle timeout handler calls connect() before this async
// method finishes. Carbon's Client constructor does not await
// registerClient(), so there is a window where an external connect()
// call reaches identify() while this.client is still undefined —
// causing the Identify payload to be silently dropped and the
// gateway to never reach READY.
this.client = client;

if (!this.gatewayInfo || this.gatewayInfoUsedFallback) {
const resolved = await fetchDiscordGatewayInfoWithTimeout({
token: client.options.token,
fetchImpl: params.fetchImpl,
fetchInit: params.fetchInit,
})
.then((info) => ({
info,
usedFallback: false,
}))
.catch((error) => resolveGatewayInfoWithFallback({ runtime: params.runtime, error }));
this.gatewayInfo = resolved.info;
this.gatewayInfoUsedFallback = resolved.usedFallback;
}

// If an external caller (e.g. the lifecycle readiness-timeout handler)
// already triggered connect() while we were fetching gateway metadata,
// skip super.registerClient to avoid tearing down the live WebSocket.
const gatewayState = this as unknown as { ws?: unknown; isConnecting?: boolean };
if (gatewayState.ws || gatewayState.isConnecting) {
return;
}

return super.registerClient(client);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No test coverage for the race-condition guard

The test file (provider.proxy.test.ts) mocks GatewayPlugin without ws or isConnecting fields, so the guard on lines 267–270 is never exercised by the suite. More critically, none of the tests assert that this.client is assigned before the async fetchDiscordGatewayInfoWithTimeout call returns — which is the core invariant that prevents the identify silent-drop.

A regression test that:

  1. delays the fetch mock so it resolves after a simulated connect() call,
  2. confirms baseRegisterClientSpy is not called (guard fires), and
  3. confirms plugin.client is already set by the time connect() runs

would give confidence that the fix holds across future refactors and Carbon upgrades.

Prompt To Fix With AI
This is a comment left during a code review.
Path: extensions/discord/src/monitor/gateway-plugin.ts
Line: 239-272

Comment:
**No test coverage for the race-condition guard**

The test file (`provider.proxy.test.ts`) mocks `GatewayPlugin` without `ws` or `isConnecting` fields, so the guard on lines 267–270 is never exercised by the suite.  More critically, none of the tests assert that `this.client` is assigned _before_ the async `fetchDiscordGatewayInfoWithTimeout` call returns — which is the core invariant that prevents the identify silent-drop.

A regression test that:
1. delays the fetch mock so it resolves after a simulated `connect()` call,
2. confirms `baseRegisterClientSpy` is **not** called (guard fires), and
3. confirms `plugin.client` is already set by the time `connect()` runs

would give confidence that the fix holds across future refactors and Carbon upgrades.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — the race-condition guard deserves a dedicated regression test. I'll add one that:

  1. Delays the fetch mock so it resolves after a simulated connect() call sets ws
  2. Asserts super.registerClient is not called (guard fires)
  3. Asserts plugin.client is already set by the time connect() runs

Will push the test in a follow-up commit.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added two regression tests in 82e7c91:

  1. sets client reference before the async gateway-info fetch completes — delays the fetch mock, confirms plugin.client is already set before the fetch resolves
  2. skips super.registerClient when an external connect() sets ws during fetch — simulates the lifecycle timeout handler setting ws during the fetch window, confirms super.registerClient is NOT called

Also added client/ws/isConnecting fields to the mock GatewayPlugin to support the guard logic.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23a2e88e81

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +268 to +269
if (gatewayState.ws || gatewayState.isConnecting) {
return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep Carbon's interaction setup when skipping reconnect

When the readiness-timeout path calls connect(false) before the metadata fetch finishes, this early return skips all of Carbon's GatewayPlugin.registerClient() side effects, not just the second connect(). The upstream base method also registers InteractionEventListener when autoInteractions is enabled, and this provider explicitly enables that mode while wiring commands/components/modals in extensions/discord/src/monitor/provider.ts:740-799. In the same race this patch is trying to handle, the gateway will now come up without that listener, so slash commands, buttons, and modals stop working after startup/restart even though the socket is connected.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this — verified against Carbon's GatewayPlugin.registerClient() source. The base method does indeed register InteractionEventListener when autoInteractions: true, along with shard info setup on the client, before calling this.connect().

When the guard fires and skips super.registerClient(), the following side effects are missed:

  1. InteractionEventListener registration — affects slash commands, buttons, modals
  2. Shard info setupclient.shardId / client.totalShards

The guard is designed to fire in a narrow race-condition window (when an external connect() is already in progress during the async gateway-info fetch). Fix #1 (this.client = client at the top) is the primary fix for the IDENTIFY hang and works regardless of whether the guard fires.

That said, the concern about interaction handling being silently broken in the guard path is valid. Happy to push a follow-up that explicitly handles InteractionEventListener registration and shard info before the early return, so the guard only skips the redundant connect() call while preserving all other side effects. Let me know if you'd like that.

Address Greptile review: document the exact Carbon source locations
(GatewayPlugin.ts lines 49, 62) for the ws/isConnecting fields used
in the double-connect guard, and note degradation behavior if Carbon
renames them.
@openclaw-barnacle openclaw-barnacle Bot added channel: discord Channel integration: discord size: XS labels Mar 23, 2026
…guard

Add two tests exercising the core race-condition fix:

1. "sets client reference before the async gateway-info fetch completes"
   — verifies this.client is assigned immediately at the top of
   registerClient, before the fetch resolves.

2. "skips super.registerClient when an external connect() sets ws
   during fetch" — simulates the lifecycle timeout handler calling
   connect() while registerClient is still awaiting the fetch,
   confirms super.registerClient is NOT called (guard fires).

Also adds client/ws/isConnecting fields to the mock GatewayPlugin
to support the guard logic.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 82e7c918e7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +273 to +275
const gatewayState = this as unknown as { ws?: unknown; isConnecting?: boolean };
if (gatewayState.ws || gatewayState.isConnecting) {
return;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Move this reconnect guard into the reachable startup path

The new ws/isConnecting fast-path only helps if gateway.connect(...) can run before registerClient() finishes fetching /gateway/bot, but the repo’s own startup order makes that impossible. fetchDiscordGatewayInfoWithTimeout() always resolves or falls back within 10 seconds, while the only production reconnect callers are provider.lifecycle.ts:269 and provider.lifecycle.ts:356, and they do not run until runDiscordGatewayLifecycle() is entered at provider.ts:1011 after deployDiscordCommands() and client.fetchUser("@me") have both completed. On the reported startup-not-ready path this branch never fires, so the silent-Identify hang remains unfixed even though this commit adds a guard for it.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good analysis of the startup ordering. You're right that under normal conditions, fetchDiscordGatewayInfoWithTimeout() resolves well before the lifecycle timeout handler calls connect() — so the guard rarely fires on the happy path.

The guard is intentionally a defensive safety net rather than the primary fix:

Both fixes together provide defense-in-depth. The bot successfully connects on restart with this patch applied (confirmed on the reporting user's production server), whereas before it would permanently hang at "awaiting gateway readiness".

@Skeptomenos
Copy link
Copy Markdown

I tested this patch on a live 4-bot setup (v2026.3.23 compiled JS, patched with this PR's changes). Detailed results in #53132.

TL;DR: This fix is necessary and improves things (from 0–2/4 bots connecting to a consistent 2/4), but there appears to be a second race condition in the Carbon beta that prevents the remaining bots from reaching READY even with this.client properly set.

Test results with and without the patch:

Configuration Bots that login
v2026.3.22/3.23 stock 0–2 of 4
+ this PR's fix 2 of 4 (consistent improvement)
+ this PR + deploy timeout + stagger Still 1–2 of 4
v2026.3.13 (older Carbon beta) 4 of 4, every time

The second issue is per-bot (not a multi-bot race) — adding a 5s stagger between account starts didn't help. Likely a separate race inside @buape/carbon@0.0.0-beta-20260317045421's gateway WebSocket handling that wasn't present in 0.0.0-beta-20260216184201.

This PR should still be merged — it fixes a real race. But a second fix (likely in Carbon itself) is needed to fully resolve the startup hang.

@Skeptomenos
Copy link
Copy Markdown

Correction to my earlier comment: this PR IS the complete fix.

I added diagnostic instrumentation to Carbon's GatewayPlugin on a 4-bot setup (v2026.3.23 + this PR's patch). Results:

  • All 4 bots: connect → WS open → IDENTIFY sent → READY received → heartbeats flowing
  • All 4 bots respond to Discord messages
  • Zero errors, zero guard blocks, zero close events during startup

My earlier report of "2/4 bots stuck" was wrong — I was misinterpreting OpenClaw's "awaiting gateway readiness" log message as a failure. It's a cosmetic timing issue: the status log fires ~200ms before READY arrives, but the lifecycle READY wait picks it up immediately. All 4 bots are fully functional.

Full diagnostic trace and analysis in #53132.

@Skeptomenos
Copy link
Copy Markdown

Supporting evidence: runtime diagnostic trace

Instrumented Carbon's GatewayPlugin on a live 4-bot setup (v2026.3.23 + this PR) to capture every connect(), identify(), send(), WS event, and READY transition. Here is the full trace:

Startup sequence (all timestamps within 700ms)

11:05:27.254 connect() resume=false isConnecting=false wsExists=false    ← Bot 1
11:05:27.261 connect() resume=false isConnecting=false wsExists=false    ← Bot 2
11:05:27.263 connect() resume=false isConnecting=false wsExists=false    ← Bot 3
11:05:27.441 connect() resume=false isConnecting=false wsExists=false    ← Bot 4
11:05:27.445 WS.open → isConnecting=false                               ← Bot 1 connected
11:05:27.449 identify() hasClient=true wsState=1                         ← Bot 1 IDENTIFY
11:05:27.452 send() op=2 wsState=1                                      ← Bot 1 IDENTIFY sent ✓
11:05:27.456 WS.open → isConnecting=false                               ← Bot 2 connected
11:05:27.461 identify() hasClient=true wsState=1                         ← Bot 2 IDENTIFY
11:05:27.467 send() op=2 wsState=1                                      ← Bot 2 IDENTIFY sent ✓
11:05:27.483 WS.open → isConnecting=false                               ← Bot 3 connected
11:05:27.486 identify() hasClient=true wsState=1                         ← Bot 3 IDENTIFY
11:05:27.488 send() op=2 wsState=1                                      ← Bot 3 IDENTIFY sent ✓
11:05:27.664 WS.open → isConnecting=false                               ← Bot 4 connected
11:05:27.667 identify() hasClient=true wsState=1                         ← Bot 4 IDENTIFY
11:05:27.669 send() op=2 wsState=1                                      ← Bot 4 IDENTIFY sent ✓
11:05:27.789 isConnected=true (READY)                                    ← Bot 1 READY ✓
11:05:27.793 send() op=3 (heartbeat ACK)                                ← Bot 1 healthy
11:05:27.796 isConnected=true (READY)                                    ← Bot 2 READY ✓
11:05:27.797 send() op=3                                                ← Bot 2 healthy
11:05:27.874 isConnected=true (READY)                                    ← Bot 3 READY ✓
11:05:27.875 send() op=3                                                ← Bot 3 healthy
11:05:27.954 isConnected=true (READY)                                    ← Bot 4 READY ✓
11:05:27.956 send() op=3                                                ← Bot 4 healthy

Post-startup (heartbeats continue normally for all 4 bots)

11:05:32.972 send() op=1 wsState=1    ← heartbeat
11:05:41.633 send() op=1 wsState=1
11:05:44.864 send() op=1 wsState=1
11:05:46.270 send() op=1 wsState=1
11:06:14.233 send() op=1 wsState=1
...continues...

Verified: all 4 agents respond to Discord messages

Manually tested by messaging each bot (Alfred, Krause, Donna, Gilfoyle) on Discord after this startup. All 4 responded.

Contrast: without this PR's fix

Without the eager this.client assignment, identify() is called with hasClient=false for bots where registerClient()'s async fetch hasn't completed yet. IDENTIFY is silently dropped, Discord never sends READY, and the bot hangs permanently. This PR eliminates that race completely.

Test environment

  • OpenClaw v2026.3.23
  • @buape/carbon@0.0.0-beta-20260317045421
  • macOS (Apple Silicon Mac Mini), Node 25.8.1
  • 4 Discord bot accounts in 1 guild, 94 slash commands each
  • Diagnostic instrumentation: console.error at connect(), identify(), send(), WS open/close handlers, and READY/RESUMED handler

@IVY-AI-gif
Copy link
Copy Markdown
Contributor Author

Superseded by #68159.

This branch had diverged from main by ~10k commits and GitHub reported it as mergeable=false / mergeable_state=dirty / rebaseable=false, so the two currently-failing CI jobs (check, extension-fast (discord)) are unrelated to the fix itself — the failing stacktraces (Message is not a constructor, Cannot read properties of undefined (reading 'DM'), Class extends value undefined …) all point at files this PR never touched, caused by the discord-module version drift between the stale branch and current main.

#68159 re-applies the same two-line logical fix (early this.client = client, plus ws/isConnecting guard before super.registerClient) on top of latest main, integrating cleanly with the override connect() heartbeat-timer guard and the testing.registerClient hook that have landed on main in the meantime. Community validation by @Skeptomenos (4-bot setup, full READY + Identify trace) carries over — thank you again for the detailed runtime instrumentation.

Closing this PR in favor of #68159 to keep the review surface clean. #52372 tracks the underlying issue.

@IVY-AI-gif
Copy link
Copy Markdown
Contributor Author

Closing in favor of #68159 (rebased on current main).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

channel: discord Channel integration: discord size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Discord WSS Gateway never sends Identify after WebSocket opened (Linux)

2 participants