Summary
The bug fixed in #63450 / PR #63480 ("Gateway channel sidecar startup blocked by chat.history WS request, ~80–110 s delay since v2026.4.8", closed 2026-04-09) has returned in v2026.4.25 with a longer delay (~180 s instead of ~80–110 s). The original issue thread is locked for comments, so I am filing this as a separate regression report.
The symptom matches #63450's description nearly verbatim: between the "starting channels and sidecars..." log line and the moment the channels (Browser control, Telegram provider, acpx runtime) actually start, the gateway sits silent for ~3 minutes. CLI WebSocket handshakes time out during the window, and any inbound Telegram message that arrives during it is queued and answered ~2 minutes after it lands.
Steps to reproduce
Install v2026.4.25 stable on Linux Debian 11 (Node 24.13.0 via NVM, npm-global install). Eager bundled-plugin postinstall (OPENCLAW_EAGER_BUNDLED_PLUGIN_DEPS=1) completed cleanly — ~/.openclaw/plugin-runtime-deps/openclaw-2026.4.25-<hash>/ is fully populated, so this is not on-demand plugin compilation.
Restart the gateway (systemctl --user restart openclaw-gateway on a user systemd setup).
Watch the gateway log. The ready (N plugins: ...; XX.Xs) line lands in ~10–12 s. The next non-noise log entry is ~3 minutes later.
During the ~3-minute window, run any openclaw CLI command (e.g. openclaw cron list). It hangs, and after 30 s the gateway log shows WARN handshake timeout conn=... peer=127.0.0.1:<ephemeral>.
Send a Telegram message to the bot during the window. The reply is sent ~2 minutes later (specifically, telegram sendMessage ok appears in the gateway log roughly 8 minutes after the restart).
Expected behavior
Per the resolution of #63450, channel sidecars should start within ~3 s of the "starting channels and sidecars..." log line, not ~3 minutes later. PR #63480 fixed the v2026.4.8 regression; v2026.4.25 has reintroduced an equivalent or worse blocker.
Actual behavior — log evidence (two independent restarts on same machine)
Restart 1 — 2026-04-27 13:09:33Z
13:09:48.391 INFO ready (10 plugins: acpx, active-memory, bonjour, browser, device-pair,
memory-core, memory-wiki, phone-control, talk-voice, telegram; 10.2s)
13:09:48.521 INFO starting channels and sidecars...
13:09:48.733 INFO loaded 4 internal hook handlers
↑ ~3 min of silence ↑
13:12:53.xxx WARN handshake timeout conn=... peer=127.0.0.1:56754 -> 127.0.0.1:18789
13:12:59.xxx INFO Browser control listening on http://127.0.0.1:18791/ (auth=token)
13:12:59.xxx INFO [default] starting provider (@raywu07_bot)
Restart 2 — 2026-04-27 13:52:31Z
13:52:48.055 INFO ready (10 plugins: ...; 11.6s)
13:52:48.190 INFO starting channels and sidecars...
13:52:48.390 INFO loaded 4 internal hook handlers
↑ ~3 min of silence ↑
13:55:44.533 WARN handshake timeout conn=... peer=127.0.0.1:32940 -> 127.0.0.1:18789
13:55:44.553 INFO embedded acpx runtime backend registered
13:55:44.897 INFO Browser control listening on http://127.0.0.1:18791/ (auth=token)
13:55:45.823 INFO [default] starting provider (@raywu07_bot)
14:00:39.221 INFO telegram sendMessage ok chat=... message=5593
↑ first reply finally sent — 8 min after restart
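The durations quoted in this report can be re-derived from the Restart 2 timestamps. The snippet below is only a convenience sketch: the dictionary keys are labels invented here, and the restart time is assumed to fall exactly on the second because the report gives no millisecond part for it.

```python
from datetime import datetime

# Timestamps copied from the Restart 2 excerpt above (same day, UTC).
# The key names are illustrative labels, not gateway terminology.
RESTART_2 = {
    "restart": "13:52:31.000",            # assumed .000, report gives seconds only
    "hooks_loaded": "13:52:48.390",       # last line before the silence
    "handshake_timeout": "13:55:44.533",  # first line after the silence
    "provider_start": "13:55:45.823",
    "first_reply": "14:00:39.221",
}


def seconds_between(start: str, end: str) -> float:
    """Elapsed seconds between two HH:MM:SS.mmm stamps on the same day."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


silent_window = seconds_between(RESTART_2["hooks_loaded"], RESTART_2["handshake_timeout"])
restart_to_reply = seconds_between(RESTART_2["restart"], RESTART_2["first_reply"])
provider_to_reply = seconds_between(RESTART_2["provider_start"], RESTART_2["first_reply"])

print(f"silent window:           {silent_window:.3f} s")      # 176.143 s
print(f"restart to first reply:  {restart_to_reply:.3f} s")   # 488.221 s
print(f"provider to first reply: {provider_to_reply:.3f} s")  # 293.398 s
```

For Restart 2 this gives roughly 176 s of silence and roughly 488 s (just over 8 minutes) from restart to the first Telegram reply, consistent with the "~3 min" and "8 min" annotations in the excerpt.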
Diagnostic notes
Telegram channel itself is healthy throughout the window — getMe, getMyCommands, getWebhookInfo all respond fast (<300 ms) over the Bot API. The bot is reachable; the gateway just hasn't started its provider client yet.
HTTP routes work fast — GET /health returns 200 in ~5 ms during the window. Only WebSocket handshakes (/__openclaw__/ws) hang.
No plugin error in the ready line — all 10 plugins reported loaded successfully.
OPENCLAW_EAGER_BUNDLED_PLUGIN_DEPS=1 does not help — the bundled-plugin install ran at upgrade time and the staged tree was complete by the time the gateway started. Whatever the channel-startup is waiting on, it's not on-disk plugin compilation.
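The HTTP-versus-WebSocket contrast in these notes can be checked with a small standard-library probe. This is a rough sketch under several assumptions: that GET /health is served on the same port the handshake-timeout lines point at (18789), that the gateway answers an unauthenticated upgrade request at all, and that the stall is visible at the HTTP Upgrade step rather than in an application-level exchange after the upgrade. The function names are made up for illustration.

```python
import base64
import http.client
import os
import socket
import time
from typing import Optional, Tuple


def time_http_get(host: str, port: int, path: str, timeout: float) -> Tuple[Optional[int], float]:
    """Return (HTTP status, or None on failure/timeout, elapsed seconds)."""
    started = time.monotonic()
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status: Optional[int] = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        status = None
    finally:
        conn.close()
    return status, time.monotonic() - started


def time_ws_handshake(host: str, port: int, path: str, timeout: float) -> Tuple[Optional[str], float]:
    """Send a WebSocket upgrade request and wait for the first response bytes.

    Returns (status line, or None on failure/timeout, elapsed seconds).
    """
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Key: {key}\r\n"
        "Sec-WebSocket-Version: 13\r\n"
        "\r\n"
    ).encode("ascii")
    started = time.monotonic()
    status_line: Optional[str] = None
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            data = sock.recv(1024)  # blocks until the server answers or the timeout fires
            if data:
                status_line = data.split(b"\r\n", 1)[0].decode("latin-1")
    except OSError:
        status_line = None
    return status_line, time.monotonic() - started
```

During the silent window one would expect `time_http_get("127.0.0.1", 18789, "/health", 5.0)` to come back quickly with 200, while `time_ws_handshake("127.0.0.1", 18789, "/__openclaw__/ws", 5.0)` returns `None` after the full timeout. If the upgrade itself succeeds and only the later handshake stalls, the probe would need to be extended past the HTTP step.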
Suspected cause (informed guess)
The v2026.4.25 release notes touch the runtime-context / chat-history path in several places:
"Heartbeat, cron, and exec wakeups submitted as transient runtime context (removed from visible transcripts)"
"Sessions separate reset freshness from store updatedAt (heartbeat/cron/exec no longer prevent daily/idle resets)"
"Embedded runtime context sent as hidden next-turn custom message (not visible user prompt)"
"Doctor repairs 2026.4.24 transcripts with duplicated prompt-rewrite branches"
If any of these reintroduced a sync chat.history read on the channel-startup path (which was the root cause #63480 originally fixed by deferring), the symptom and timing would match exactly. Worth comparing the channel-startup hook order against the pre-#63480 codepath.
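To make the suspected mechanism concrete, here is a toy model of the difference between waiting for a slow history read before starting channels and deferring it. It is not OpenClaw code (the gateway runs on Node); every name in it is hypothetical, and a 0.2 s sleep stands in for the multi-minute stall.

```python
import asyncio
import time

HISTORY_DELAY_S = 0.2  # stand-in for the slow chat.history read (hypothetical value)


async def load_chat_history() -> str:
    """Hypothetical slow history read."""
    await asyncio.sleep(HISTORY_DELAY_S)
    return "history"


async def start_channels(startup_began: float) -> float:
    """Hypothetical channel start; returns how long after startup it ran."""
    return time.monotonic() - startup_began


async def startup_blocking() -> float:
    began = time.monotonic()
    await load_chat_history()  # channels cannot start until the read finishes
    return await start_channels(began)


async def startup_deferred() -> float:
    began = time.monotonic()
    history_task = asyncio.create_task(load_chat_history())  # scheduled, not awaited yet
    channel_delay = await start_channels(began)  # channels start right away
    await history_task  # the read still completes, just off the startup path
    return channel_delay


blocking_delay = asyncio.run(startup_blocking())
deferred_delay = asyncio.run(startup_deferred())
print(f"blocking: channels start after {blocking_delay:.3f} s")
print(f"deferred: channels start after {deferred_delay:.3f} s")
```

In the blocking variant the channel delay tracks the history delay; in the deferred variant it stays near zero. That is the shape of the difference one would look for when comparing the current startup order against the pre-#63480 codepath.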
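Mitigation suggestion
Two options worth considering, possibly in combination:
Stop logging ready (...) until channels are actually up. The misleading ready log makes ops scripts (and humans) assume the gateway is usable when it isn't. Either delay the ready line until the channels have started (~3 minutes later in the restarts above, which matches reality) or emit a separate available event after channels register.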
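Until something like an available event exists, an ops script could approximate it by ignoring the ready line and waiting for the first channel-start line instead. The sketch below assumes the marker strings seen in the excerpts above are stable enough to match on, which the report does not establish. The function name is made up.

```python
from typing import Iterable, Optional

# Substrings taken from the log excerpts in this report; treating them as
# stable "channel is up" markers is an assumption.
CHANNEL_START_MARKERS = (
    "Browser control listening",
    "starting provider",
    "embedded acpx runtime backend registered",
)


def first_channel_line(log_lines: Iterable[str]) -> Optional[str]:
    """Return the first channel-start line seen after sidecar startup begins."""
    sidecar_start_seen = False
    for line in log_lines:
        if "starting channels and sidecars" in line:
            sidecar_start_seen = True
        elif sidecar_start_seen and any(marker in line for marker in CHANNEL_START_MARKERS):
            return line
    return None


# Lines copied from the Restart 2 excerpt.
RESTART_2_LOG = [
    "13:52:48.055 INFO ready (10 plugins: ...; 11.6s)",
    "13:52:48.190 INFO starting channels and sidecars...",
    "13:52:48.390 INFO loaded 4 internal hook handlers",
    "13:55:44.533 WARN handshake timeout conn=... peer=127.0.0.1:32940 -> 127.0.0.1:18789",
    "13:55:44.553 INFO embedded acpx runtime backend registered",
    "13:55:44.897 INFO Browser control listening on http://127.0.0.1:18791/ (auth=token)",
]

usable_line = first_channel_line(RESTART_2_LOG)
print(usable_line)
```

On the Restart 2 excerpt this skips the ready line at 13:52:48.055 and returns the 13:55:44.553 line, about three minutes later.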
Bug type
Regression (worked before, now fails)
Beta release blocker
No
Additional diagnostic note
The active-memory plugin's pre-reply sub-agent run takes ~45 s despite a configured timeoutMs: 15000. This is documented separately in #71127 ("[Bug]: Stuck processing sessions are detected but never aborted — gateway requires external restart to recover"). If both symptoms share the same lock/queue holdup, fixing this regression should fix that secondary case too.
Environment
npm install -g openclaw@2026.4.25 --ignore-scripts + restore deps + eager bundled-plugin postinstall
acpx, active-memory, bonjour (… OPENCLAW_DISABLE_BONJOUR=1), browser, device-pair, memory-core, memory-wiki, phone-control, talk-voice, telegram
OPENCLAW_SERVICE_REPAIR_POLICY=external, OPENCLAW_DISABLE_BONJOUR=1, OPENCLAW_EAGER_BUNDLED_PLUGIN_DEPS=1
Related