Bug type
Performance / hang / runtime-deps reinstall loop / gateway health regression
Summary
After upgrading the same Ubuntu VPS that reproduced #75259 to the stable 2026.4.29 package, the original beta runtime-deps import failure appears fixed: the gateway reaches ready, Telegram starts, and the log contains no fresh "Cannot find package 'json5'", "Cannot find package 'openclaw'", Telegram "channel exited", or model fallback lines.
However, after the gateway reaches ready and Telegram starts, the gateway becomes event-loop starved and repeatedly spawns high-CPU pnpm install child processes under the 2026.4.29 runtime-deps root. The child process is short-lived and then reappears with a new PID. During this loop, the gateway process itself stays hot, memory grows, local RPC/status calls time out, and WebSocket handshakes fail.
Important correction to the first version of this issue: an early process snapshot missed the pnpm install child. Later repeated sampling captured multiple gateway-owned pnpm install children.
Environment
OpenClaw 2026.4.29 (a448042)
OS: linux 6.17.0-1011-oracle arm64
Node: 24.14.1
Install: npm global
Service: user systemd gateway on 127.0.0.1:18789
Channel: Telegram enabled (@Ant_clawd_bot)
Model: openai-codex/gpt-5.5
The systemd unit description still says OpenClaw Gateway (v2026.4.22), but the running binary reports OpenClaw 2026.4.29 (a448042) and the service command is:
/usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
Steps taken
Starting from a known-good rollback state on 2026.4.23, I tested the new stable package:
systemctl --user stop openclaw-gateway.service
pkill -f 'openclaw-tui' || true
TS=$(date +%Y%m%d-%H%M%S)
mkdir -p "$HOME/.openclaw/repair-$TS"
cp "$HOME/.openclaw/openclaw.json" "$HOME/.openclaw/repair-$TS/openclaw.json"
mv "$HOME/.openclaw/plugin-runtime-deps" "$HOME/.openclaw/plugin-runtime-deps.bak-$TS" 2>/dev/null || true
sudo npm install -g openclaw@2026.4.29
hash -r
openclaw --version
MARK=$(date '+%Y-%m-%d %H:%M:%S')
systemctl --user daemon-reload
systemctl --user start openclaw-gateway.service
Expected behavior
The gateway should stage/install bundled runtime dependencies once, converge, and remain responsive after Telegram starts. Local gateway probes and WebSocket handshakes should not time out because of runtime-deps work.
Actual behavior
The gateway reaches ready and starts Telegram:
Apr 30 21:40:46 polymarket-mc node[3675383]: 2026-04-30T21:40:46.963+00:00 [gateway] agent model: openai-codex/gpt-5.5
Apr 30 21:40:46 polymarket-mc node[3675383]: 2026-04-30T21:40:46.965+00:00 [gateway] http server listening (4 plugins: memory-core, memory-wiki, openclaw-web-search, telegram; 24.4s)
Apr 30 21:40:47 polymarket-mc node[3675383]: 2026-04-30T21:40:47.710+00:00 [gateway] ready
Apr 30 21:40:47 polymarket-mc node[3675383]: 2026-04-30T21:40:47.737+00:00 [telegram] [default] starting provider (@Ant_clawd_bot)
Apr 30 21:40:48 polymarket-mc node[3675383]: 2026-04-30T21:40:48.203+00:00 [telegram] menu text exceeded the conservative 5700-character payload budget; shortening descriptions to keep 62 commands visible.
Then Telegram fetch falls back and liveness/RPC failures begin:
Apr 30 21:41:09 polymarket-mc node[3675383]: [telegram] fetch fallback: enabling sticky IPv4-only dispatcher (codes=none, reason=request-timeout)
Apr 30 21:41:15 polymarket-mc node[3675383]: 2026-04-30T21:41:15.940+00:00 [diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu interval=33s eventLoopDelayP99Ms=12280.9 eventLoopDelayMaxMs=12280.9 eventLoopUtilization=1 cpuCoreRatio=1.075 active=0 waiting=0 queued=0
Apr 30 21:41:23 polymarket-mc node[3675383]: tools-invoke: tool execution failed: GatewayTransportError: gateway timeout after 10000ms
Apr 30 21:41:45 polymarket-mc node[3675383]: tools-invoke: tool execution failed: GatewayTransportError: gateway timeout after 10000ms
Apr 30 21:41:45 polymarket-mc node[3675383]: 2026-04-30T21:41:45.611+00:00 [ws] handshake timeout conn=de04f7bc-2a91-428e-93e7-fda0a574c05f peer=127.0.0.1:46694->127.0.0.1:18789 remote=127.0.0.1
Apr 30 21:43:32 polymarket-mc node[3675383]: 2026-04-30T21:43:32.444+00:00 [diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization interval=32s eventLoopDelayP99Ms=16894.7 eventLoopDelayMaxMs=16894.7 eventLoopUtilization=1 cpuCoreRatio=0.581 active=0 waiting=0 queued=0
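To track how the starvation develops over time, the liveness warnings above can be reduced to their key numbers. The following is a minimal sketch, not part of OpenClaw: the function name liveness_summary is hypothetical, and it assumes the key=value layout seen in the [diagnostic] lines quoted in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads journal lines on stdin and, for each
# "[diagnostic] liveness warning" line, prints
#   <eventLoopDelayP99Ms> <eventLoopUtilization> <cpuCoreRatio>
liveness_summary() {
  awk '/\[diagnostic\] liveness warning:/ {
    p99 = ""; elu = ""; cpu = ""
    for (i = 1; i <= NF; i++) {
      # Only consider simple key=value tokens.
      if (split($i, kv, "=") != 2) continue
      if (kv[1] == "eventLoopDelayP99Ms") p99 = kv[2]
      else if (kv[1] == "eventLoopUtilization") elu = kv[2]
      else if (kv[1] == "cpuCoreRatio") cpu = kv[2]
    }
    print p99, elu, cpu
  }'
}
```

Assuming the MARK timestamp from the steps above was recorded for this purpose, one possible invocation is: journalctl --user -u openclaw-gateway.service --since "$MARK" | liveness_summary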
Runtime-deps reinstall loop evidence
A high-CPU pnpm install child was later captured under the gateway PID:
PID PPID ELAPSED %CPU %MEM RSS STAT WCHAN COMMAND
3675383 1245 376 59.1 2.7 665876 Ssl ep_pol /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
3680105 3675383 102 47.2 5.7 1408568 Sl ep_pol node /usr/bin/pnpm install --prod --ignore-scripts --ignore-workspace --config.frozen-lockfile=false --config.minimum-release-age=0 --config.store-dir=/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977/.openclaw-pnpm-store --config.node-linker=hoisted --config.virtual-store-dir=.pnpm
Process tree confirms the child is owned by the gateway:
systemd,1
  └─systemd,1245 --user
      └─node,3675383 /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
          └─node,3680105 /usr/bin/pnpm install --prod --ignore-scripts --ignore-workspace ...
The pnpm child cwd is the runtime-deps root:
/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977
The root had already grown large:
8.0G /home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977
The runtime-deps lock owner points at the gateway process:
{
  "pid": 3675383,
  "starttime": 35519846,
  "createdAtMs": 1777585493875
}
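One way to confirm that the lock is held by the live gateway and is not a stale leftover is to compare the recorded values against /proc. The sketch below rests on an unconfirmed assumption: that the lock's starttime is field 22 of /proc/<pid>/stat (process start time in clock ticks since boot). The function names are hypothetical, and the pid and starttime are passed as arguments because the lock file path is not shown in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; ASSUMES the lock's "starttime" equals field 22
# of /proc/<pid>/stat, which this report does not confirm.

# Print field 22 (starttime) of a /proc/<pid>/stat line read from stdin.
stat_starttime() {
  local line rest
  IFS= read -r line || return 1
  # Drop "pid (comm) "; comm may itself contain spaces or parentheses,
  # so strip through the LAST ") ".
  rest="${line##*) }"
  # Remaining fields start at field 3 (state), so field 22 is the 20th.
  set -- $rest
  printf '%s\n' "${20}"
}

# Return 0 if <pid> is alive and its start time matches <starttime>.
lock_owner_alive() {
  local pid="$1" want="$2" got
  [ -r "/proc/$pid/stat" ] || return 1
  got=$(stat_starttime < "/proc/$pid/stat") || return 1
  [ "$got" = "$want" ]
}

# Example with the values from the lock above:
#   lock_owner_alive 3675383 35519846 && echo "lock owner is the live gateway"
```

Matching on start time as well as PID guards against PID reuse, which is presumably why the lock records both.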
The generated runtime-deps manifest and key package files were present:
openclaw-runtime-deps-install 34 { json5: '^2.2.3', grammy: '^1.42.0', tokenjuice: '0.7.0' }
/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977/node_modules/json5/package.json
/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977/node_modules/tokenjuice/package.json
/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977/node_modules/grammy/package.json
The logs show runtime-deps staging/install activity for tokenjuice immediately before and between the gateway timeouts:
Apr 30 21:42:59 polymarket-mc node[3675383]: [plugins] tokenjuice staging bundled runtime deps (34 specs): @agentclientprotocol/sdk@0.21.0, @clack/prompts@^1.2.0, @google/genai@^1.50.1, @grammyjs/runner@^2.0.3, @grammyjs/transformer-throttler@^1.2.1, @lydell/node-pty@1.2.0-beta.12, @mariozechner/pi-ai@0.70.6, @mariozechner/pi-coding-agent@0.70.6, @modelcontextprotocol/sdk@1.29.0, ajv@^8.20.0, chokidar@^5.0.0, commander@^14.0.3, croner@^10.0.1, dotenv@^17.4.2, global-agent@^4.1.3, grammy@^1.42.0, https-proxy-agent@^9.0.0, jiti@^2.6.1, json5@^2.2.3, jszip@^3.10.1, markdown-it@14.1.1, node-llama-cpp@3.18.1, openai@^6.34.0, semver@7.7.4, sqlite-vec@0.1.9, tar@7.5.13, tokenjuice@0.7.0, tslog@^4.10.2, typebox@1.1.34, undici@8.1.0, web-push@^3.6.7, ws@^8.20.0, yaml@^2.8.3, zod@^4.3.6
Apr 30 21:42:59 polymarket-mc node[3675383]: [plugins] tokenjuice installed bundled runtime deps in 9ms: @agentclientprotocol/sdk@0.21.0, @clack/prompts@^1.2.0, @google/genai@^1.50.1, @grammyjs/runner@^2.0.3, @grammyjs/transformer-throttler@^1.2.1, @lydell/node-pty@1.2.0-beta.12, @mariozechner/pi-ai@0.70.6, @mariozechner/pi-coding-agent@0.70.6, @modelcontextprotocol/sdk@1.29.0, ajv@^8.20.0, chokidar@^5.0.0, commander@^14.0.3, croner@^10.0.1, dotenv@^17.4.2, global-agent@^4.1.3, grammy@^1.42.0, https-proxy-agent@^9.0.0, jiti@^2.6.1, json5@^2.2.3, jszip@^3.10.1, markdown-it@14.1.1, node-llama-cpp@3.18.1, openai@^6.34.0, semver@7.7.4, sqlite-vec@0.1.9, tar@7.5.13, tokenjuice@0.7.0, tslog@^4.10.2, typebox@1.1.34, undici@8.1.0, web-push@^3.6.7, ws@^8.20.0, yaml@^2.8.3, zod@^4.3.6
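To check whether staging repeats for the same plugin, the staging lines can be counted per plugin. This is a minimal sketch, assuming the "[plugins] <name> staging bundled runtime deps" line format shown above; staging_counts is a hypothetical name, not an OpenClaw command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads journal lines on stdin and prints
# "<plugin> <count>" for every plugin that logged a staging line.
staging_counts() {
  awk '/\[plugins\] .* staging bundled runtime deps/ {
    for (i = 1; i < NF; i++) {
      # The plugin name is the token right after "[plugins]".
      if ($i == "[plugins]") { count[$(i + 1)]++; break }
    }
  }
  END { for (name in count) print name, count[name] }' | sort
}
```

A count above 1 for a plugin within one gateway lifetime would suggest that staging is re-triggered instead of converging after the first install.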
Recurrence evidence
The pnpm install process is not one long-lived child. It exits, and a new one with the same command line appears under a new PID a few seconds later:
2026-04-30T21:57:53+00:00
3675383 1245 1053 70.3 2.9 729248 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
3693373 3675383 0 140 0.5 143208 node /usr/bin/pnpm install --prod --ignore-scripts --ignore-workspace --config.frozen-lockfile=false --config.minimum-release-age=0 --config.store-dir=/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977/.openclaw-pnpm-store --config.node-linker=hoisted --config.virtual-store-dir=.pnpm
2026-04-30T21:59:23+00:00
3675383 1245 1143 72.6 3.0 745460 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
3695086 3675383 0 172 1.0 254156 node /usr/bin/pnpm install --prod --ignore-scripts --ignore-workspace --config.frozen-lockfile=false --config.minimum-release-age=0 --config.store-dir=/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977/.openclaw-pnpm-store --config.node-linker=hoisted --config.virtual-store-dir=.pnpm
2026-04-30T21:59:28+00:00
3675383 1245 1149 72.2 3.0 745460 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
3695086 3675383 5 160 2.6 660252 node /usr/bin/pnpm install --prod --ignore-scripts --ignore-workspace --config.frozen-lockfile=false --config.minimum-release-age=0 --config.store-dir=/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977/.openclaw-pnpm-store --config.node-linker=hoisted --config.virtual-store-dir=.pnpm
2026-04-30T21:59:38+00:00
3675383 1245 1159 71.8 3.0 744988 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
3695375 3675383 3 173 1.4 359216 node /usr/bin/pnpm install --prod --ignore-scripts --ignore-workspace --config.frozen-lockfile=false --config.minimum-release-age=0 --config.store-dir=/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977/.openclaw-pnpm-store --config.node-linker=hoisted --config.virtual-store-dir=.pnpm
2026-04-30T21:59:48+00:00
3675383 1245 1169 71.7 3.0 746592 /usr/bin/node /usr/lib/node_modules/openclaw/dist/index.js gateway --port 18789
3695487 3675383 2 155 1.7 432356 node /usr/bin/pnpm install --prod --ignore-scripts --ignore-workspace --config.frozen-lockfile=false --config.minimum-release-age=0 --config.store-dir=/home/ubuntu/.openclaw/plugin-runtime-deps/openclaw-2026.4.29-4eca5026e977/.openclaw-pnpm-store --config.node-linker=hoisted --config.virtual-store-dir=.pnpm
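Samples like the ones above can be collected and summarized with a small script. This is a sketch and not necessarily the exact commands used for this report: the function names are hypothetical, and it relies on procps-ng ps options (--ppid, --no-headers, etimes) as shipped on Ubuntu.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for sampling a gateway PID and its children.

# Print a UTC timestamp, then one line each for <pid> and its direct
# children: pid, ppid, elapsed seconds, %cpu, %mem, rss, command line.
snapshot_gateway() {
  local pid="$1"
  date -u '+%Y-%m-%dT%H:%M:%S+00:00'
  ps --no-headers -o pid,ppid,etimes,pcpu,pmem,rss,args -p "$pid" --ppid "$pid"
}

# Count distinct child PIDs of <parent> whose command line contains
# "pnpm install", reading snapshot output on stdin.
count_pnpm_children() {
  local parent="$1"
  awk -v parent="$parent" '
    $2 == parent && /pnpm install/ && !seen[$1]++ { n++ }
    END { print n + 0 }
  '
}

# Example:
#   while sleep 5; do snapshot_gateway 3675383; done | tee samples.log
#   count_pnpm_children 3675383 < samples.log
```

Applied to the five samples listed above, the counter reports 4 distinct pnpm install children (3693373, 3695086, 3695375, 3695487), since PID 3695086 appears in two consecutive samples.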
Notes
This appears separate from #75259. In this 2026.4.29 run there are no fresh copies of the original package-resolution blocker lines:
- no Cannot find package 'json5'
- no Cannot find package 'openclaw'
- no Telegram channel exited
- no model fallback / Unknown model on fresh start
The stable release fixes or bypasses the beta.4 import failure, but exposes a runtime-deps convergence/reinstall loop that can starve the gateway event loop.