Summary
A live main source checkout is showing two related CPU/event-loop issues:
- Running a non-existent command such as openclaw foo does not fail fast. It loads provider plugins, loads the runtime plugin set, prints an unavailable-command error, and leaves a hot openclaw child process alive at ~90% CPU.
- The running gateway shows high sustained CPU, degraded event-loop diagnostics, and repeated provider plugin load bursts around log/model/session/control-ui surfaces.
The key failure mode is that invalid CLI input can trigger expensive plugin loading and leave a busy child process instead of returning a normal unknown-command error and exiting cleanly.
Environment
- Repo: openclaw/openclaw
- Checkout kind: source/git checkout
- Affected local commit when reproduced: 4429ee7d2e7f6261bc5af5827e20d9566b2287da
- origin/main at time of update to this issue: 359d871293e801dc9e5506b5002a4bf545c42662
- Note on version: this is a live mainline source checkout on 2026-04-30. The package/CLI banner still reports 2026.4.27, so do not interpret the original report as being limited to an old 4.27 release.
- Runtime: Node 22, systemd user gateway
- Gateway command: node dist/index.js gateway --port 18789
Repro: non-existent CLI command
Controlled repro used a separate process group so it could be cleaned up safely:
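setsid bash -lc "openclaw foo >/tmp/openclaw-direct-foo.log 2>&1" &
# Assumption: the original snippet does not show how $leader was set. $! is used here as the
# PID of the backgrounded setsid job; if setsid forks (e.g. under job control), look up the PGID instead.
leader=$!
sleep 20
ps -eo pid,ppid,pgid,etime,stat,pcpu,command | awk -v pg="$leader" '$3==pg || $1==pg {print}'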
After 20 seconds, the command still had a hot child process:
PID PPID PGID ELAPSED STAT %CPU COMMAND
1104454 1104452 1104454 00:20 Ss 0.0 bash -lc openclaw foo ...
1104481 1104454 1104454 00:19 Sl 0.8 openclaw
1104493 1104481 1104454 00:19 Rl 93.5 openclaw
The command output showed repeated plugin loading before the unknown/unavailable command error:
Config warnings:
- plugins.entries.opik-openclaw: plugin opik-openclaw: duplicate plugin id detected; global plugin will be overridden by config plugin (/home/claw/opik-openclaw/index.ts)
Config warnings:
- plugins.entries.opik-openclaw: plugin opik-openclaw: duplicate plugin id detected; global plugin will be overridden by config plugin (/home/claw/opik-openclaw/index.ts)
[plugins] loading anthropic from .../dist/extensions/anthropic/index.js
[plugins] loading byteplus from .../dist/extensions/byteplus/index.js
[plugins] loading deepseek from .../dist/extensions/deepseek/index.js
[plugins] loading moonshot from .../dist/extensions/moonshot/index.js
[plugins] loading tencent from .../dist/extensions/tencent/index.js
[plugins] loading volcengine from .../dist/extensions/volcengine/index.js
[plugins] loading xai from .../dist/extensions/xai/index.js
[plugins] loaded 7 plugin(s) (7 attempted) in 434.6ms
[plugins] loading anthropic from .../dist/extensions/anthropic/index.js
...
[plugins] loaded 7 plugin(s) (7 attempted) in 31.3ms
[plugins] loading openclaw-honcho from /home/claw/openclaw-honcho/dist/index.js
[plugins] Honcho memory plugin loaded
[plugins] loading opik-openclaw from /home/claw/opik-openclaw/index.ts
[plugins] loading acpx from .../dist/extensions/acpx/index.js
...
[plugins] loading lossless-claw from /home/claw/.openclaw/extensions/lossless-claw/dist/index.js
plugin runtime config.loadConfig() is deprecated (runtime-config-load-write); use config.current().
[plugins] loaded 120 plugin(s) (17 attempted) in 4184.6ms
[openclaw] Failed to start CLI: Error: The `openclaw foo` command is unavailable because `plugins.allow` excludes "foo". Add "foo" to `plugins.allow` if you want that bundled plugin CLI surface.
Expected behavior for openclaw foo:
- No provider catalog loads.
- No runtime plugin load.
- No background/hot child process after printing the error.
- The error should be a normal unknown-command or unavailable-command response that exits promptly.
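The ordering these expectations imply can be sketched as follows: resolve the command name against static tables first, and only call the plugin loader once the command is known and allowed. This is a minimal illustration, not the openclaw dispatcher; `CommandTables`, `DispatchResult`, `dispatch`, and the `loadPlugins` callback are all hypothetical names, and the allowlist field is only loosely modelled on `plugins.allow`.

```typescript
// Hypothetical sketch: none of these names come from the openclaw source.
interface CommandTables {
  // Commands built into the CLI itself.
  coreCommands: ReadonlySet<string>;
  // Command name -> plugin id, assumed to come from static metadata
  // rather than from importing plugin code.
  pluginCommands: ReadonlyMap<string, string>;
  // Plugin ids permitted by configuration.
  allowedPlugins: ReadonlySet<string>;
}

interface DispatchResult {
  exitCode: number;
  message: string;
  pluginsLoaded: boolean;
}

function dispatch(
  name: string,
  tables: CommandTables,
  loadPlugins: () => void,
): DispatchResult {
  const isCore = tables.coreCommands.has(name);
  const pluginId = tables.pluginCommands.get(name);

  // Unknown word: reject before any plugin work happens.
  if (!isCore && pluginId === undefined) {
    return { exitCode: 2, message: `unknown command: ${name}`, pluginsLoaded: false };
  }

  // Known plugin command whose plugin is not allowed: also reject early.
  if (!isCore && pluginId !== undefined && !tables.allowedPlugins.has(pluginId)) {
    return {
      exitCode: 2,
      message: `command unavailable: ${name} (plugin "${pluginId}" is not allowed)`,
      pluginsLoaded: false,
    };
  }

  // Only a valid, allowed command pays the plugin loading cost.
  loadPlugins();
  return { exitCode: 0, message: "", pluginsLoaded: true };
}
```

With tables like these, an input such as `foo` takes the first branch and returns a non-zero exit code without `loadPlugins` ever being called.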
Signal note: the operator report is that Ctrl-C/SIGINT does not stop the busy-looping openclaw foo path and SIGKILL is needed. In my controlled repro, SIGINT sent to the entire test process group did stop the tree, so the exact interrupt-resistant variant may depend on how the command is launched. The important reproduced bug is that the invalid command leaves a high-CPU child alive after printing the error; it should not require SIGINT, SIGTERM, or SIGKILL cleanup at all.
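For the "no hot child after the error" expectation, one possible shape is a small registry that the CLI consults on its startup-error path, so any child it spawned is signalled before the parent exits. This is a sketch under assumptions: `ChildRegistry`, `Killable`, and `failStartup` are hypothetical names, and `Killable` is only shaped after the `exitCode` and `kill` members of Node's `ChildProcess`. Whether the hot child in the repro is spawned in a way this would cover is not established by the report.

```typescript
// Hypothetical sketch: not taken from the openclaw source.
type StopSignal = "SIGTERM" | "SIGKILL";

// Minimal view of a child process: null exitCode means "still running".
interface Killable {
  readonly exitCode: number | null;
  kill(signal: StopSignal): boolean;
}

class ChildRegistry {
  private readonly children = new Set<Killable>();

  track(child: Killable): void {
    this.children.add(child);
  }

  untrack(child: Killable): void {
    this.children.delete(child);
  }

  // Signal every tracked child that has not exited; returns how many were signalled.
  terminateAll(signal: StopSignal = "SIGTERM"): number {
    let signalled = 0;
    for (const child of this.children) {
      if (child.exitCode === null && child.kill(signal)) {
        signalled += 1;
      }
    }
    this.children.clear();
    return signalled;
  }
}

// Startup-error path: clean up children first, then report a non-zero exit code.
function failStartup(
  registry: ChildRegistry,
  message: string,
): { exitCode: number; message: string; signalled: number } {
  const signalled = registry.terminateAll();
  return { exitCode: 1, message, signalled };
}
```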
Repro: channel logs / provider reloads
From the source checkout:
pnpm openclaw gateway status --deep
pnpm openclaw channels logs --lines 80
journalctl --user -u openclaw-gateway.service --since '10 minutes ago' --no-pager
ps -p $(systemctl --user show openclaw-gateway.service -p MainPID --value) -o pid,ppid,etime,pcpu,pmem,rss,stat,command
pnpm openclaw health --json
openclaw channels logs --lines 80 prints plugin load bursts before showing log lines:
[plugins] loading anthropic from .../dist/extensions/anthropic/index.js
[plugins] loading byteplus from .../dist/extensions/byteplus/index.js
[plugins] loading deepseek from .../dist/extensions/deepseek/index.js
[plugins] loading moonshot from .../dist/extensions/moonshot/index.js
[plugins] loading tencent from .../dist/extensions/tencent/index.js
[plugins] loading volcengine from .../dist/extensions/volcengine/index.js
[plugins] loading xai from .../dist/extensions/xai/index.js
[plugins] loaded 7 plugin(s) (7 attempted) in 565.2ms
[plugins] loading anthropic from .../dist/extensions/anthropic/index.js
...
[plugins] loaded 7 plugin(s) (7 attempted) in 92.8ms
Earlier in the same current-generation gateway journal, model/session surfaces triggered much larger provider-plugin reload bursts:
[plugins] loading amazon-bedrock from .../dist/extensions/amazon-bedrock/index.js
[plugins] loading amazon-bedrock-mantle from .../dist/extensions/amazon-bedrock-mantle/index.js
[plugins] loading anthropic from .../dist/extensions/anthropic/index.js
[plugins] loading anthropic-vertex from .../dist/extensions/anthropic-vertex/index.js
...
[plugins] loading zai from .../dist/extensions/zai/index.js
[plugins] loaded 49 plugin(s) (49 attempted) in 1422.3ms
[ws] ⇄ res ✓ models.list 18353ms ...
[ws] ⇄ res ✓ models.list 7590ms ...
sessions.list also showed catalog pressure:
[gateway] sessions.list continuing without model catalog after 750ms
[ws] ⇄ res ✓ sessions.list 10155ms ...
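The repeated load bursts above suggest the catalog is rebuilt per request. A process-level single-flight memo is one way to avoid that: concurrent callers share one in-flight load, later callers reuse the result, and a failed load is not cached. The sketch below is illustrative only; `memoizeSingleFlight` and `ProviderCatalog` are hypothetical names, and how the real gateway would invalidate the cache (config change, plugin install) is left open.

```typescript
// Hypothetical sketch: not taken from the openclaw source.
type ProviderCatalog = ReadonlyArray<string>;

interface Memoized<T> {
  get(): Promise<T>;
  invalidate(): void;
}

function memoizeSingleFlight<T>(load: () => Promise<T>): Memoized<T> {
  let cached: Promise<T> | undefined;

  return {
    get(): Promise<T> {
      // Reuse the in-flight or completed load if there is one.
      if (cached !== undefined) return cached;

      const attempt = load();
      cached = attempt;
      // Do not keep a failed load around: the next caller retries.
      attempt.catch(() => {
        if (cached === attempt) cached = undefined;
      });
      return attempt;
    },
    invalidate(): void {
      cached = undefined;
    },
  };
}
```

Handlers for surfaces such as models.list and sessions.list would then await the same memoized promise instead of each triggering a plugin load.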
Runtime impact observed
The gateway process stayed hot even with no active agent run:
PID ELAPSED %CPU %MEM RSS STAT COMMAND
1101846 07:12 61.8 5.4 892284 Rsl node dist/index.js gateway --port 18789
Short top samples showed the process around 70-86% CPU.
Gateway diagnostics reported event-loop degradation while active/waiting/queued work was zero:
[diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu interval=30s eventLoopDelayP99Ms=2457.9 eventLoopDelayMaxMs=5469.4 eventLoopUtilization=1 cpuCoreRatio=0.921 active=0 waiting=0 queued=0
[diagnostic] liveness warning: reasons=event_loop_delay interval=30s eventLoopDelayP99Ms=1270.9 eventLoopDelayMaxMs=1598 eventLoopUtilization=0.94 cpuCoreRatio=0.818 active=0 waiting=0 queued=0
At the same time, two Control UI clients were repeatedly polling node.list, each response taking over a second and sometimes over three seconds:
[ws] ⇄ res ✓ node.list 1480ms conn=...
[ws] ⇄ res ✓ node.list 1423ms conn=...
[ws] ⇄ res ✓ node.list 3535ms conn=...
[ws] ⇄ res ✓ node.list 3538ms conn=...
[ws] ⇄ res ✓ node.list 2026ms conn=...
[ws] ⇄ res ✓ node.list 2058ms conn=...
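Because the two clients poll the same data, a short time-to-live cache in front of the handler would let them share one computation per interval instead of each paying the full cost. The following is a minimal sketch, assuming the response can tolerate being slightly stale; `cachedFor` is a hypothetical helper, the 500 ms value in the usage note is arbitrary, and the injectable `now` clock exists only to make the behavior testable.

```typescript
// Hypothetical sketch: not taken from the openclaw source.
function cachedFor<T>(
  ttlMs: number,
  compute: () => T,
  now: () => number = () => Date.now(),
): () => T {
  let entry: { value: T; expiresAt: number } | undefined;

  return (): T => {
    const t = now();
    // Serve the cached value while it is still fresh.
    if (entry !== undefined && t < entry.expiresAt) {
      return entry.value;
    }
    // Otherwise recompute once and remember the result for ttlMs.
    const value = compute();
    entry = { value, expiresAt: t + ttlMs };
    return value;
  };
}
```

For example, wrapping an expensive listing function as `cachedFor(500, listNodes)` means any number of polls arriving within 500 ms trigger a single call to `listNodes`.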
A health probe still eventually returned ok: true, but took ~10s:
pnpm openclaw health --json
# durationMs: 10282
# plugins.errors: []
# channels signal/telegram/whatsapp running, whatsapp healthy/linked
Expected behavior
- Unknown or unavailable CLI commands should fail before provider/plugin runtime loading.
- Invalid CLI commands should not leave hot child processes alive.
- channels logs should not require loading provider/model plugins just to print recent channel logs.
- models.list / sessions.list / Control UI polling should not repeatedly cold-load all provider plugins on the gateway hot path.
- Repeated UI polling should not be able to keep the main gateway loop at ~60-80% CPU when no agent run is active.
- Event-loop delay should stay low enough for gateway health/readiness and channel handling to remain responsive.
Troubleshooting notes
- This was reproduced after a fresh source install/update and gateway restart.
- The configured channels remained healthy, so this is not a simple channel crash loop.
- There is a duplicate Opik plugin warning in this environment because config intentionally overrides the global Opik plugin with a local checkout; that warning is present but does not explain the provider catalog reload bursts across model providers or the openclaw foo hot child.
- Killing a long-running local channels logs command stopped that CLI process, but the gateway process remained CPU hot due to ongoing Control UI node.list polling and event-loop degradation.
Suspected area
Check CLI dispatch and provider/plugin catalog loading for:
- command validation happening after plugin discovery/runtime initialization,
- plugin CLI fallback treating arbitrary words as possible plugin commands before checking allowlists/known command tables,
- child process lifecycle cleanup after CLI startup errors,
- missing request-level or process-level memoization of provider plugin catalog loads,
- expensive plugin loader calls inside models.list, sessions.list, or log/status commands,
- duplicated Control UI polling across multiple clients/tabs,
- synchronous work on the gateway main loop during catalog/provider discovery.