[Bug]: Gateway main thread CPU-bound at ~100% on v2026.4.26 / current main; clean on v2026.4.22 (fs.stat storm in microtask queue) #74328

@odrobnik

Description

Bug type

Regression (worked before, now fails)

Summary

After upgrading from v2026.4.22 to v2026.4.26 (also reproduces on current main at 9bb1e59, which package.json reports as 2026.4.27), the gateway pegs its single main thread at ~100% CPU and stops responding to local probes. Same host, same ~/.openclaw, no config changes; only the checked-out revision differs.

I've seen #74209 and the regression range overlaps, but on my machine the dominant signal in a CPU sample is an `fs.stat` storm in the JS microtask queue rather than bonjour. Filing separately so the maintainers can triage it as either the same root cause or a sibling regression.

Versions

  • macOS 26.4 (Mac Studio, ARM64), Node 22 via Homebrew
  • Bad: v2026.4.26 (be8c246), and current main at 9bb1e59 (reports as 2026.4.27)
  • Good: v2026.4.22 (00bd2cf)

Steps to reproduce

Same ~/.openclaw, just switch versions:

```shell
git checkout v2026.4.26 && pnpm install && pnpm build
openclaw gateway restart
sleep 30
ps -o stat,%cpu,etime $(pgrep -f 'dist/index.js gateway')
# → R  95-100  sustained

git checkout v2026.4.22 && pnpm install && pnpm build
openclaw gateway restart
sleep 30
ps -o stat,%cpu,etime $(pgrep -f 'dist/index.js gateway')
# → S  2.6
```

Side-by-side, same host

| Probe | 4.26 / main | 4.22 |
| --- | --- | --- |
| `ps` STAT / %CPU | R / 95-100 | S / 2.6 |
| eventLoopDelayMaxMs (liveness warning) | up to 314 866 ms | none reported |
| eventLoopUtilization | 0.95–1.00 | <0.10 |
| `curl -m 3 http://127.0.0.1:18789/` | 3 s timeout | 3-9 ms |
| Discord WS lifetime | closes 1000/zombie every 60-90 s | stable |
| `openclaw gateway status` | "Connectivity probe: failed (timeout)" while runtime is "active" | OK, admin-capable |

A representative liveness warning on main:

```
[diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization
  interval=320s eventLoopDelayP99Ms=314874.8 eventLoopDelayMaxMs=314874.8
  eventLoopUtilization=0.999 cpuCoreRatio=0.585 active=0 waiting=0 queued=0
```

i.e. the loop blocked for over 5 minutes with nothing in the active queue.

CPU sample on main (5 s, idle, no incoming traffic)

`/usr/bin/sample <gateway-pid> 5`; all 4074 stacks collapse to this single path:

```
4074 uv__io_poll
+ 4074 uv__async_io
+   4074 uv__work_done
+     4074 MakeLibuvRequestCallback<uv_fs_s>::Wrapper
+       4074 node::fs::AfterStat
+         4074 MicrotaskQueue::PerformCheckpointInternal
+           4074 MicrotaskQueue::RunMicrotasks
+             4074 Builtins_PromiseFulfillReactionJob
+               4074 AsyncFunctionAwaitResolveClosure
+                 4074 <JIT JS, unsymbolicated>
```

Every sample lands in `node::fs::AfterStat`: the main thread is consumed resolving promises from a flood of `fs.stat` calls. The same 5 s sample on 4.22, with the same config and data, spends almost the whole time waiting in kqueue.

Effect on real usage

Channel messages arrive (Discord WS frames are received), the session enters state=processing and ages indefinitely without a reply — I saw agent:main:discord:direct:oliver reach age=1376s queueDepth=1 before the gateway watchdog killed and restarted the process. No subprocess is involved (no docker, no acpx wrapper, no model API call) — the wedge is purely in-JS work on the main thread.
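The starvation mechanism itself is easy to show in isolation (again a hypothetical sketch, not gateway code): as long as the microtask queue keeps refilling, macrotasks such as timers, I/O callbacks, and queued message handlers never get a turn, which is why a session can sit at queueDepth=1 and age indefinitely.

```javascript
// Hypothetical illustration (not gateway code): a self-refilling microtask
// queue starves macrotasks. The zero-delay timer below is overdue almost
// immediately, yet it cannot fire until the microtask queue drains.
async function starveTimers(iterations) {
  let timerFired = false;
  setTimeout(() => { timerFired = true; }, 0); // macrotask, due at once

  for (let i = 0; i < iterations; i++) {
    await Promise.resolve(); // each await re-enters the microtask queue
  }
  // Still false: all iterations ran before a single macrotask could.
  return timerFired;
}

starveTimers(1_000_000).then((fired) =>
  console.log('timer ran during 1e6 microtasks?', fired)); // → false
```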

Workaround

Pin to v2026.4.22, and disable the OpenClaw Auto-Update cron so the bad version isn't reapplied on the next run.
