Summary
With plugins.bundledDiscovery: "compat" (the legacy default flagged by openclaw doctor), the gateway pegs ~1 CPU core at steady state with zero active runs. CPU profile shows the time is in manifest-registry-installed, discovery, and boundary-path modules walking bundled plugin manifests in a hot loop. Setting plugins.bundledDiscovery: "allowlist" returns the gateway to ~5% CPU and 96% idle in profiles. I expect compat shouldn't have a runtime cost this large.
Environment
- OpenClaw
2026.5.7 (commit eeef486)
- Node 24.14.0 on macOS (Darwin 25.3.0, arm64)
- 9 plugins enabled, 84 disabled bundled plugins
plugins.allow set to the 9 enabled plugins
plugins.bundledDiscovery initially "compat", then "allowlist"
Observed behavior — compat mode
- Gateway uptime ~1h, no active runs (
hasActiveRun=false for all 146 sessions in sessions.list)
ps: 99.8% CPU sustained, 716 MB RSS
/readyz returned eventLoop.degraded=false (binary check), but gateway call health returned eventLoop.degraded=true with utilization=1, cpuCoreRatio≈1.0
gateway call health itself took 4.8 seconds to respond
- Under load,
delayP99Ms reached 1626 ms and delayMaxMs peaked at 84 seconds — propagated to clients as closed before connect (255 in 24h) and handshake timeout (118 in 24h)
CPU profile (V8 inspector via --inspect, 20s window, 5,059 samples, no graphs running):
34.7% (idle)
13.9% lstat (native fs)
10.7% open (native fs)
5.1% readFileUtf8 (native fs)
3.5% path.resolve
2.9% stat
2.4% existsSync
2.1% realpathSync
1.1% readJsonFileSync openclaw/json-files
0.9% isPathInside openclaw/boundary-path
0.7% resolveInstalledManifestRegistryIndexFingerprint openclaw/manifest-registry-installed
0.5% resolveBoundaryPathSync openclaw/boundary-path
0.5% resolveInstalledPackageMetadata openclaw/manifest-registry-installed
0.4% loadPluginManifest openclaw/discovery
0.4% loadInstalledPluginIndexInstallRecordsSync openclaw/manifest-registry
0.4% readTrustedPackageManifest openclaw/discovery
0.4% safeRealpathSync openclaw/discovery
0.3% readJsonObjectFileSync openclaw/manifest-registry
0.3% advanceCanonicalCursorForSegment openclaw/boundary-path
0.3% getResult openclaw/plugin-cache-primitives
37% of total time is fs syscalls (lstat + open + stat + realpath + readFile* + existsSync). The named JS frames are all in plugin-discovery/manifest-registry. Not provider plugins, not in-flight tool calls — just plugin-manifest scanning, continuously.
After — plugins.bundledDiscovery: "allowlist"
Same workload, 30s window, 8,051 samples:
96.0% (idle)
1.9% (program)
0.5% lstat
0.2% open
0.1% realpathSync
0.1% readFileUtf8
0.1% existsSync
0.0% readJsonObjectFileSync openclaw/manifest-registry (4 samples)
...all manifest-registry / discovery hot frames are gone
/readyz:
- Before:
cpuCoreRatio=1.0, delayP99Ms=1626
- After:
cpuCoreRatio≈0.05, delayP99Ms=22
gateway call health durationMs: 4826 → ~150.
Reproducer
- Install OpenClaw with the default
plugins.bundledDiscovery: "compat" (or set it explicitly).
- Configure
plugins.allow with a small subset (~9 plugins). Leave the other ~84 bundled plugins disabled but installed.
- Run the gateway with
--inspect=127.0.0.1:9229. Don't trigger any model calls.
- Profile with Chrome DevTools (or similar CDP client) for 30s.
- Observe steady-state CPU at ~100% and the manifest-registry/discovery functions dominating self-time.
openclaw config set plugins.bundledDiscovery allowlist, restart gateway, profile again. CPU drops to ~5%, hot frames are gone.
Suggested fix direction
I don't know the codebase well enough to be prescriptive, but the symptom suggests the bundled-plugin discovery in compat mode is re-walking every bundled plugin's manifest on a tight schedule (or worse, on every gateway request) instead of cache-and-invalidate-on-change. Either:
- Cache bundled-plugin discovery results across the lifetime of the gateway with explicit invalidation on plugin change, or
- Make
compat mode log a deprecation warning and shorten the discovery loop, or
- Default
bundledDiscovery to "allowlist" for new installs and let compat be opt-in for legacy configs that need it.
Worth noting that openclaw doctor already flags this with: "plugins.allow is restrictive, but bundled provider discovery is still in legacy compatibility mode. ... set plugins.bundledDiscovery to 'allowlist' after confirming omitted bundled providers are intentionally blocked." — which is exactly the right advice. The CPU cost just isn't proportional to "legacy compatibility".
Thanks
Thanks for the consistent fixes around plugin discovery / event-loop saturation in 2026.5.5–5.7 — happy to provide the raw .cpuprofile files if useful.
Summary
With
plugins.bundledDiscovery: "compat"(the legacy default flagged byopenclaw doctor), the gateway pegs ~1 CPU core at steady state with zero active runs. CPU profile shows the time is inmanifest-registry-installed,discovery, andboundary-pathmodules walking bundled plugin manifests in a hot loop. Settingplugins.bundledDiscovery: "allowlist"returns the gateway to ~5% CPU and 96% idle in profiles. I expectcompatshouldn't have a runtime cost this large.Environment
2026.5.7(commiteeef486)plugins.allowset to the 9 enabled pluginsplugins.bundledDiscoveryinitially"compat", then"allowlist"Observed behavior —
compatmodehasActiveRun=falsefor all 146 sessions insessions.list)ps: 99.8% CPU sustained, 716 MB RSS/readyzreturnedeventLoop.degraded=false(binary check), butgateway call healthreturnedeventLoop.degraded=truewithutilization=1, cpuCoreRatio≈1.0gateway call healthitself took 4.8 seconds to responddelayP99Msreached 1626 ms anddelayMaxMspeaked at 84 seconds — propagated to clients asclosed before connect(255 in 24h) andhandshake timeout(118 in 24h)CPU profile (V8 inspector via
--inspect, 20s window, 5,059 samples, no graphs running):37% of total time is fs syscalls (
lstat+open+stat+realpath+readFile*+existsSync). The named JS frames are all in plugin-discovery/manifest-registry. Not provider plugins, not in-flight tool calls — just plugin-manifest scanning, continuously.After —
plugins.bundledDiscovery: "allowlist"Same workload, 30s window, 8,051 samples:
/readyz:cpuCoreRatio=1.0,delayP99Ms=1626cpuCoreRatio≈0.05,delayP99Ms=22gateway call healthdurationMs: 4826 → ~150.Reproducer
plugins.bundledDiscovery: "compat"(or set it explicitly).plugins.allowwith a small subset (~9 plugins). Leave the other ~84 bundled plugins disabled but installed.--inspect=127.0.0.1:9229. Don't trigger any model calls.openclaw config set plugins.bundledDiscovery allowlist, restart gateway, profile again. CPU drops to ~5%, hot frames are gone.Suggested fix direction
I don't know the codebase well enough to be prescriptive, but the symptom suggests the bundled-plugin discovery in
compatmode is re-walking every bundled plugin's manifest on a tight schedule (or worse, on every gateway request) instead of cache-and-invalidate-on-change. Either:compatmode log a deprecation warning and shorten the discovery loop, orbundledDiscoveryto"allowlist"for new installs and letcompatbe opt-in for legacy configs that need it.Worth noting that
openclaw doctoralready flags this with: "plugins.allow is restrictive, but bundled provider discovery is still in legacy compatibility mode. ... set plugins.bundledDiscovery to 'allowlist' after confirming omitted bundled providers are intentionally blocked." — which is exactly the right advice. The CPU cost just isn't proportional to "legacy compatibility".Thanks
Thanks for the consistent fixes around plugin discovery / event-loop saturation in 2026.5.5–5.7 — happy to provide the raw
.cpuprofilefiles if useful.