fix: local GGUF embedding model warmup blocks Node.js event loop for minutes on startup (ARM64/Pi) #75657

@DerFlash

Description

Summary

When memorySearch.provider is set to "local" (the default), the Gateway initializes a GGUF embedding model via node-llama-cpp on the main Node.js thread during startup. On ARM64 hardware (Raspberry Pi), loading a 314 MB GGUF model blocks the event loop for approximately 6 minutes, during which the Gateway is completely unreachable — even though it reports active (running) in systemd.

Environment

  • OS: Raspberry Pi, Linux 6.12.75+rpt-rpi-v8, arm64/aarch64
  • Node.js: v24.14.1
  • OpenClaw: 2026.4.29 (a448042)
  • memorySearch config: provider: "local", model: hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf (314 MB)
  • Gateway mode: systemd user service

Observed behavior

The Gateway systemd service starts, but the WebSocket port is unusable for ~6 minutes:

14:58:07  systemd: Started openclaw-gateway.service
14:58:22  [gateway] loading configuration…        ← 15s just to load config
14:58:25  [gateway] starting...
             ... silence for ~6 minutes ...
15:04:04  [gateway] startup model warmup timed out after 5000ms; continuing without waiting
15:04:04  [diagnostic] liveness warning:
            reasons=event_loop_delay,event_loop_utilization,cpu
            eventLoopDelayP99Ms=18555.6
            eventLoopDelayMaxMs=18555.6            ← 18.5 second event loop block
            eventLoopUtilization=1
            cpuCoreRatio=1.05
15:04:04  [telegram] [default] starting provider
15:04:04  [gateway] ready

Time from service start to gateway ready: ~6 minutes.
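
For anyone who wants to reproduce numbers like eventLoopDelayP99Ms and eventLoopDelayMaxMs outside the Gateway, here is a minimal sketch using Node's built-in perf_hooks.monitorEventLoopDelay. Whether the Gateway's own diagnostics use this API is not stated in the logs above; blockFor and sampleLoopDelay are hypothetical helper names, and the busy loop only stands in for a blocking model load.

```typescript
import { monitorEventLoopDelay } from "node:perf_hooks";

// Stand-in for synchronous work such as a blocking model load (assumption).
function blockFor(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    /* busy wait: nothing else on this thread can run */
  }
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Samples event loop delay around a synchronous stall of `blockMs`.
async function sampleLoopDelay(
  blockMs: number,
): Promise<{ p99Ms: number; maxMs: number }> {
  const histogram = monitorEventLoopDelay({ resolution: 10 });
  histogram.enable();
  await sleep(50); // let the sampler take a few baseline readings
  blockFor(blockMs);
  await sleep(50); // let the sampler observe the stall
  histogram.disable();
  // The histogram reports nanoseconds; convert to milliseconds.
  return {
    p99Ms: histogram.percentile(99) / 1e6,
    maxMs: histogram.max / 1e6,
  };
}
```

Calling sampleLoopDelay(300) should report a maximum delay in the neighborhood of the stall length, which is the same kind of signal as the 18.5 s figure in the liveness warning.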

During this window:

  • openclaw status reports gateway as "unreachable (timeout)"
  • openclaw tui fails with "gateway not reachable"
  • openclaw logs --follow fails immediately
  • Telegram receives no responses
  • All WebSocket connections time out

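The "startup model warmup timed out after 5000ms" message confirms that the Gateway has a warmup timeout, but model initialization keeps blocking the main thread regardless. The message itself was only logged at 15:04:04, more than five minutes after "[gateway] starting..." at 14:58:25, which is consistent with the timeout callback being unable to run while the event loop was blocked.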
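
To illustrate why a timer-based timeout cannot interrupt synchronous work on the same thread, here is a small self-contained sketch. It is not Gateway code: measureTimerLateness is a hypothetical helper, and the busy loop stands in for a blocking model load.

```typescript
// Resolves with how many milliseconds late a timer fired when synchronous
// work occupied the main thread right after the timer was scheduled.
function measureTimerLateness(
  timeoutMs: number,
  blockMs: number,
): Promise<number> {
  return new Promise<number>((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => {
      // Runs only once the event loop is free again.
      resolve(Date.now() - scheduledAt - timeoutMs);
    }, timeoutMs);

    // Synchronous busy loop (assumed stand-in for the model load).
    // The timer callback above cannot fire until this loop returns.
    const end = Date.now() + blockMs;
    while (Date.now() < end) {
      /* blocking */
    }
  });
}
```

With a 50 ms timeout and a 300 ms block, the callback fires roughly 250 ms late. Scaled up, a 5000 ms timeout scheduled before a multi-minute blocking load would only get to run after the load finishes.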

Additional context

  • CPU usage holds at ~40–60% throughout the warmup window (confirmed via ps)
  • RSS reaches ~700 MB on a Pi with 4 GB RAM during warmup
  • Running lsof on the gateway PID shows that fd 1 and fd 2 both point to a socket (the systemd journal), not a file, which rules out log-file I/O as the cause
  • Manually running import('...node_modules/json5/...') from the dist directory succeeds instantly, which indicates that the earlier ERR_MODULE_NOT_FOUND errors during restarts were a race-condition symptom of the blocked warmup, not a missing package

Workaround

Switch memorySearch.provider to "openai" and set remote.apiKey. This eliminates the local model load entirely, but it is not viable for offline or air-gapped setups.

Suggested fix direction

  1. Run GGUF model initialization in a Worker thread (Node.js worker_threads) so the main event loop stays free
  2. Or: defer model warmup to the first actual embedding request (lazy init), with a non-blocking queue for requests that arrive during warmup
  3. Note: the existing "startup model warmup timed out after 5000ms" mechanism already acknowledges the issue, but the timeout only stops waiting for the warmup; it does not stop the load from blocking the main thread
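
Directions 1 and 2 above can be combined. The following is a minimal sketch under stated assumptions, not the Gateway's implementation: ensureModelReady, workerSource and loadMs are hypothetical names, and the busy loop inside the worker stands in for the real node-llama-cpp model load, which is not shown here.

```typescript
import { Worker } from "node:worker_threads";

// Worker body as a plain JavaScript string so the sketch fits in one file.
// The busy loop is an assumed stand-in for a blocking native model load.
const workerSource = `
  const { parentPort, workerData } = require("node:worker_threads");
  const end = Date.now() + workerData.loadMs;
  while (Date.now() < end) { /* simulate synchronous load */ }
  parentPort.postMessage({ ready: true });
`;

let warmup: Promise<void> | undefined;

// Lazy, memoized init: the first caller starts the load in a worker,
// and every later caller awaits the same promise instead of reloading.
function ensureModelReady(loadMs = 300): Promise<void> {
  if (!warmup) {
    warmup = new Promise<void>((resolve, reject) => {
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { loadMs },
      });
      worker.once("message", () => resolve());
      worker.once("error", reject);
    });
  }
  return warmup;
}
```

Because the blocking loop runs on the worker's thread, timers and socket handlers on the main thread keep running while the load is in progress, and requests that arrive during warmup simply await the shared promise. In a real design the worker would presumably stay alive and serve embedding requests over messages, since a loaded native model cannot be handed back to the main thread as a plain value.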

Impact

Any user running memorySearch.provider: "local" on ARM64/Pi (or slow x86) will experience this. Every Gateway restart (e.g. after openclaw update) causes a 6-minute outage window.
