Skip to content

Plugin runtime-deps install loop in 2026.4.24 — gateway event loop starved, Telegram polling stalls #71818

@TARSDrakon

Description

@TARSDrakon

Summary

On OpenClaw 2026.4.24 the gateway enters an infinite npm install loop for plugin runtime-deps. The deps install reports success, but the runtime check on the next cycle still sees them as "missing" and reinstalls. This starves the gateway event loop, causing intermittent gateway HTTP/WS timeouts and Telegram getUpdates polling stalls.

Downgrading to 2026.4.21 fully resolves it — single clean gateway process, deps install once, Telegram channel comes up immediately.

Symptoms (from a real production host)

Two hosts on 2026.4.24, both observed today:

  • Gateway process alive but ws://127.0.0.1:18789 health check fails with gateway timeout after 10000ms.
  • External TLS to gateway port hangs (curl exit 35 / status 000).
  • Telegram log spams: [telegram] Polling stall detected (active getUpdates stuck for 183.83s); forcing restart every ~3 minutes.
  • ps -ef shows the gateway constantly spawning fresh npm install children for the full plugin dep set — every ~5 minutes a new install starts before the previous one's effects can be observed.

Root cause indicators

The gateway can't determine its own version. Log entries show hostname:"unknown" in the metadata. As a result, the runtime-deps directory uses two separate cache keys for the same OpenClaw install:

~/.openclaw/plugin-runtime-deps/
├── openclaw-2026.4.24-1000b8d8d749/   345M  used by main runtime  
├── openclaw-unknown-1000b8d8d749/     612K  stale orphan
└── openclaw-unknown-c1d01dbe639a/     132M  used by Telegram plugin

Inside each child dir, npm install --ignore-scripts <deps> completes successfully (installed bundled runtime deps in 4500ms) but no node_modules/ ever appears — only the OpenClaw dist/ bundle. Next runtime check sees no node_modules/grammy, declares deps missing, reinstalls. Loop never converges.

There's also a .openclaw-runtime-deps.lock/ (a directory, not a file) sitting in the 2026.4.24 cache child — looks like a stuck lock that may be contributing.

Reproduction

  1. Be on 2026.4.24 with at least one bundled-dep plugin enabled (Telegram is the obvious one).
  2. Watch ~/.openclaw/plugin-runtime-deps/ and tail -f /tmp/openclaw/openclaw-*.log.
  3. Within minutes you'll see the [plugins] <name> staging bundled runtime deps … installed bundled runtime deps pair repeating at 5-minute intervals, with no node_modules/ ever produced.

Workaround

rm -rf ~/.openclaw/plugin-runtime-deps/ + restart gateway gives ONE clean cycle (gateway comes up ready for ~4 minutes), then the loop returns. The wipe is not a fix — the bug is in the runtime-deps install path itself.

Actual fix: downgrade to 2026.4.21:

npm install -g openclaw@2026.4.21
openclaw gateway restart

Verified on the affected host: deps install once cleanly, single gateway pid, Telegram channel starting provider (@…_bot) within seconds, no polling stalls.

Environment

Related

Suggested fixes

  1. Make node_modules/ presence the source of truth for "deps installed" rather than just exit-code of npm install. The current install completes successfully but produces nothing usable.
  2. Investigate why hostname resolves to "unknown" in the gateway version detector — that's the upstream cause of the cache-key fragmentation.
  3. Convert .openclaw-runtime-deps.lock from a directory to a file (current lock-as-directory pattern can wedge if not cleaned).
  4. Add an exponential backoff or fail-fast on the dep install path so a broken install can't silently spawn every 5 minutes forever.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions