
Gateway CPU spin causes Telegram replies to stall and status probe to time out #72338

@midhunmonachan

Summary

The OpenClaw gateway can enter a high-CPU state where Telegram polling still appears to ingest updates, but the agent reply pipeline stalls and openclaw channels status --deep times out against the local loopback gateway. Restarting openclaw-gateway.service temporarily restores Telegram replies, but the gateway can return to high CPU shortly after restart.

I am intentionally omitting private identifiers, tokens, hostnames, absolute user paths, Telegram chat IDs, and session message contents from this report.

Environment

  • OpenClaw: 2026.4.24 (cbcfdf6)
  • Package: openclaw@2026.4.24 from npm
  • OS: Linux aarch64 / Raspberry Pi kernel family
  • Gateway mode: local loopback systemd user service
  • Service ExecStart: /usr/bin/node .../openclaw/dist/index.js gateway --port 18789
  • Service Node: /usr/bin/node reports v24.14.1
  • Shell Node on PATH: v25.9.0
  • Channels configured: Telegram enabled; WhatsApp present but not configured during the latest status check

Symptoms

  • Telegram bot token and outbound send path work. The gateway had recent successful sendMessage events before the stall.
  • Incoming Telegram messages were written to the Telegram session, but no assistant response was appended afterward.
  • Shortly after restart, the gateway process was observed using roughly one full core (about 100-120% CPU) and hundreds of MB of RSS.
  • openclaw channels status --deep timed out against ws://127.0.0.1:18789 while the process was still running.
  • Restarting the gateway temporarily made Telegram responsive again.

Sanitized status output

Checking channel status...
Gateway not reachable: gateway timeout after 10000ms
Gateway target: ws://127.0.0.1:18789
Source: local loopback
Config: <home>/.openclaw/openclaw.json
Bind: loopback
Gateway not reachable; showing config-only status.
Mode: local

- Telegram default: enabled, configured, token:<redacted>
- WhatsApp default: enabled, not configured

Possibly related trigger

Before one stall, an interactive channel login command was launched through the agent/gateway path:

openclaw channels login --channel whatsapp --verbose

That child process remained running under the gateway and consumed significant CPU. Killing that child and restarting openclaw-gateway.service restored Telegram responses. However, after a clean restart the gateway became CPU-heavy again and openclaw channels status --deep timed out against the local gateway once more, so the cause may be one of:

  1. an interactive channel command leaving the gateway/session runtime in a bad state,
  2. a broader gateway event-loop CPU spin, or
  3. a recovery problem when replaying/resuming channel sessions after a stuck interactive run.
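
For hypothesis 1, one possible mitigation is to run interactive channel commands as bounded child processes, so a stuck login is killed instead of lingering under the gateway. The following is a minimal sketch of that idea, not OpenClaw's actual implementation. The function name runBounded, the BoundedResult shape, and any timeout value are hypothetical; only Node's child_process.spawn is a real API here.

```typescript
import { spawn } from "node:child_process";

// Hypothetical result shape for this sketch.
interface BoundedResult {
  code: number | null;
  signal: NodeJS.Signals | null;
  timedOut: boolean;
}

// Runs a command with no attached stdio and kills it if it outlives timeoutMs.
// With stdio "ignore" the child has no terminal to wait on, so an interactive
// prompt cannot hold the parent's pipes open.
export function runBounded(
  command: string,
  args: string[],
  timeoutMs: number,
): Promise<BoundedResult> {
  return new Promise((resolve, reject) => {
    let timedOut = false;
    const child = spawn(command, args, { stdio: "ignore" });

    // Hard deadline: SIGKILL cannot be caught or ignored by the child.
    const timer = setTimeout(() => {
      timedOut = true;
      child.kill("SIGKILL");
    }, timeoutMs);

    // Emitted when the process could not be spawned at all.
    child.once("error", (err) => {
      clearTimeout(timer);
      reject(err);
    });

    // Emitted once the child has ended, normally or by signal.
    child.once("exit", (code, signal) => {
      clearTimeout(timer);
      resolve({ code, signal, timedOut });
    });
  });
}
```

A caller could wrap the login command from this report as runBounded("openclaw", ["channels", "login", "--channel", "whatsapp", "--verbose"], 120000), where the two-minute limit is an arbitrary example. This only bounds the child. It would not help if the gateway's own event loop is spinning, because the timer callback runs on that same loop.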

Stability dumps

Multiple stability files with reason: unhandled_rejection were generated around the same period. The visible snapshot in these dumps did not contain a useful stack; it only showed metadata such as the process platform, arch, and Node version, plus an empty event snapshot.

Example sanitized fields:

{
  "version": 1,
  "reason": "unhandled_rejection",
  "process": {
    "platform": "linux",
    "arch": "arm64",
    "node": "25.9.0"
  },
  "snapshot": {
    "count": 0,
    "events": []
  }
}
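
To illustrate what a more useful dump could carry, here is a small sketch that extracts the error name, message, and stack frames from a rejection reason. It is an assumption about what would help, not a description of how OpenClaw writes stability files. describeRejection, installRejectionLogger, and the "error" field are hypothetical names; process.on("unhandledRejection") is the standard Node event.

```typescript
// Hypothetical shape for the extra detail a dump could include.
interface RejectionDetail {
  name: string;
  message: string;
  stack: string[];
}

// Converts an arbitrary rejection reason into a serializable record.
// Promises can reject with non-Error values, so both cases are handled.
export function describeRejection(reason: unknown, maxFrames = 20): RejectionDetail {
  if (reason instanceof Error) {
    const frames = (reason.stack ?? "")
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.startsWith("at ")); // keep only stack frames
    return {
      name: reason.name,
      message: reason.message,
      stack: frames.slice(0, maxFrames),
    };
  }
  return { name: typeof reason, message: String(reason), stack: [] };
}

// Registers a handler that logs the detail as one JSON line.
// A real implementation would presumably merge it into the dump file instead.
export function installRejectionLogger(): void {
  process.on("unhandledRejection", (reason) => {
    const detail = describeRejection(reason);
    console.error(JSON.stringify({ reason: "unhandled_rejection", error: detail }));
  });
}
```

Stack frames contain absolute file paths, so a dump written this way would need the same kind of redaction applied to this report before it is shared.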

Expected behavior

The gateway should remain responsive to local status probes and should continue processing Telegram replies, or it should fail/restart cleanly if a channel runtime or interactive command gets stuck.

Actual behavior

The gateway process remains alive but consumes high CPU, local status probes time out, and Telegram appears silent because received messages do not result in assistant replies.

Suggested diagnostics / fixes

  • Add watchdog/health detection for event-loop stalls where the gateway process is alive but loopback status does not respond.
  • Ensure interactive channel login commands cannot block or starve the shared gateway reply pipeline.
  • Capture useful stack/context in unhandled_rejection stability dumps.
  • Consider isolating channel login subprocesses from the active Telegram agent session/runtime.
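
As a rough sketch of the first suggestion, event-loop delay can be sampled with Node's perf_hooks.monitorEventLoopDelay and compared against a threshold. The names isStalled and startLoopWatchdog, the 1000 ms threshold, and the 5000 ms sampling interval are all assumptions made for illustration, not existing OpenClaw settings.

```typescript
import { monitorEventLoopDelay } from "node:perf_hooks";

// Hypothetical threshold: report a stall when p99 loop delay exceeds this.
const STALL_THRESHOLD_MS = 1000;

// Pure check: the histogram reports nanoseconds, the threshold is in ms.
export function isStalled(p99Nanos: number, thresholdMs: number): boolean {
  return p99Nanos / 1e6 > thresholdMs;
}

// Starts sampling event-loop delay and calls onStall when the p99 delay
// over the last interval exceeds the threshold. Returns a stop function.
export function startLoopWatchdog(
  onStall: (p99Ms: number) => void,
  intervalMs = 5000,
): () => void {
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();

  const timer = setInterval(() => {
    const p99 = histogram.percentile(99);
    if (isStalled(p99, STALL_THRESHOLD_MS)) {
      onStall(p99 / 1e6);
    }
    histogram.reset(); // start a fresh window for the next sample
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for the watchdog

  return () => {
    clearInterval(timer);
    histogram.disable();
  };
}
```

One limitation: the reporting callback runs on the same event loop, so it can only fire once the loop yields. For a hard spin like the one described here, detection probably has to live outside the loop, for example in a worker thread or an external supervisor such as systemd's WatchdogSec= with sd_notify keep-alives.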
