Telegram channel stuck not-running with restartPending=true while Bot API probe succeeds; gateway restart kills stale process and recovers #90901

@Tony-ooo

Summary

Telegram stopped responding even though the Telegram Bot API credential probe succeeded. The Gateway itself was reachable and healthy, but the Telegram channel runtime was stuck in running=false / healthState=not-running / restartPending=true.

Running openclaw gateway restart recovered the channel. During restart, OpenClaw reported that it killed one stale gateway process before restarting the LaunchAgent.

This looks different from #90525: there was no stale Telegram direct session with status=running and hasActiveRun=false.

Environment

  • OpenClaw: 2026.6.1 (2e08f0f)
  • Install/runtime: native macOS LaunchAgent-managed Gateway
  • OS/runtime: macOS 15.7.5 arm64
  • Node: v25.4.0
  • Channel: Telegram Bot
  • Telegram mode: polling
  • Gateway bind: local/LaunchAgent-managed setup

Observed behavior

Telegram stopped replying. A read-only channel health check showed that Telegram was configured and its Bot API probe succeeded, but the channel runtime itself was not running:

{
  "configured": true,
  "running": false,
  "lastStopAt": null,
  "lastError": null,
  "probe": {
    "ok": true,
    "elapsedMs": 1973
  },
  "mode": "polling"
}

The account-level status showed the same lifecycle inconsistency:

{
  "accountId": "default",
  "enabled": true,
  "configured": true,
  "running": false,
  "connected": true,
  "restartPending": true,
  "reconnectAttempts": 0,
  "lastError": null,
  "mode": "polling",
  "healthState": "not-running"
}

Gateway health itself was OK and the Telegram plugin was loaded:

{
  "ok": true,
  "plugins": {
    "loaded": [
      "telegram"
    ],
    "errors": []
  },
  "channels": {
    "telegram": {
      "enabled": true,
      "configured": true,
      "running": false,
      "connected": true,
      "restartPending": true,
      "reconnectAttempts": 0,
      "lastError": null,
      "mode": "polling",
      "healthState": "not-running"
    }
  }
}
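
The inconsistency in the output above can be expressed as a single predicate. The following is a minimal sketch, not OpenClaw code: the field names are copied from the JSON shown here, but the `ChannelStatusSnapshot` type and the `isStuckRestartPending` helper are hypothetical names introduced only for illustration.

```typescript
// Assumed shape of the per-channel status; field names come from the JSON
// output in this report, the type itself is hypothetical.
interface ChannelStatusSnapshot {
  enabled: boolean;
  configured: boolean;
  running: boolean;
  restartPending: boolean;
  reconnectAttempts: number;
  lastError: string | null;
}

// Hypothetical helper: true when a restart is pending, nothing is running,
// no reconnect was attempted, and no error explains why.
function isStuckRestartPending(s: ChannelStatusSnapshot): boolean {
  return (
    s.enabled &&
    s.configured &&
    !s.running &&
    s.restartPending &&
    s.reconnectAttempts === 0 &&
    s.lastError === null
  );
}

// Values taken from the status output before the Gateway restart.
const stuckSnapshot: ChannelStatusSnapshot = {
  enabled: true,
  configured: true,
  running: false,
  restartPending: true,
  reconnectAttempts: 0,
  lastError: null,
};

// Values taken from the account-level status after the Gateway restart.
const recoveredSnapshot: ChannelStatusSnapshot = {
  enabled: true,
  configured: true,
  running: true,
  restartPending: false,
  reconnectAttempts: 0,
  lastError: null,
};

console.log(isStuckRestartPending(stuckSnapshot)); // true
console.log(isStuckRestartPending(recoveredSnapshot)); // false
```

A check like this only flags the symptom; it says nothing about whether the stale gateway process is the cause.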

I also checked whether this matched #90525. It did not:

{
  "tasks": {
    "active": 0,
    "byStatus": {
      "queued": 0,
      "running": 0
    }
  },
  "telegramDirectSessions": 1,
  "telegramDirectByStatus": {
    "done": 1
  },
  "staleTelegramDirectRunningNoActiveRun": 0
}

Recovery

Running a Gateway restart recovered the Telegram channel:

[restart] killing 1 stale gateway process(es) before restart
Restarted LaunchAgent: gui/501/ai.openclaw.gateway

After restart:

{
  "configured": true,
  "running": true,
  "lastStopAt": null,
  "lastError": null,
  "probe": {
    "ok": true,
    "elapsedMs": 424
  },
  "mode": "polling"
}

Account-level status also cleared restartPending:

{
  "accountId": "default",
  "enabled": true,
  "configured": true,
  "running": true,
  "restartPending": false,
  "reconnectAttempts": 0,
  "lastError": null,
  "mode": "polling"
}

The user confirmed Telegram recovered normally after this restart.

Expected behavior

If the Telegram Bot API probe succeeds and the plugin is loaded, the Gateway should either:

  • start/restart the Telegram channel runner automatically, or
  • report a concrete actionable error explaining why the channel is not-running.

The channel should not remain indefinitely in restartPending=true with running=false, lastError=null, and reconnectAttempts=0 until a manual Gateway restart clears a stale process.
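
One way to picture the expected behavior is a reconcile step that always ends in one of the two outcomes listed above. This is a sketch under stated assumptions: `restartAttempts`, `maxRestartAttempts`, `ReconcileAction`, and `reconcileAccount` are hypothetical and do not correspond to known OpenClaw internals.

```typescript
// Hypothetical account state; `restartAttempts` is an assumed counter that
// the status output in this report does not expose.
interface AccountLifecycleState {
  running: boolean;
  restartPending: boolean;
  restartAttempts: number;
  lastError: string | null;
}

// The only outcomes the reconcile step may produce.
type ReconcileAction =
  | { kind: "none" }
  | { kind: "restart-runner" }
  | { kind: "mark-failed"; lastError: string };

// Decide what to do with an account so that restartPending cannot persist
// silently: either retry the runner or record an actionable error.
function reconcileAccount(
  state: AccountLifecycleState,
  maxRestartAttempts: number
): ReconcileAction {
  if (state.running || !state.restartPending) {
    return { kind: "none" };
  }
  if (state.restartAttempts < maxRestartAttempts) {
    return { kind: "restart-runner" };
  }
  return {
    kind: "mark-failed",
    lastError:
      "restart still pending after " +
      state.restartAttempts +
      " attempt(s); channel runner did not start",
  };
}

// The state observed in this report: pending restart, nothing attempted.
const observedState: AccountLifecycleState = {
  running: false,
  restartPending: true,
  restartAttempts: 0,
  lastError: null,
};

console.log(reconcileAccount(observedState, 3).kind); // "restart-runner"
```

With a bounded attempt count, the state in this report would resolve to either a running channel or a non-null lastError.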

Actual behavior

The Gateway remained healthy and the Bot API probe succeeded, but the Telegram channel runtime stayed not-running with restartPending=true. Telegram replies did not recover until a manual Gateway restart killed a stale gateway process.

Why this seems like a Gateway/channel lifecycle bug

This does not look like a simple Telegram/network outage:

  • Bot API probe succeeded.
  • lastError stayed null.
  • reconnectAttempts stayed 0.
  • The Telegram plugin was loaded.
  • Manual Gateway restart cleared restartPending and restored service.
  • Restart reported a stale gateway process.

This suggests a Gateway/channel lifecycle or LaunchAgent process-state inconsistency: the channel needed a restart, but the runtime did not converge back to a running state.

Possible fix direction

  • Reconcile channel account lifecycle state on Gateway startup and after channel reload attempts.
  • If an account is enabled/configured and restartPending=true, ensure the runner is actually restarted or mark the restart as failed with an actionable lastError.
  • Detect and surface stale Gateway process/process-manager state in openclaw status, openclaw channels status, or openclaw doctor.
  • Avoid a state where probe.ok=true, running=false, restartPending=true, and lastError=null persists indefinitely.
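
As a rough illustration of the status/doctor idea in the list above, a diagnostic could scan the Gateway health payload and warn about any channel in this state. This is only a sketch: the payload shape mirrors the health JSON in this report, while `GatewayHealthPayload` and `findSilentlyStuckChannels` are hypothetical names, not existing OpenClaw APIs.

```typescript
// Assumed subset of the Gateway health payload shown in this report.
interface ChannelHealthEntry {
  running: boolean;
  restartPending: boolean;
  lastError: string | null;
}

interface GatewayHealthPayload {
  ok: boolean;
  channels: { [channelName: string]: ChannelHealthEntry };
}

// Hypothetical diagnostic: one warning per channel that is waiting for a
// restart, is not running, and has no recorded error.
function findSilentlyStuckChannels(health: GatewayHealthPayload): string[] {
  const warnings: string[] = [];
  for (const channelName of Object.keys(health.channels)) {
    const entry = health.channels[channelName];
    if (!entry.running && entry.restartPending && entry.lastError === null) {
      warnings.push(
        channelName +
          ": restartPending=true but the runner is not running and lastError is null"
      );
    }
  }
  return warnings;
}

// Values taken from the Gateway health output before the restart.
const healthBeforeRestart: GatewayHealthPayload = {
  ok: true,
  channels: {
    telegram: { running: false, restartPending: true, lastError: null },
  },
};

for (const warning of findSilentlyStuckChannels(healthBeforeRestart)) {
  console.log(warning);
}
```

Such a check would make the condition visible even while the top-level health field still reports ok=true.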

Labels

  • P1 (High-priority user-facing bug, regression, or broken workflow.)
  • clawsweeper:needs-live-repro (ClawSweeper needs live local, crabbox, or manual validation to confirm this issue.)
  • clawsweeper:needs-maintainer-review (ClawSweeper marked this issue as needing maintainer review before automation.)
  • clawsweeper:no-new-fix-pr (ClawSweeper does not recommend queueing a new automated fix PR for this issue.)
  • impact:crash-loop (Crash, hang, restart loop, or process-level availability failure.)
  • impact:message-loss (Channel message delivery can be lost, duplicated, or misrouted.)
  • issue-rating: 🐚 platinum hermit (Good issue quality with a plausible reproduction path needing some confirmation.)
