[Bug]: Feishu WebSocket reconnect loop never stops; gateway becomes unresponsive without manual restart #59753

@XIAOke8698

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

Problem Description

When the Feishu WebSocket connection drops (e.g., due to Feishu server maintenance or a network glitch), the Lark SDK enters a **permanent reconnect loop** with exponential backoff. The gateway process stays alive but becomes completely unresponsive: the Feishu bot goes offline and messages can neither be sent nor received.

[info] [ws] unable to connect to the server after trying 890 times ← after ~24 hours

Steps to reproduce

Key observations:

  • The retry counter never resets or caps — it grows indefinitely
  • The reconnect interval grows to ~100s but never gives up
  • Feishu marks the bot as offline after prolonged disconnect
  • The openclaw channels status --probe still reports "works" even while the bot is effectively unreachable
  • Only openclaw gateway restart breaks the loop and restores connectivity

Environment

  • OpenClaw version: 2026.3.13 (via npm)
  • Node.js: v22.22.1
  • OS: Linux (WSL2)
  • Feishu channel mode: WebSocket (long connection)
  • Deployment: local desktop, systemd-managed gateway

Expected behavior

According to the docs, WebSocket reconnection should use "exponential backoff with a max of 3 retries and 30s interval." In practice:

  1. The retry count has no upper limit (counter goes to 890+)
  2. There is no mechanism to force a fresh connection after many failures
  3. No watchdog or health-check triggers recovery without manual intervention
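For comparison, the documented policy quoted above could look roughly like the sketch below. This is only an illustration of "exponential backoff with a bounded retry count": `BackoffPolicy`, `nextRetryDelay`, and the 1s base delay are hypothetical names and values, not OpenClaw or Lark SDK APIs, and reading "30s interval" as a ceiling on each delay is my assumption.

```typescript
// Hypothetical sketch: these names are illustrative, not OpenClaw or Lark SDK APIs.
interface BackoffPolicy {
  baseDelayMs: number; // delay before the first retry (assumed value)
  maxDelayMs: number;  // ceiling for any single delay (30s, per the quoted docs)
  maxRetries: number;  // stop retrying after this many attempts (3, per the quoted docs)
}

const documentedPolicy: BackoffPolicy = {
  baseDelayMs: 1000,
  maxDelayMs: 30000,
  maxRetries: 3,
};

// Returns the delay in ms before retry number `attempt` (1-based),
// or null once the retry budget is exhausted.
function nextRetryDelay(attempt: number, policy: BackoffPolicy): number | null {
  if (attempt > policy.maxRetries) {
    return null; // caller should stop looping and escalate (reset, alert, ...)
  }
  const exponential = policy.baseDelayMs * Math.pow(2, attempt - 1);
  return Math.min(exponential, policy.maxDelayMs);
}
```

With a policy like this the loop ends after the third retry, whereas the observed counter keeps climbing past 890.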

Proposed Fix

Add a watchdog mechanism at the OpenClaw gateway level:

  1. Track consecutive WebSocket reconnection failures per channel
  2. After N failures (e.g., 10–20), automatically trigger a channel re-auth (openclaw channels login) or full WebSocket session reset
  3. Consider adding a config option like channels.feishu.reconnectMaxAttempts to let users tune this behavior
  4. Alternatively, add a periodic health-check that detects "connected but non-functional" state and self-heals
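To make steps 1 to 3 concrete, here is a minimal sketch of what such a watchdog might look like. Everything in it is hypothetical: `ReconnectWatchdog`, `recordFailure`, and the `resetSession` callback are names I made up for illustration, and `maxAttempts` stands in for the proposed `channels.feishu.reconnectMaxAttempts` option, which does not exist today as far as I know.

```typescript
// Hypothetical sketch: none of these names are existing OpenClaw APIs.
class ReconnectWatchdog {
  // Consecutive failed reconnect attempts, tracked per channel name.
  private readonly failures = new Map<string, number>();
  private readonly maxAttempts: number;
  private readonly resetSession: (channel: string) => void;

  constructor(maxAttempts: number, resetSession: (channel: string) => void) {
    this.maxAttempts = maxAttempts;   // e.g. the proposed reconnectMaxAttempts value
    this.resetSession = resetSession; // e.g. tear down and recreate the WebSocket client
  }

  // Call on every failed reconnect attempt. Returns true if a reset was triggered.
  recordFailure(channel: string): boolean {
    const count = (this.failures.get(channel) ?? 0) + 1;
    if (count >= this.maxAttempts) {
      this.failures.set(channel, 0); // start counting again after the reset
      this.resetSession(channel);
      return true;
    }
    this.failures.set(channel, count);
    return false;
  }

  // Call once the connection is established again.
  recordSuccess(channel: string): void {
    this.failures.set(channel, 0);
  }

  consecutiveFailures(channel: string): number {
    return this.failures.get(channel) ?? 0;
  }
}
```

The gateway would call `recordFailure("feishu")` wherever it currently sees the "unable to connect" event and `recordSuccess("feishu")` on a successful connect, so the counter only reflects consecutive failures.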

Actual behavior

  • The retry counter never resets or caps; it grows indefinitely

  • The reconnect interval grows to ~100s but never gives up
  • Feishu marks the bot as offline after prolonged disconnect
  • The openclaw channels status --probe still reports "works" even while the bot is effectively unreachable
  • Only openclaw gateway restart breaks the loop and restores connectivity
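I don't know how the probe currently decides on "works", but a check that also looks at socket state and at how long ago the last inbound frame arrived would catch this case. A rough sketch, where `ChannelHealthInput`, `classifyChannelHealth`, and the staleness threshold are all hypothetical and not part of OpenClaw:

```typescript
// Hypothetical sketch: these types and names are assumptions, not OpenClaw APIs.
interface ChannelHealthInput {
  socketOpen: boolean;     // whether the WebSocket is currently open
  lastInboundAtMs: number; // timestamp of the last frame received (message or pong)
  nowMs: number;           // current time, passed in so the check is easy to test
  staleAfterMs: number;    // how long without inbound traffic counts as unhealthy
}

type ChannelHealth = "ok" | "disconnected" | "stale";

function classifyChannelHealth(input: ChannelHealthInput): ChannelHealth {
  if (!input.socketOpen) {
    return "disconnected"; // still in the reconnect loop
  }
  if (input.nowMs - input.lastInboundAtMs > input.staleAfterMs) {
    return "stale"; // socket looks open but nothing has arrived for too long
  }
  return "ok";
}
```

Anything other than "ok" could then be surfaced by the probe instead of "works", or used to trigger a session reset.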

OpenClaw version

2026.3.13

Operating system

Ubuntu (WSL2)

Install method

npm

Model

minimax 2.7

Provider / routing chain

openclaw -minimax-feishu

Additional provider/model setup details

No response

Logs, screenshots, and evidence

655:{"0":"[info]: [ 'ws', 'unable to connect to the server after trying 1 times")' ]","_meta":{"runtime":"node","runtimeVersion":"22.22.1","hostname":"unknown","name":"openclaw","date":"2026-04-01T13:41:38.156Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/lyd/.npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852:46","fileName":"subsystem-D2xHvZZd.js","fileNameWithLine":"subsystem-D2xHvZZd.js:852","fileColumn":"46","fileLine":"852","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852","method":"console.info"}},"time":"2026-04-01T21:41:38.156+08:00"}
1659:{"0":"[info]: [ 'ws', 'unable to connect to the server after trying 2 times")' ]","_meta":{"runtime":"node","runtimeVersion":"22.22.1","hostname":"unknown","name":"openclaw","date":"2026-04-01T13:43:18.696Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/lyd/.npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852:46","fileName":"subsystem-D2xHvZZd.js","fileNameWithLine":"subsystem-D2xHvZZd.js:852","fileColumn":"46","fileLine":"852","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852","method":"console.info"}},"time":"2026-04-01T21:43:18.696+08:00"}
1663:{"0":"[info]: [ 'ws', 'unable to connect to the server after trying 3 times")' ]","_meta":{"runtime":"node","runtimeVersion":"22.22.1","hostname":"unknown","name":"openclaw","date":"2026-04-01T13:44:59.112Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/lyd/.npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852:46","fileName":"subsystem-D2xHvZZd.js","fileNameWithLine":"subsystem-D2xHvZZd.js:852","fileColumn":"46","fileLine":"852","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852","method":"console.info"}},"time":"2026-04-01T21:44:59.112+08:00"}
1667:{"0":"[info]: [ 'ws', 'unable to connect to the server after trying 4 times")' ]","_meta":{"runtime":"node","runtimeVersion":"22.22.1","hostname":"unknown","name":"openclaw","date":"2026-04-01T13:46:39.922Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/lyd/.npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852:46","fileName":"subsystem-D2xHvZZd.js","fileNameWithLine":"subsystem-D2xHvZZd.js:852","fileColumn":"46","fileLine":"852","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852","method":"console.info"}},"time":"2026-04-01T21:46:39.922+08:00"}
1670:{"0":"[info]: [ 'ws', 'unable to connect to the server after trying 5 times")' ]","_meta":{"runtime":"node","runtimeVersion":"22.22.1","hostname":"unknown","name":"openclaw","date":"2026-04-01T13:48:20.262Z","logLevelId":3,"logLevelName":"INFO","path":{"fullFilePath":"file:///home/lyd/.npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852:46","fileName":"subsystem-D2xHvZZd.js","fileNameWithLine":"subsystem-D2xHvZZd.js:852","fileColumn":"46","fileLine":"852","filePath":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js","filePathWithLine":".npm-global/lib/node_modules/openclaw/dist/subsystem-D2xHvZZd.js:852","method":"console.info"}},"time":"2026-04-01T21:48:20.262+08:00"}
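To quantify the retry interval from the excerpt above, the gaps between attempts can be computed from the `_meta.date` fields. The snippet is just a helper for reading the logs (the function name is mine); the timestamps are copied verbatim from the five entries.

```typescript
// Timestamps copied from the "_meta.date" fields of the five log entries above.
const attemptTimes: string[] = [
  "2026-04-01T13:41:38.156Z",
  "2026-04-01T13:43:18.696Z",
  "2026-04-01T13:44:59.112Z",
  "2026-04-01T13:46:39.922Z",
  "2026-04-01T13:48:20.262Z",
];

// Returns the gap in seconds between each pair of consecutive timestamps.
function intervalsInSeconds(isoTimes: string[]): number[] {
  const ms = isoTimes.map((t) => Date.parse(t));
  const gaps: number[] = [];
  for (let i = 1; i < ms.length; i++) {
    gaps.push((ms[i] - ms[i - 1]) / 1000);
  }
  return gaps;
}

console.log(intervalsInSeconds(attemptTimes));
// gaps are roughly 100.5, 100.4, 100.8 and 100.3 seconds
```

So attempts 1 through 5 are already spaced about 100s apart, which matches the ~100s interval noted above.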

Impact and severity

No response

Additional information

No response

Labels: bug (Something isn't working), regression (Behavior that previously worked and now fails)
