Feishu/Lark WebSocket drops lead to Zombie Gateway Process without auto-recovery

### Description
When using the Feishu integration, if the underlying WebSocket connection hits a `keepalive ping timeout`, the SDK's receive message loop exits, but the main Hermes agent process neither terminates nor reconnects. This leaves the Gateway in a zombie state: it still appears "running" to service managers such as systemd, but it no longer receives any messages.

### Logs
```text
[Lark] [ERROR] receive message loop exit, err: sent 1011 (internal error) keepalive ping timeout; no close frame received
[Lark] [WARNING] ping failed, err: sent 1011 (internal error) keepalive ping timeout
```

### Expected Behavior (Crash-Only Architecture)
If the Feishu WebSocket loop drops permanently and cannot reconnect on its own, the `feishu.py` integration thread should raise `SystemExit(1)` or propagate the exception to the parent thread so that the whole process exits with a non-zero status. A service manager such as systemd (with `Restart=always`) can then respawn a healthy agent stack.
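
A minimal sketch of the crash-only behavior described above, not the actual `feishu.py` code. The names `run_crash_only`, `loop`, and `exit_fn` are hypothetical; `loop` stands in for whatever blocking call runs the SDK's receive loop. One caveat worth noting: in CPython, a `SystemExit` raised inside a non-main thread ends only that thread, so this sketch calls `os._exit` to make sure the whole process stops.

```python
import logging
import os
import threading
from typing import Callable

log = logging.getLogger("feishu.supervisor")  # hypothetical logger name


def run_crash_only(
    loop: Callable[[], None],
    exit_fn: Callable[[int], None] = os._exit,
    exit_code: int = 1,
) -> threading.Thread:
    """Run `loop` in a daemon thread and end the process when it stops.

    `loop` is assumed to block for as long as the connection is healthy,
    so both a normal return and an exception are treated as fatal.
    """

    def supervised() -> None:
        try:
            loop()
            log.error("receive loop returned unexpectedly")
        except BaseException as exc:
            log.error("receive loop raised: %r", exc)
        finally:
            # os._exit terminates the process from any thread, which lets a
            # service manager configured with Restart=always respawn it.
            exit_fn(exit_code)

    thread = threading.Thread(target=supervised, name="feishu-ws", daemon=True)
    thread.start()
    return thread
```

Because `os._exit` skips `atexit` handlers and buffered-output flushing, an alternative under the same assumptions is to signal the main thread and let it shut down in an orderly way. The `exit_fn` parameter exists so that either strategy (or a test double) can be substituted.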

### Environment
- OS: Ubuntu 24.04 via WSL2
- Deploy type: systemd service
- Provider: feishu

Feishu/Lark WebSocket drops lead to Zombie Gateway Process without auto-recovery #10616
