Closed
Labels
- bug: Something isn't working
- stale: Marked as stale due to inactivity
Description
When the gateway process becomes unresponsive (possibly due to network timeouts or hanging async operations), it holds onto port 18789 but stops responding to systemd's stop signals. This causes an infinite restart loop where:
- Old process holds port 18789 (hung but still alive, so the socket stays bound)
- Systemd tries to start new instance
- New instance sees port in use → exits with code 1
- Systemd waits 5s → restart
- Repeat indefinitely (observed 3400+ restarts over ~24h)
Environment
- Moltbot version: latest (installed via git)
- Node: v22.22.0
- OS: Ubuntu 24.04 (Linux 6.8.0-90-generic)
- Systemd service: Type=simple with Restart=always
Logs
Before crash-loop, logs show repeated:
[moltbot] Suppressed AbortError: AbortError: This operation was aborted
at node:internal/deps/undici/undici:14902:13
[telegram] network error: Request to 'getUpdates' timed out after 30 seconds
[tools] cron failed: gateway timeout after 10000ms
Then crash-loop:
Gateway failed to start: gateway already running (pid XXXXX); lock timeout after 5000ms
Port 18789 is already in use.
Workaround
Added to systemd service:
TimeoutStopSec=30
KillMode=mixed
SendSIGKILL=yes
StartLimitIntervalSec=300
StartLimitBurst=10
Suggested Fix
- Gateway should handle SIGTERM gracefully and release port quickly
- Consider adding a health check endpoint
- Maybe use SO_REUSEADDR or implement proper graceful shutdown
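To make the suggestions above concrete, here is a minimal sketch of a graceful shutdown path plus a health check endpoint, using only Node's built-in `http` module. This is not Moltbot's actual gateway code: `createGateway`, `installSignalHandlers`, the `/healthz` path, and the timeout values are all hypothetical names and numbers chosen for illustration. It assumes Node 18.2 or later for `closeIdleConnections()` and `closeAllConnections()`.

```typescript
import * as http from "node:http";

// Hypothetical helper for illustration only; not the real gateway implementation.
export function createGateway() {
  let shuttingDown = false;

  const server = http.createServer((req, res) => {
    if (req.url === "/healthz") {
      // Return 503 once shutdown has begun so a supervisor or probe can tell
      // "draining" apart from "healthy".
      res.writeHead(shuttingDown ? 503 : 200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ ok: !shuttingDown }));
      return;
    }
    res.writeHead(404);
    res.end();
  });

  // Stop listening right away (this frees the port for a new instance), let
  // in-flight requests finish for up to graceMs, then drop whatever is left.
  function shutdown(graceMs = 2000): Promise<void> {
    shuttingDown = true;
    return new Promise((resolve) => {
      server.close(() => resolve());
      server.closeIdleConnections();
      const timer = setTimeout(() => server.closeAllConnections(), graceMs);
      timer.unref();
    });
  }

  return { server, shutdown };
}

// Wire SIGTERM/SIGINT to the shutdown path, with a hard deadline so a hung
// async operation cannot keep the process (and the port) alive indefinitely.
export function installSignalHandlers(
  shutdown: () => Promise<void>,
  forceExitMs = 5000,
): void {
  const onSignal = () => {
    const deadline = setTimeout(() => process.exit(1), forceExitMs);
    deadline.unref();
    void shutdown().then(() => process.exit(0));
  };
  process.once("SIGTERM", onSignal);
  process.once("SIGINT", onSignal);
}
```

A caller would do something like `const gw = createGateway(); gw.server.listen(18789); installSignalHandlers(() => gw.shutdown());`. The forced `process.exit(1)` deadline is the part that matters for this bug: if it is shorter than `TimeoutStopSec`, the process should release the port on its own before systemd has to escalate to SIGKILL. Note that this only helps while the event loop is still running; a fully blocked event loop would still need the systemd workaround.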
Impact
- Messages truncated mid-response ("terminated" sent to users)
- Gateway unresponsive for extended periods
- Log files grow very large (2.9MB+ of repeated error messages)