Closed as not planned
Labels
bug (Something isn't working), stale (Marked as stale due to inactivity)
Description
Bug Description
When switching models via the webchat UI (e.g. /model opus), the gateway receives two SIGUSR1 signals in rapid succession (~2 seconds apart), causing a double restart. During this window, the webchat WebSocket disconnects twice and in-flight assistant responses can be silently lost.
Steps to Reproduce
- Open webchat UI
- Switch model (e.g. from sonnet to opus via config.apply)
- Observe gateway logs
Expected Behavior
Single clean restart: config.apply → SIGUSR1 → restart → webchat reconnects → stable.
Actual Behavior
Double restart:
- config.apply at T+0 triggers reload detection → SIGUSR1 #1
- Gateway restarts, webchat disconnects (code=1012 reason=service restart)
- Gateway comes back up at T+1s
- Second SIGUSR1 at T+2s triggers another restart
- Webchat disconnects again, reconnects again
- Any assistant response in flight during this window is lost (appears as if the agent "crashed out" or the message never arrived)
Relevant Log Excerpt
2026-02-04T21:25:32.084Z [ws] config.apply 104ms
2026-02-04T21:25:32.578Z [reload] config change detected; evaluating reload
2026-02-04T21:25:32.620Z [gateway] signal SIGUSR1 received
2026-02-04T21:25:32.624Z [gateway] received SIGUSR1; restarting
2026-02-04T21:25:32.724Z [ws] webchat disconnected code=1012 reason=service restart
2026-02-04T21:25:33.015Z [gateway] agent model: anthropic/claude-opus-4-5
2026-02-04T21:25:33.015Z [gateway] listening on ws://127.0.0.1:18789
2026-02-04T21:25:34.023Z [ws] webchat connected
2026-02-04T21:25:34.153Z [gateway] signal SIGUSR1 received ← SECOND signal
2026-02-04T21:25:34.154Z [gateway] received SIGUSR1; restarting ← SECOND restart
2026-02-04T21:25:35.615Z [ws] webchat disconnected code=1012 reason=service restart
2026-02-04T21:25:35.640Z [gateway] agent model: anthropic/claude-opus-4-5
2026-02-04T21:25:36.449Z [ws] webchat connected ← finally stable
Additional Context
- Also seeing: "Config was last written by a newer OpenClaw (2026.2.1); current version is 0.0.0". This may be a version detection issue contributing to the double-fire.
- The config.apply RPC itself appears to both (a) send SIGUSR1 directly and (b) trigger the file-watcher reload path, which sends a second SIGUSR1. Likely needs deduplication or a debounce window.
- macOS (Darwin arm64), Node v25.5.0, OpenClaw installed via npm
Suggested Fix
Debounce SIGUSR1 handling — if a restart is already in progress or was initiated within the last N seconds, ignore subsequent signals. Alternatively, ensure config.apply only triggers restart through one path (either the direct signal OR the file-watcher, not both).
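For illustration, here is a minimal sketch of the debounce option. It is not taken from the OpenClaw codebase: `createRestartGate`, `tryAcquire`, `release`, and the window length are hypothetical names and values chosen for this example. The gate accepts a restart request only if no restart is in progress and none was accepted within the last `windowMs` milliseconds.

```typescript
// Hypothetical helper, not OpenClaw's actual API. All names are illustrative.
type Clock = () => number;

interface RestartGate {
  /** Returns true if the caller should restart now, false if the request is a duplicate. */
  tryAcquire(): boolean;
  /** Marks the in-progress restart as finished. */
  release(): void;
}

function createRestartGate(windowMs: number, now: Clock = Date.now): RestartGate {
  let inProgress = false;
  // Timestamp of the last accepted request; -Infinity so the first request always passes.
  let lastAcceptedAt = Number.NEGATIVE_INFINITY;

  return {
    tryAcquire(): boolean {
      const t = now();
      // Suppress the request if a restart is running or one was accepted too recently.
      if (inProgress || t - lastAcceptedAt < windowMs) {
        return false;
      }
      inProgress = true;
      lastAcceptedAt = t;
      return true;
    },
    release(): void {
      inProgress = false;
    },
  };
}
```

Assuming the gateway registers its handler with Node's `process.on("SIGUSR1", ...)`, the handler would call `tryAcquire()` first and return early when it yields `false`, then call `release()` once the restart completes. With a window of a few seconds, the second signal in the log above (about 1.5 s after the first) would be ignored. If the restart spawns a fresh process, the gate state would need to live somewhere that survives it, which this sketch does not cover.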