Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
On Windows, the in-process self-restart path (triggerOpenClawRestart → relaunchGatewayScheduledTask) fails to kill the old gateway process before launching the new one. The new gateway instance cannot bind port 18789, producing an infinite retry loop:
[gateway] already running under schtasks; waiting 5000ms before retrying startup
The root cause is that findGatewayPidsOnPortSync() in src/infra/restart-stale-pids.ts returns [] immediately on win32, so cleanStaleGatewayProcessesSync() never finds or terminates stale gateway processes.
Note: openclaw daemon restart is unaffected because it uses a separate code path (restartScheduledTask() → terminateScheduledTaskGatewayListeners()) that correctly uses the Windows-aware findVerifiedGatewayListenerPidsOnPortSync().
Steps to reproduce
- Install OpenClaw on Windows with the schtasks-based daemon supervisor.
- Start the gateway normally (openclaw daemon start).
- Trigger an in-process self-restart (e.g., a config change that fires triggerOpenClawRestart, or a SIGUSR1-equivalent restart).
- Observe the new gateway instance failing to start, retrying in a loop every 5 seconds.
Expected behavior
The self-restart path should:
- Detect the old gateway process listening on port 18789.
- Kill it using taskkill.exe (graceful /T, then forced /F).
- Wait for the port to be released.
- Launch the new gateway, which binds successfully.
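The kill step in this sequence can be sketched roughly as follows. This is a minimal illustration, not the code from the proposed fix: the names buildTaskkillArgs, terminateWithEscalation, TaskkillRunner, and AliveCheck are hypothetical, and the process runner and liveness check are injected so the escalation logic can be read (and exercised) without spawning anything. Only the taskkill flags /PID, /T, and /F are real.

```typescript
// Hypothetical types and names; none of these are taken from the OpenClaw source.
type TaskkillRunner = (args: string[]) => boolean; // true when taskkill exits with code 0
type AliveCheck = (pid: number) => boolean;

// /PID selects the process, /T includes its child process tree, /F forces termination.
function buildTaskkillArgs(pid: number, force: boolean): string[] {
  const args = ["/PID", String(pid), "/T"];
  if (force) args.push("/F");
  return args;
}

// Try the non-forced kill first; escalate to /F only for PIDs that survive it.
// Returns the PIDs that are still alive after both attempts.
function terminateWithEscalation(
  pids: number[],
  run: TaskkillRunner,
  isAlive: AliveCheck,
): number[] {
  const survivors: number[] = [];
  for (const pid of pids) {
    const gracefulOk = run(buildTaskkillArgs(pid, false));
    if (gracefulOk && !isAlive(pid)) continue;
    run(buildTaskkillArgs(pid, true));
    if (isAlive(pid)) survivors.push(pid);
  }
  return survivors;
}
```

In a real implementation the runner would presumably wrap a synchronous spawn of taskkill.exe, and a short wait between the graceful attempt and the liveness check would likely be needed; both are left out of this sketch.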
Actual behavior
findGatewayPidsOnPortSync() returns [] on Windows (early return, no port inspection), so cleanStaleGatewayProcessesSync() is a no-op. The old gateway keeps running, the new one cannot bind the port, and the schtasks supervisor enters an unbounded 5-second retry loop that never resolves.
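For illustration, the port inspection that the early return skips could look something like the sketch below, which extracts listening PIDs from netstat -ano text. The helper name parseListeningPids is hypothetical, and the sketch assumes English-locale netstat output with the usual five columns (Proto, Local Address, Foreign Address, State, PID). It does not verify that a PID actually belongs to a gateway process, which the proposed fix is described as doing.

```typescript
// Hypothetical helper: collect PIDs that are LISTENING on `port`
// from the text output of `netstat -ano` (English-locale format assumed).
function parseListeningPids(netstatOutput: string, port: number): number[] {
  const pids = new Set<number>();
  const suffix = `:${port}`;
  for (const line of netstatOutput.split(/\r?\n/)) {
    const cols = line.trim().split(/\s+/);
    // Header, UDP, and blank lines do not have exactly five columns here.
    if (cols.length !== 5) continue;
    const [proto = "", local = "", , state = "", pidText = ""] = cols;
    if (proto !== "TCP" || state !== "LISTENING") continue;
    // Match the local address only, so outbound connections to the port are ignored.
    if (!local.endsWith(suffix)) continue;
    const pid = Number(pidText);
    if (Number.isInteger(pid) && pid > 0) pids.add(pid);
  }
  return Array.from(pids);
}
```

The Set removes the duplicate entry that appears when the same process listens on both the IPv4 and IPv6 wildcard addresses.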
OpenClaw version
2026.4.3 (and earlier — the return [] for win32 has been present since the function was introduced)
Operating system
Windows 11
Install method
npm global
Model
N/A — affects all configurations
Provider / routing chain
N/A — affects all configurations
Additional provider/model setup details
No response
Logs, screenshots, and evidence
# Gateway log output during the infinite loop:
[gateway] already running under schtasks; waiting 5000ms before retrying startup
[gateway] already running under schtasks; waiting 5000ms before retrying startup
[gateway] already running under schtasks; waiting 5000ms before retrying startup
...
Impact and severity
- Affected: All Windows users using the schtasks daemon supervisor with config-triggered or SIGUSR1 in-process restarts
- Severity: High. The gateway becomes permanently stuck and requires manual intervention (taskkill or a Task Scheduler restart)
- Frequency: 100% reproducible on any Windows self-restart trigger
- Workaround: Use openclaw daemon restart (which uses a different code path that works correctly)
Additional information
Proposed fix: #60480
The fix:
- Extracts Windows port/process helpers into a shared src/infra/windows-port-pids.ts module with configurable timeoutMs
- Makes findGatewayPidsOnPortSync discover + verify Windows gateway PIDs via PowerShell/netstat
- Adds pollPortOnceWindows with a 400ms budget-compliant timeout for port-free polling
- Adds terminateStaleProcessesWindows using taskkill.exe (graceful /T then forced /F)
- Breaks the circular import between restart-stale-pids.ts and gateway-processes.ts
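To make the "budget-compliant timeout" idea concrete, here is a rough sketch of a port-free wait loop in which each probe is capped both by a per-probe timeout (400ms, matching the figure above) and by whatever remains of the overall budget. Everything here is an assumption for illustration: PollDeps, waitForPortFree, the 50ms pause, and the injected clock, probe, and sleep functions are hypothetical and are not the API of pollPortOnceWindows.

```typescript
// Hypothetical dependencies, injected so the loop has no direct I/O.
interface PollDeps {
  now: () => number; // current time in ms
  probePortBusy: (timeoutMs: number) => boolean; // true while something still holds the port
  sleep: (ms: number) => void; // blocking pause between probes
}

// Poll until the port is free or the overall budget is spent.
// Returns true if the port was observed free, false on budget exhaustion.
function waitForPortFree(
  deps: PollDeps,
  budgetMs: number,
  probeTimeoutMs = 400,
  intervalMs = 50,
): boolean {
  const deadline = deps.now() + budgetMs;
  while (true) {
    const remaining = deadline - deps.now();
    if (remaining <= 0) return false;
    // Never let a single probe run past the overall deadline.
    if (!deps.probePortBusy(Math.min(probeTimeoutMs, remaining))) return true;
    const pause = Math.min(intervalMs, deadline - deps.now());
    if (pause > 0) deps.sleep(pause);
  }
}
```

Capping each probe by the remaining budget keeps the total wait bounded even when an individual probe is slow.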