
Windows: gateway restart does not wait for active tasks and loses session state #56284

@zuowencheng

GitHub Issue: Windows gateway restart does not wait for active tasks

Problem

On Windows (with the Gateway installed as a Scheduled Task), openclaw gateway restart returns success immediately without performing a clean restart of the Gateway. More critically, it forcefully terminates running tasks instead of waiting for them to complete, and sessions are not recovered after the restart.

Current Behavior

  1. Run openclaw gateway restart
  2. Command returns immediately with "success"
  3. Gateway process may still be running or restarting
  4. Any active tasks are killed abruptly
  5. Sessions are lost, no recovery after restart
  6. User must manually stop + start to get a clean restart

Expected Behavior

  1. gateway restart should use the existing graceful restart mechanism (deferGatewayRestartUntilIdle)
  2. Wait for active tasks to complete (or timeout)
  3. Write restart sentinel for session recovery
  4. Only return after Gateway is healthy and sessions are restored
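
To illustrate item 2 above, here is a minimal sketch of "wait for active tasks or time out". It is not OpenClaw's deferGatewayRestartUntilIdle(). The names waitUntilIdle and getActiveTaskCount are hypothetical, and the sketch assumes the caller can supply a function that reports how many tasks are still running.

```javascript
// Hypothetical helper, not part of OpenClaw.
// getActiveTaskCount is assumed to be a synchronous function returning
// the number of tasks still running.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function waitUntilIdle(getActiveTaskCount, { timeoutMs = 300000, pollMs = 250 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const active = getActiveTaskCount();
    // Idle: safe to proceed with the restart.
    if (active === 0) return { idle: true, remaining: 0 };
    // Timed out: report how many tasks would be interrupted.
    if (Date.now() >= deadline) return { idle: false, remaining: active };
    // Never sleep past the deadline.
    await sleep(Math.min(pollMs, Math.max(0, deadline - Date.now())));
  }
}
```

Returning the remaining task count on timeout lets the caller decide whether to abort the restart or proceed and log what was interrupted.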

Technical Details

  • OpenClaw version: 2026.3.24
  • Platform: Windows 11 (win32)
  • Installation: Windows Scheduled Task
  • Gateway functions that exist but are not wired into the Windows restart path:
    • deferGatewayRestartUntilIdle()
    • writeRestartSentinel()
    • consumeRestartSentinel()
    • scheduleGatewaySigusr1Restart() (Unix only)

Key finding: These functions are implemented in pi-embedded-BaSvmUpW.js but gateway restart on Windows does not call them.
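
For readers unfamiliar with the sentinel pattern, the sketch below shows the general idea: write a small file before stopping, then read and delete it on the next start so recovery runs once. The file format, location, and the names writeSentinel and consumeSentinel are assumptions made for illustration. The real writeRestartSentinel() and consumeRestartSentinel() may work differently.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical sentinel helpers, not OpenClaw's implementation.
function writeSentinel(file, payload) {
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(payload));
  // Write to a temp file first, then rename, so a reader does not
  // pick up a partially written sentinel.
  fs.renameSync(tmp, file);
}

function consumeSentinel(file) {
  let raw;
  try {
    raw = fs.readFileSync(file, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return null; // no restart was pending
    throw err;
  }
  // Delete before parsing so the sentinel is consumed at most once.
  fs.unlinkSync(file);
  return JSON.parse(raw);
}

// Usage with an example path in the OS temp directory.
const file = path.join(os.tmpdir(), 'restart-sentinel-example.json');
writeSentinel(file, { reason: 'manual restart', timestamp: new Date().toISOString() });
console.log(consumeSentinel(file)); // the payload written above
console.log(consumeSentinel(file)); // null, because the file was removed
```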

Reproduction Steps

  1. Start a long-running task (e.g., a cron job or interactive session)
  2. Execute openclaw gateway restart
  3. Observe: command returns immediately
  4. Check task status: it is aborted, not completed
  5. Check session store: no recovery, task lost

Workaround

Currently must use:

openclaw gateway stop
# Wait manually
openclaw gateway start

But even this does not trigger session recovery, because the restart sentinel is never written.

Proposed Fix

Modify runDaemonRestart() in daemon-cli-BgoyP3Ke.js (or lifecycle-core-gBCZgGHS.js) to implement graceful restart for Windows:

async function runDaemonRestart(opts = {}) {
  // ... existing code ...

  if (process.platform === 'win32') {
    // Windows graceful restart
    // `service` is assumed to be resolved by the existing code above.
    const gatewayPort = await resolveGatewayLifecyclePort(service);

    // 1. Wait for active tasks to complete (or timeout)
    await deferGatewayRestartUntilIdle({
      timeoutMs: 300000 // 5 minutes
    });

    // 2. Write restart sentinel for session recovery
    await writeRestartSentinel({
      reason: 'manual restart',
      timestamp: new Date().toISOString()
    });

    // 3. Graceful stop
    await runServiceStop({ graceful: true });

    // 4. Start new instance
    await runServiceStart();

    // 5. Wait for health check
    const health = await waitForGatewayHealthyRestart({
      port: gatewayPort,
      attempts: 20,
      delayMs: 500
    });

    if (!health.success) {
      throw new Error(`Gateway restart failed: ${health.message}`);
    }

    return { outcome: 'restarted', message: 'Gateway restarted gracefully with session recovery' };
  } else {
    // Unix: send SIGUSR1 (existing behavior)
    return await scheduleGatewaySigusr1Restart({ reason: '/restart' });
  }
}
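
Step 5 above relies on a bounded health check. As a rough sketch of what such polling could look like (this is not the actual waitForGatewayHealthyRestart), the hypothetical waitForHealthy below takes any async probe function and retries it. With attempts: 20 and delayMs: 500 it waits at most 19 delays, about 9.5 seconds plus probe time. The return shape mirrors the success and message fields used in the proposal, which is an assumption about the real helper.

```javascript
// Hypothetical poller, not part of OpenClaw.
// `probe` is any async function that resolves to true once the
// gateway is healthy; it may also throw while the gateway is down.
async function waitForHealthy(probe, { attempts = 20, delayMs = 500 } = {}) {
  let lastError = null;
  for (let i = 1; i <= attempts; i++) {
    try {
      if (await probe()) return { success: true, attempts: i };
    } catch (err) {
      lastError = err; // remember the most recent failure for the message
    }
    // Only delay between attempts, not after the final one.
    if (i < attempts) await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  return {
    success: false,
    attempts,
    message: lastError ? String(lastError.message) : `not healthy after ${attempts} attempts`,
  };
}
```

Passing the probe in as a function keeps the retry logic independent of how health is actually checked (HTTP request, port check, or anything else).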

Related Files

  • daemon-cli-BgoyP3Ke.js - Gateway CLI commands
  • lifecycle-core-gBCZgGHS.js - Service lifecycle (restart, stop, start)
  • pi-embedded-BaSvmUpW.js - Contains the graceful restart functions
  • status-D8mZfs6u.js - Health check utilities (waitForGatewayHealthyRestart)

Additional Context

This issue is critical for users who run automated tasks via cron or have long-running agent sessions. Forcing abrupt termination leads to:

  • Lost task results
  • Incomplete workflows
  • Poor user experience

OpenClaw already has all the necessary infrastructure for graceful restart and session recovery; it just needs to be wired up for the Windows service manager path.


Documentation: See internal analysis at .proactivity/gateway-graceful-restart-best-practices.md

Tested on: OpenClaw 2026.3.24, Windows 11, Scheduled Task installation


Acceptable Solutions

  • Implement graceful restart for Windows as described above
  • Alternatively, expose deferGatewayRestartUntilIdle to the CLI as a --graceful flag
  • Ensure sentinel is written for all restart scenarios
  • Add tests for Windows restart behavior

Priority: High (affects core reliability)

Difficulty: Medium (functions already exist, just need to connect them)

Contributor: @YOUR_GITHUB_USERNAME (contact via OpenClaw workspace)
