fix(gateway): retry detached Windows restart watcher without breakaway#42242

Open
worlldz wants to merge 2 commits into NousResearch:main from worlldz:fix-windows-gateway-restart-job-fallback

worlldz commented Jun 8, 2026

Contributor

Fixes #42116

On Windows, the detached gateway restart watcher could fail with WinError 5 / access denied when the parent process was running inside a job object that disallowed CREATE_BREAKAWAY_FROM_JOB.

That made the /restart detached watcher path brittle in restrictive environments: the first detached spawn could fail before the watcher ever got a chance to wait for the old gateway to exit and respawn it.

This patch hardens both layers of the Windows restart chain:

  • the outer watcher spawn now retries without CREATE_BREAKAWAY_FROM_JOB if the initial detached Popen is denied
  • the inlined watcher’s final respawn of hermes gateway restart now uses the same breakaway-first, fallback-without-breakaway pattern

This keeps the preferred breakaway behavior where it is allowed, but no longer gives up when the parent job object rejects it.
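The breakaway-first, fallback-without-breakaway pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from this patch: the helper name `spawn_detached_with_fallback`, the injectable `popen` parameter, and the choice to catch `PermissionError` (which is how Python surfaces WinError 5) are assumptions made for the example.

```python
import subprocess

# Windows process-creation flags. The numeric fallbacks match the Win32 values
# and only exist so this sketch can be imported on non-Windows platforms.
CREATE_BREAKAWAY_FROM_JOB = getattr(subprocess, "CREATE_BREAKAWAY_FROM_JOB", 0x01000000)
DETACHED_PROCESS = getattr(subprocess, "DETACHED_PROCESS", 0x00000008)
CREATE_NEW_PROCESS_GROUP = getattr(subprocess, "CREATE_NEW_PROCESS_GROUP", 0x00000200)


def spawn_detached_with_fallback(cmd, popen=subprocess.Popen, **kwargs):
    """Hypothetical helper: try a detached spawn with breakaway, then without.

    `popen` is injectable so the retry logic can be exercised without
    actually creating a process.
    """
    base_flags = DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP
    try:
        # Preferred path: leave the parent's job object entirely.
        return popen(cmd, creationflags=base_flags | CREATE_BREAKAWAY_FROM_JOB, **kwargs)
    except PermissionError:
        # The job object disallows breakaway (WinError 5): retry while
        # staying inside the job instead of giving up.
        return popen(cmd, creationflags=base_flags, **kwargs)
```

On Windows, a call might look like `spawn_detached_with_fallback(["hermes", "gateway", "restart"], close_fds=True)`. The fallback child stays inside the parent's job object, so it is only useful when that job does not kill its members on close.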

Regression coverage added for:

  • the gateway restart watcher retrying without breakaway on Windows when the outer detached spawn is denied
  • source-level guards that the inlined Windows watcher keeps both the breakaway flag and the no-breakaway fallback
  • existing restart / Windows detach behavior staying intact
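A source-level guard of the kind listed above might look something like the sketch below. `WATCHER_SOURCE` is a hypothetical stand-in for the inlined watcher script, and `check_watcher_source` is an illustrative name; the real tests in this PR may inspect the source differently.

```python
import ast

# Hypothetical stand-in for the inlined watcher script text; the real script
# lives in the gateway code and is not reproduced here.
WATCHER_SOURCE = '''
import subprocess
flags = subprocess.DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP
try:
    subprocess.Popen(cmd, creationflags=flags | subprocess.CREATE_BREAKAWAY_FROM_JOB)
except OSError:
    subprocess.Popen(cmd, creationflags=flags)
'''


def check_watcher_source(source):
    """Return True if the script keeps both the breakaway attempt and a fallback spawn."""
    ast.parse(source)  # the inlined script must at least be valid Python
    has_breakaway = "CREATE_BREAKAWAY_FROM_JOB" in source
    has_fallback = source.count("subprocess.Popen(") >= 2
    return has_breakaway and has_fallback
```

A text-level check like this is coarse, but it can catch a refactor that silently drops either the breakaway flag or the retry from a script that is only ever run as a detached child.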

Validation:
uv run --extra dev pytest tests/gateway/test_restart_drain.py tests/tools/test_windows_native_support.py -q

Result:
82 passed in 6.40s

Validation:
uv run --extra dev pytest tests/gateway/test_update_command.py tests/hermes_cli/test_gateway_service.py -q -k 'resolve_hermes_bin or restart'

Result:
17 passed, 163 deselected in 3.73s

alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), and comp/gateway (Gateway runner, session dispatch, delivery) labels on Jun 8, 2026
Development

Successfully merging this pull request may close these issues.

fix(gateway/run): Windows detached gateway /restart fails in job objects (WinError 5)