[Bug]: Windows Git Bash gateway startup can crash on stale gateway.pid when os.kill(pid, 0) raises WinError 11 #9574

@EralChen

Summary

On Windows Git Bash / MSYS, hermes gateway run can crash during duplicate-instance detection when a stale gateway.pid exists and os.kill(pid, 0) raises a plain OSError / WinError 11 instead of ProcessLookupError.

Repro

  1. On Windows, leave a stale ~/.hermes/gateway.pid behind (for example after an interrupted run).
  2. Start Hermes from Git Bash / MSYS with hermes gateway run.
  3. The startup path calls gateway.status.get_running_pid().

Observed

Startup aborts before the gateway can finish booting. In the reproduced case the traceback ended at gateway/status.py inside get_running_pid() with:

OSError: [WinError 11] An attempt was made to load a program with an incorrect format.

The current code only treats ProcessLookupError and PermissionError as stale/non-running. On this Windows path, a generic OSError also needs to be treated as a stale PID check failure.
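A minimal sketch of the broader exception handling described above, not the actual code in gateway/status.py. The helper name pid_is_alive and its injectable kill parameter are hypothetical, added so the probe can be exercised without signalling a real process. Because ProcessLookupError and PermissionError are both subclasses of OSError, catching OSError alone also covers the WinError 11 case:

```python
import os


def pid_is_alive(pid, kill=os.kill):
    """Return True if the probe succeeds, False if it raises any OSError.

    `kill` defaults to os.kill but can be replaced with a stub for testing.
    """
    try:
        kill(pid, 0)
    except OSError:
        # Covers ProcessLookupError, PermissionError, and a plain OSError
        # such as the WinError 11 seen under Git Bash / MSYS.
        return False
    return True
```

Catching every OSError is the simplest reading of the report; a narrower fix could inspect the exception's winerror attribute on Windows instead.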

Expected

If the PID liveness check fails with this Windows-specific OSError, Hermes should remove the stale pid file and continue startup normally.
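The expected behavior could look roughly like the sketch below. It is an illustration under assumptions, not the real get_running_pid(): the names read_running_pid and is_alive are hypothetical, and the liveness check is passed in as a callable so the sketch does not depend on how the probe is implemented.

```python
from pathlib import Path


def read_running_pid(pid_path, is_alive):
    """Return the PID recorded in pid_path if it is alive, else None.

    A pid file whose process is not alive is treated as stale and removed.
    """
    path = Path(pid_path)
    try:
        pid = int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        # No pid file, or unparsable content: nothing is considered running.
        return None
    if is_alive(pid):
        return pid
    # Stale pid file: delete it so startup can continue normally.
    path.unlink(missing_ok=True)
    return None
```

With this shape, a failed liveness check results in a None return and a cleaned-up pid file rather than an exception during startup.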

Notes

This appears related to existing Windows stale-PID gateway reports, but this repro specifically came from Git Bash / MSYS with WinError 11 during os.kill(pid, 0) in gateway/status.py.

Metadata

Assignees

No one assigned

    Labels

    P2 (Medium: degraded but workaround exists)
    comp/gateway (Gateway runner, session dispatch, delivery)
    sweeper:implemented-on-main (Sweeper: behavior already present on current main)
    type/bug (Something isn't working)
