fix(windows): handle OSError from os.kill() for non-existent PIDs #12363
Open
octo-patch wants to merge 1 commit into
NousResearch#12359)

On Windows, os.kill(pid, 0) raises OSError (WinError 87 / errno 22) for non-existent PIDs rather than ProcessLookupError as on POSIX. The existing liveness checks only caught (ProcessLookupError, PermissionError), causing a crash when a stale gateway.pid references a dead PID.

Add OSError to all four affected exception handlers:
- gateway/status.py: get_running_pid() and acquire_scoped_lock()
- tools/process_registry.py: _is_host_pid_alive()
- gateway/run.py: --replace wait loop
Fixes #12359
Problem
On Windows, `os.kill(pid, 0)` raises `OSError: [WinError 87] ERROR_INVALID_PARAMETER` when the target PID does not exist, unlike POSIX, where it raises `ProcessLookupError`. The existing liveness checks only caught `(ProcessLookupError, PermissionError)`, so a stale `gateway.pid` file left behind after a crash/reboot would cause `hermes gateway run` to crash immediately instead of cleaning up and starting fresh.

The same pattern surfaced in three additional call sites that share the same POSIX-only assumption.
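To illustrate why the old handler misses the Windows error, here is a minimal sketch. It does not call the real `os.kill()`; `fake_windows_kill`, `old_style_check`, and `escapes_old_handler` are hypothetical names invented for this example, and the fake simply raises the errno 22 `OSError` described above.

```python
import errno


def fake_windows_kill(pid: int, sig: int) -> None:
    """Hypothetical stand-in for os.kill() on Windows with a dead PID."""
    # A plain OSError carrying errno 22 (EINVAL); Python does not map
    # EINVAL to ProcessLookupError or any other OSError subclass.
    raise OSError(errno.EINVAL, "The parameter is incorrect")


def old_style_check(pid: int, kill=fake_windows_kill) -> bool:
    """Liveness check that only expects the POSIX exception types."""
    try:
        kill(pid, 0)
    except (ProcessLookupError, PermissionError):
        return False
    return True


def escapes_old_handler(pid: int) -> bool:
    """Return True if the simulated Windows error propagates uncaught."""
    try:
        old_style_check(pid)
    except OSError as exc:
        return type(exc) is OSError and exc.errno == errno.EINVAL
    return False
```

Because the raised exception is a bare `OSError` rather than one of its subclasses, the narrower tuple lets it propagate to the caller.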
Solution
Add `OSError` to the exception tuple at all four affected `os.kill(pid, 0)` liveness checks, treating it identically to `ProcessLookupError` (i.e. "process is gone, clean up"):
- `gateway/status.py`: `get_running_pid()` (line 578) and `acquire_scoped_lock()` (line 343)
- `tools/process_registry.py`: `_is_host_pid_alive()` (line 258)
- `gateway/run.py`: `--replace` wait loop (line 10364)

Each change is a one-line addition. Since `ProcessLookupError` is already a subclass of `OSError`, existing POSIX behavior is unchanged; only the Windows code path is affected.
Testing
Verified locally on Windows 11 (Python 3.11.15, Hermes v0.10.0) with a stale `gateway.pid`; the gateway now starts cleanly after applying these changes.