thv restart skips supervisor restart when PID is transiently zero #4429

@gkatz2

Bug description

isSupervisorProcessAlive in pkg/workloads/manager.go checks only the error returned by GetWorkloadPID, not the PID value. When GetWorkloadPID returns (0, nil), which happens when ResetWorkloadPID sets process_id to 0 during a transport restart, the function returns true and falsely reports the supervisor as alive. As a result, both maybeSetupContainerWorkload and maybeSetupRemoteWorkload skip starting a new supervisor.
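
The failure mode can be illustrated in isolation. The following is a minimal sketch, not the actual ToolHive code: the pidReader interface, the resetStatus type, and errOnlyAliveCheck are hypothetical stand-ins, and the real GetWorkloadPID signature may differ. It only shows that a liveness check that looks at the error alone reports "alive" for (0, nil).

```go
package main

import "fmt"

// pidReader is a hypothetical stand-in for whatever component exposes
// GetWorkloadPID; the real signature in the project may differ.
type pidReader interface {
	GetWorkloadPID(name string) (int, error)
}

// resetStatus simulates a status file in which process_id was reset to 0,
// as described for ResetWorkloadPID during a transport restart.
type resetStatus struct{}

func (resetStatus) GetWorkloadPID(string) (int, error) { return 0, nil }

// errOnlyAliveCheck mirrors the reported behavior: the PID value is
// discarded and only the error decides the result.
func errOnlyAliveCheck(r pidReader, name string) bool {
	_, err := r.GetWorkloadPID(name)
	return err == nil
}

func main() {
	// PID 0 with a nil error is reported as a live supervisor.
	fmt.Println(errOnlyAliveCheck(resetStatus{}, "my-server")) // prints "true"
}
```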

Steps to reproduce

  1. Start a remote MCP server with thv run
  2. Trigger a transport restart (e.g., a health check failure that causes a proxy reconnect)
  3. During the 5-60s window where ResetWorkloadPID has set process_id to 0, run thv restart <server>
  4. The restart silently no-ops because isSupervisorProcessAlive returns true

Expected behavior

thv restart should detect that PID 0 is not a valid supervisor process and proceed with the restart.

Actual behavior

thv restart treats PID 0 as a live supervisor and returns without restarting, leaving the workload in a broken state.

Additional context

PR #4401 added pid <= 0 guards to KillProcess and FindProcess in the process package, but isSupervisorProcessAlive was missed because it reads the PID from the status file directly and doesn't call either of those functions.

The fix is a one-line change: capture the PID return value and add || pid <= 0 to the existing error check.
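
A sketch of what the guarded check could look like, under the assumption that the existing code returns false on error and true otherwise. The getPID function type and supervisorAlive are hypothetical names used for illustration; the real function takes different arguments and reads the PID from the status file.

```go
package main

import (
	"errors"
	"fmt"
)

// getPID is a hypothetical stand-in for a call to GetWorkloadPID.
type getPID func() (int, error)

// supervisorAlive sketches the proposed guard: a lookup error or a
// non-positive PID both mean there is no live supervisor.
func supervisorAlive(get getPID) bool {
	pid, err := get()
	if err != nil || pid <= 0 {
		return false
	}
	return true
}

func main() {
	cases := []struct {
		name string
		pid  int
		err  error
	}{
		{"lookup error", 0, errors.New("status not found")},
		{"pid reset to zero", 0, nil},
		{"negative pid", -1, nil},
		{"valid pid", 4242, nil},
	}
	for _, c := range cases {
		alive := supervisorAlive(func() (int, error) { return c.pid, c.err })
		fmt.Printf("%s: alive=%v\n", c.name, alive)
	}
}
```

With this guard, only the "valid pid" case reports alive, so a restart issued while process_id is 0 would proceed to start a new supervisor.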

Labels: bug, cli, go
