[Bug]: gateway_state.json retains stale PID after gateway restart #1631

@nidhi-singh02

Bug Description

After restarting the gateway (e.g. hermes gateway run --replace), gateway_state.json still shows the old
process's PID instead of the new one. The gateway.pid file is correct; only gateway_state.json is
stale.

Steps to Reproduce

  1. Start the gateway: hermes gateway run
  2. Note the PID (e.g. 36459)
  3. Restart: hermes gateway run --replace
  4. Check actual PID: ps aux | grep gateway → shows new PID (e.g. 8820)
  5. Check ~/.hermes/gateway_state.json → still shows old PID 36459
  6. Check ~/.hermes/gateway.pid → correctly shows 8820
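Steps 4 and 5 can also be checked with a short script. This is only a rough sketch: it assumes gateway_state.json is a JSON object with a top-level "pid" key (as described in the root cause analysis below), and recorded_pid() and pid_is_alive() are hypothetical helper names, not part of Hermes. It is POSIX-only, and a dead PID can in principle be reused by an unrelated process, so "alive" does not prove the PID belongs to the gateway.

```python
import json
import os
from pathlib import Path


def recorded_pid(state_path):
    """Return the "pid" value stored in the state file, or None if absent."""
    return json.loads(Path(state_path).read_text()).get("pid")


def pid_is_alive(pid):
    """POSIX-only liveness probe: signal 0 checks existence without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process, so the recorded PID is stale
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


if __name__ == "__main__":
    # Path taken from the reproduction steps above.
    state_file = Path.home() / ".hermes" / "gateway_state.json"
    if state_file.exists():
        pid = recorded_pid(state_file)
        if isinstance(pid, int):
            print(f"recorded pid {pid} alive: {pid_is_alive(pid)}")
```

With the bug present, the recorded PID should be reported as not alive after a --replace restart, since the old process has exited.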

Expected Behavior

gateway_state.json should show the current gateway process PID (8820) after restart.

Actual Behavior

gateway_state.json retains the old PID (36459). gateway.pid is correct.

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp)

Messaging Platform (if gateway-related)

Telegram

Operating System

Debian (aarch64, Raspberry Pi)

Python Version

3.11.15

Hermes Version

v0.2.0 (build 2026.3.12)

Relevant Logs / Traceback

Root Cause Analysis (optional)

Bug is in gateway/status.py lines 195-199, in write_runtime_status():

payload = _read_json_file(path) or _build_runtime_status_record()  # reads OLD file
payload.setdefault("platforms", {})
payload.setdefault("kind", _GATEWAY_KIND)
payload.setdefault("pid", os.getpid())          # <-- BUG: setdefault preserves old PID
payload.setdefault("start_time", _get_process_start_time(os.getpid()))  # <-- same bug

dict.setdefault() assigns a value only when the key is missing. Because the payload is loaded from the old
file first, it already contains pid and start_time from the previous process, so the new values are
never written.
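The difference is easy to see in isolation. A minimal illustration, using the PIDs from the reproduction steps; the two helper functions are hypothetical and exist only to contrast the behaviours:

```python
def refresh_with_setdefault(payload, current_pid):
    """Mimics the current behaviour: keep "pid" if the key already exists."""
    updated = dict(payload)
    updated.setdefault("pid", current_pid)
    return updated


def refresh_with_assignment(payload, current_pid):
    """Always overwrite "pid" with the current process ID."""
    updated = dict(payload)
    updated["pid"] = current_pid
    return updated


# Stand-in for the record read back from the old gateway_state.json.
stale = {"pid": 36459, "platforms": {}}

print(refresh_with_setdefault(stale, 8820)["pid"])   # 36459 (old PID survives)
print(refresh_with_assignment(stale, 8820)["pid"])   # 8820
print(refresh_with_setdefault({}, 8820)["pid"])      # 8820 (only correct on a fresh file)
```

This also explains why the problem only shows up after a restart: on a first start with no existing state file, setdefault() does write the PID.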

Compare with write_pid_file() (line 181), which calls _build_pid_record(). That function always writes a
fresh os.getpid(), which is why gateway.pid is correct.

Proposed Fix (optional)

Change lines 198-199 in gateway/status.py from setdefault to direct assignment:

payload["pid"] = os.getpid()
payload["start_time"] = _get_process_start_time(os.getpid())
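A regression check for the fix could look roughly like the following. write_runtime_status_sketch() is a simplified, hypothetical stand-in for write_runtime_status(), not the real function: it takes the path, PID and start time as parameters, and the "telegram" platform entry is made-up test data.

```python
import json
import os
import tempfile
from pathlib import Path


def write_runtime_status_sketch(path, pid, start_time=None):
    """Simplified stand-in: merge into the existing file, but refresh the PID."""
    try:
        payload = json.loads(path.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    payload.setdefault("platforms", {})   # keeping existing platform data is fine
    payload["pid"] = pid                  # direct assignment, as proposed
    payload["start_time"] = start_time    # direct assignment, as proposed
    path.write_text(json.dumps(payload))
    return payload


with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "gateway_state.json"
    # Simulate the file left behind by the previous gateway process.
    state_path.write_text(
        json.dumps({"pid": 36459, "start_time": 1, "platforms": {"telegram": {}}})
    )
    write_runtime_status_sketch(state_path, os.getpid())
    reloaded = json.loads(state_path.read_text())
    print(reloaded["pid"] == os.getpid())   # True: stale PID replaced
    print("telegram" in reloaded["platforms"])  # True: other state preserved
```

The point of the sketch is that only the per-process fields need direct assignment; setdefault() can stay for keys such as platforms, where keeping existing data is the intent.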

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Labels: type/bug (Something isn't working)