fix(gateway): Gateway shutdown hangs causing 'PID file race lost' on restart

## Bug Description

When restarting the Hermes gateway via `systemctl --user restart hermes-gateway`, the gateway process sometimes hangs during shutdown and is SIGKILL'd by systemd once `TimeoutStopSec` elapses (60s in the logs below). This leaves a stale PID file, causing the new gateway instance to fail with "PID file race lost to another gateway instance. Exiting."

## Steps to Reproduce

1. Configure Hermes gateway with Feishu/Lark platform adapter
2. Run `systemctl --user restart hermes-gateway`
3. If the Feishu WebSocket thread happens to be blocked (e.g., waiting for network I/O), the gateway hangs during shutdown
4. After 60 seconds, systemd sends SIGKILL
5. New instance starts but fails with "PID file race lost" error
6. Gateway enters restart loop until manually fixed

## Root Cause

The shutdown sequence in `gateway/run.py` calls `await adapter.disconnect()` for each platform adapter without a timeout. If any adapter's `disconnect()` method blocks (e.g., Feishu adapter's WebSocket thread waiting for network response), the entire shutdown process hangs.

When systemd sends SIGKILL after the stop timeout, the process is terminated immediately: SIGKILL cannot be caught, so Python's `atexit` handlers never run and the PID file (`~/.hermes/gateway.pid`) is never cleaned up. The new instance sees the stale PID file and exits with "PID file race lost".
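To illustrate what "stale" means here, the following is a minimal sketch of a liveness check on a PID file. It is not taken from `gateway/run.py`; the helper names `pid_is_alive` and `is_stale_pid_file` are hypothetical, and it assumes the PID file contains only the process ID as text. It is POSIX-only, which matches the Linux environment in this report.

```python
import os
from pathlib import Path


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check without sending anything
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def is_stale_pid_file(path: Path) -> bool:
    """Return True if the file exists but does not name a live process.

    Hypothetical helper: assumes the file holds just the PID as text.
    """
    try:
        text = path.read_text().strip()
    except FileNotFoundError:
        return False  # no PID file at all, so nothing is stale
    try:
        pid = int(text)
    except ValueError:
        return True  # unreadable contents cannot refer to a live process
    return not pid_is_alive(pid)
```

A check like this cannot tell whether a live PID has been reused by an unrelated process, so it would complement, not replace, a clean shutdown path.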

## Relevant Logs

```
Apr 23 03:21:57 python[1782979]: WARNING gateway.run: Shutdown diagnostic — other hermes processes running:
Apr 23 03:22:57 systemd[965]: hermes-gateway.service: State 'stop-sigterm' timed out. Killing.
Apr 23 03:22:57 systemd[965]: hermes-gateway.service: Killing process 1782979 (python) with signal SIGKILL.
Apr 23 03:22:57 systemd[965]: hermes-gateway.service: Failed with result 'timeout'.
Apr 23 03:22:58 python[1783144]: ERROR gateway.run: PID file race lost to another gateway instance. Exiting.
```

## Proposed Fix

Add a timeout wrapper around `adapter.disconnect()` in the shutdown sequence:

```python
_adapter_disconnect_timeout = 15.0  # seconds per adapter
for platform, adapter in list(self.adapters.items()):
    try:
        await asyncio.wait_for(adapter.disconnect(), timeout=_adapter_disconnect_timeout)
        logger.info("✓ %s disconnected", platform.value)
    except asyncio.TimeoutError:
        logger.warning(
            "✗ %s disconnect timed out after %.1fs - forcing continue",
            platform.value, _adapter_disconnect_timeout
        )
```

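This bounds each adapter's disconnect at 15 seconds, so the shutdown sequence moves on to PID file cleanup instead of waiting indefinitely. Because the timeout applies per adapter, the combined worst case across all configured adapters still has to stay below `TimeoutStopSec`.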
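The behavior of the timeout wrapper can be checked in isolation. The sketch below is a standalone demo, not code from `gateway/run.py`: `HangingAdapter`, `FastAdapter`, `disconnect_all`, and `run_demo` are hypothetical names, and the adapters are stand-ins that only mimic a hung and a healthy `disconnect()`.

```python
import asyncio
import logging

logger = logging.getLogger("gateway.shutdown_demo")


class HangingAdapter:
    """Hypothetical adapter whose disconnect() never completes on its own."""

    async def disconnect(self) -> None:
        await asyncio.Event().wait()  # waits forever unless cancelled


class FastAdapter:
    """Hypothetical adapter that disconnects immediately."""

    async def disconnect(self) -> None:
        await asyncio.sleep(0)


async def disconnect_all(adapters: dict, timeout: float) -> list:
    """Disconnect each adapter with a per-adapter timeout.

    Returns the names of the adapters that timed out.
    """
    timed_out = []
    for name, adapter in list(adapters.items()):
        try:
            await asyncio.wait_for(adapter.disconnect(), timeout=timeout)
            logger.info("%s disconnected", name)
        except asyncio.TimeoutError:
            logger.warning("%s disconnect timed out after %.1fs", name, timeout)
            timed_out.append(name)
    return timed_out


def run_demo() -> list:
    adapters = {"feishu": HangingAdapter(), "other": FastAdapter()}
    # A short timeout keeps the demo quick; the proposed fix uses 15.0 seconds.
    return asyncio.run(disconnect_all(adapters, timeout=0.1))
```

One caveat: `asyncio.wait_for` works by cancelling the awaited coroutine, so it can only interrupt a `disconnect()` that yields to the event loop. If an adapter blocks the loop synchronously (for example, joining a thread without a timeout), the timeout cannot fire, and the blocking call would need to be moved off the loop or given its own timeout. Whether the Feishu adapter behaves this way is not established in this report.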

## Environment

- Hermes Agent version: latest main branch
- Platform: Feishu/Lark
- OS: Linux (systemd user service)


fix(gateway): Gateway shutdown hangs causing 'PID file race lost' on restart #14128
