[Bug] Telegram adapter auto-pause never auto-recovers after transient DNS failure #35284

## Bug Description

Gateway uses DoH (DNS-over-HTTPS) fallback to resolve `api.telegram.org`, which works well when DoH providers (Google/Cloudflare) are reachable. However, when **both** system DNS and DoH providers are blocked/unreachable, the fallback chain degrades as follows:

1. System DNS (`socket.getaddrinfo`) fails → `getaddrinfo failed` reported in logs
2. DoH to `dns.google` and `cloudflare-dns.com` also fails (network-level block)
3. Falls back to hardcoded seed IP `149.154.167.220`
4. TCP connection to seed IP also fails
5. After 10 consecutive failures, Telegram adapter is **auto-paused** (`gateway/run.py:_PAUSE_AFTER_FAILURES=10`)
6. Paused adapter stops all retry attempts permanently; requires manual `/platform resume telegram` to recover

**The core problem**: When the network recovers (DNS resolves again), the adapter remains permanently paused and never auto-recovers. The user must manually run `/platform resume telegram` or restart the gateway — which is not obvious from the error message.
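The fallback chain listed above (steps 1 to 3) can be sketched roughly as follows. This is only an illustration, not the gateway's actual code: `system_dns`, `resolve_with_fallback`, and the `(name, resolver)` pair convention are hypothetical names, and the DoH resolver is left as an injected callable rather than implemented.

```python
import socket
from typing import Callable, List, Sequence, Tuple

# Seed IP quoted in the logs of this report.
SEED_IPS = ["149.154.167.220"]

Resolver = Callable[[str], List[str]]


def system_dns(host: str) -> List[str]:
    """Resolve via the OS resolver; return [] instead of raising on failure."""
    try:
        infos = socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    except OSError:  # socket.gaierror is a subclass of OSError
        return []
    return sorted({info[4][0] for info in infos})


def resolve_with_fallback(
    host: str,
    resolvers: Sequence[Tuple[str, Resolver]],
    seeds: Sequence[str] = SEED_IPS,
) -> Tuple[str, List[str]]:
    """Try each resolver in order; fall back to the seed IPs if all return nothing."""
    for name, resolver in resolvers:
        ips = resolver(host)
        if ips:
            return name, ips
    return "seed", list(seeds)
```

When every resolver comes back empty, the only candidate left is the seed IP, so a network that also blocks that address produces nothing but failures until the pause threshold is reached.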

## Steps to Reproduce

1. Run `hermes gateway` with Telegram adapter configured
2. Simulate DNS failure (e.g., block port 53, or use a network that has no DNS resolution for `api.telegram.org`)
3. Observe logs:
```
DoH discovery yielded no usable IPs (system DNS: unknown); using seed fallback IPs 149.154.167.220
Primary api.telegram.org connection failed ([Errno 11001] getaddrinfo failed); trying fallback IPs 149.154.167.220
Fallback IP 149.154.167.220 failed: All connection attempts failed
```
4. After 10 attempts: ``telegram paused after 10 consecutive failures (telegram connect timed out after 30s) — fix the underlying issue then run `/platform resume telegram` to retry``
5. When network recovers (DNS resolves again), the adapter stays paused — no auto-recovery
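For a reproduction that does not require touching the real network, step 2 could be approximated in a test by patching `socket.getaddrinfo`. This is a sketch under assumptions: `broken_dns` is a hypothetical helper, and it only simulates the system DNS failure, not the DoH or TCP failures that follow it in the chain.

```python
import socket
from contextlib import contextmanager
from unittest import mock


@contextmanager
def broken_dns(blocked_host: str):
    """Make socket.getaddrinfo fail for one host while the context is active."""
    real = socket.getaddrinfo

    def fake(host, *args, **kwargs):
        if host == blocked_host:
            # Same error number and text as the Windows log line in this report.
            raise socket.gaierror(11001, "getaddrinfo failed")
        return real(host, *args, **kwargs)

    with mock.patch.object(socket, "getaddrinfo", new=fake):
        yield
```

Leaving the `with broken_dns("api.telegram.org"):` block restores the real resolver, which stands in for the "network recovers" moment in step 5.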

## Expected Behavior

When network connectivity recovers (system DNS can resolve `api.telegram.org` again), the Telegram adapter should **automatically reconnect** without manual intervention. The circuit breaker (pause after 10 failures) should only stop hammering a permanently failed endpoint, not a temporarily unreachable one that has since recovered.

## Actual Behavior

- Telegram adapter goes into `paused` state after 10 consecutive failures
- It stays paused even after network recovers
- User receives misleading error message: "fix the underlying issue then run `/platform resume telegram`" even when the underlying issue (DNS failure) has already been resolved
- Requires manual `/platform resume telegram` or gateway restart to recover

## Root Cause Analysis

**File**: `gateway/run.py` lines 5500-5501 and 2603-2638

```python
_BACKOFF_CAP = 300  # 5 minutes max between retries
_PAUSE_AFTER_FAILURES = 10  # circuit-breaker threshold
```

The `_pause_failed_platform()` method sets `info["paused"] = True` and pushes `next_retry` to `now + 300s`. The reconnect watcher (`_platform_reconnect_watcher`) skips platforms that are paused:

```python
# gateway/run.py — reconnect watcher loop
if info.get("paused"):
    # circuit breaker: don't hammer a known-bad platform
    continue
```

**The logic gap**: The circuit breaker correctly stops hammering a failed endpoint, but it never detects when the endpoint becomes reachable again. On a machine behind the Great Firewall (GFW), the network may be flaky: DNS fails for minutes, then recovers, but the adapter never wakes up.
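The gap can be shown with a small model of the state machine described above. It is not the code in `gateway/run.py`: the `paused` and `next_retry` keys and the two constants come from this report, while `record_failure`, `due_for_retry`, the `attempts` key, and the exponential backoff formula are assumptions made for illustration.

```python
_BACKOFF_CAP = 300  # seconds, as quoted above
_PAUSE_AFTER_FAILURES = 10  # circuit-breaker threshold, as quoted above


def record_failure(info: dict, now: float) -> None:
    """Count one failed connect and pause once the threshold is reached."""
    info["attempts"] = info.get("attempts", 0) + 1
    if info["attempts"] >= _PAUSE_AFTER_FAILURES:
        info["paused"] = True
        info["next_retry"] = now + _BACKOFF_CAP
    else:
        # Assumed backoff shape: exponential, capped at _BACKOFF_CAP.
        info["next_retry"] = now + min(_BACKOFF_CAP, 2 ** info["attempts"])


def due_for_retry(info: dict, now: float) -> bool:
    """Mirror the watcher's skip: a paused platform is never due."""
    if info.get("paused"):
        return False
    return now >= info.get("next_retry", 0.0)
```

In this model, once `paused` is set nothing clears it, so `due_for_retry` returns `False` however much time passes and `next_retry` is never consulted again.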

## Proposed Fix

When a platform is in `paused` state, the reconnect watcher should still periodically poll system DNS to detect if the endpoint has become reachable again. Specifically:

1. Add a DNS probe phase for paused platforms (e.g., every 5 minutes) that checks if the platform's host can be resolved
2. If system DNS resolves successfully, auto-resume the platform (reset attempt counter, schedule immediate reconnect)
3. This is a targeted fix — the circuit breaker still protects against hammering a permanently unreachable endpoint, but recovered endpoints auto-heal

**Affected file**: `gateway/run.py` — `_platform_reconnect_watcher()` method
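A minimal sketch of the probe-and-resume step, assuming the watcher keeps a per-platform `info` dict as described above. `host_resolves`, `maybe_auto_resume`, `_PROBE_INTERVAL`, and the `next_probe` and `attempts` keys are hypothetical names, not existing gateway APIs.

```python
import socket
from typing import Callable

_PROBE_INTERVAL = 300  # seconds; the "every 5 minutes" suggested above


def host_resolves(host: str) -> bool:
    """Return True if the system resolver can resolve the host."""
    try:
        socket.getaddrinfo(host, 443, proto=socket.IPPROTO_TCP)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True


def maybe_auto_resume(
    info: dict,
    host: str,
    now: float,
    probe: Callable[[str], bool] = host_resolves,
) -> bool:
    """Probe DNS for a paused platform and resume it if the host resolves."""
    if not info.get("paused"):
        return False
    if now < info.get("next_probe", 0.0):
        return False  # rate-limit the probe itself
    info["next_probe"] = now + _PROBE_INTERVAL
    if not probe(host):
        return False
    info["paused"] = False
    info["attempts"] = 0  # reset the failure counter
    info["next_retry"] = now  # schedule an immediate reconnect
    return True
```

The watcher would call something like this in place of the bare `continue` for paused platforms. Two caveats: `socket.getaddrinfo` blocks, so inside an asyncio loop the probe would need `loop.getaddrinfo` or `asyncio.to_thread`; and a successful resolution does not guarantee that the TCP connect will succeed, in which case the existing threshold simply pauses the platform again after 10 more failures.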

## OS / Environment

- OS: Windows 10 (native, Git Bash / MSYS shell)
- Hermes version: latest (as of May 30, 2026)
- Telegram adapter with no proxy configured
- Network: ISP-level DNS occasionally fails for `api.telegram.org`
