Error classifier: 429 'temporarily overloaded' misclassified as rate_limit — triggers wrong recovery path

## Summary

The error classifier treats all HTTP 429 responses as `FailoverReason.rate_limit`, regardless of whether the 429 indicates a per-key rate limit or a server-side overload. This causes the wrong recovery strategy to be used.

Some providers (e.g. Z.AI/Zhipu) return HTTP 429 with messages like:

```
HTTP 429: The service may be temporarily overloaded, please try again later
```

This is a **server-side overload** — the entire provider endpoint is struggling, not just this API key hitting a per-key quota. The recovery strategy should be different:

| Reason | Correct Behavior |
|---|---|
| `rate_limit` | Retry same credential once, then rotate to next key |
| `overloaded` | Skip retry, rotate immediately (the whole provider is down) |

## Current Behavior

`agent/error_classifier.py` (line ~551):

```python
if status_code == 429:
    return result_fn(
        FailoverReason.rate_limit,  # ← always rate_limit
        retryable=True,
        should_rotate_credential=True,
        should_fallback=True,
    )
```

The message body is not inspected to distinguish overload from rate limiting.

Additionally, `FailoverReason.overloaded` exists as an enum value but is never produced by the 429 classification path, and `_handle_credential_failover()` in `run_agent.py` has no handler for it — it falls through to the default no-op return.

## Proposed Fix

1. In `error_classifier.py`: inspect the error message for overload patterns (`"temporarily overloaded"`, `"server is overloaded"`, `"capacity"`, etc.) and classify as `FailoverReason.overloaded` instead of `rate_limit`
2. In `run_agent.py`: add an `overloaded` handler in `_handle_credential_failover()` that skips the retry-on-same-credential step and rotates immediately (same behavior as `billing`)

## Environment

- Hermes-agent latest
- Observed with Z.AI provider returning `HTTP 429: "The service may be temporarily overloaded, please try again later"`
- No `retry_after` header or `resets_at` field in the error response

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Error classifier: 429 'temporarily overloaded' misclassified as rate_limit — triggers wrong recovery path #15297

Summary

Current Behavior

Proposed Fix

Environment

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Reason	Correct Behavior
`rate_limit`	Retry same credential once, then rotate to next key
`overloaded`	Skip retry, rotate immediately (the whole provider is down)

Error classifier: 429 'temporarily overloaded' misclassified as rate_limit — triggers wrong recovery path #15297

Description

Summary

Current Behavior

Proposed Fix

Environment

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions