feat(retries): configurable max_api_retries + max_stream_retries with smarter backoff#5571

Open
iRonin wants to merge 3 commits into NousResearch:main from iRonin:fix/configurable-api-retries

iRonin commented Apr 6, 2026

Contributor

Two-Layer API Retry Strategy

Makes both the outer API retry loop and the inner stream retry loop configurable.

Configuration

```yaml
# ~/.hermes/config.yaml
agent:
  max_api_retries: 10     # Outer loop: full API call retries (provider errors, rate limits, invalid responses)
  max_stream_retries: 10  # Inner loop: transient stream/connection retries (ReadTimeout, dropped connections)
```
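As a rough illustration of how these two keys could be resolved with the defaults stated in this PR (3 and 2), here is a minimal sketch. It works on an already-parsed config mapping; the function name `resolve_retry_settings` and the clamping of negative values are assumptions for illustration, not code from this PR.

```python
from typing import Any, Mapping

DEFAULT_MAX_API_RETRIES = 3     # default stated in this PR
DEFAULT_MAX_STREAM_RETRIES = 2  # default stated in this PR


def resolve_retry_settings(config: Mapping[str, Any]) -> tuple[int, int]:
    """Return (max_api_retries, max_stream_retries) from a parsed config mapping.

    Hypothetical helper: the real lookup lives in cli.py and may differ.
    """
    # A missing or empty `agent:` section falls back to the defaults.
    agent = config.get("agent") or {}
    api = int(agent.get("max_api_retries", DEFAULT_MAX_API_RETRIES))
    stream = int(agent.get("max_stream_retries", DEFAULT_MAX_STREAM_RETRIES))
    # Clamp negative values to zero (an assumption, not stated in the PR).
    return max(api, 0), max(stream, 0)
```

With the config above, this returns `(10, 10)`. With no `agent` section it returns the defaults.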

Retry behavior

Outer loop (max_api_retries, default 3):

  • Respects Retry-After header from API response (capped at 5 min)
  • Rate limits: exponential 5s * 2^n with +/-20% jitter, cap 5 min
  • Other errors: exponential 2^n, cap 60s
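The outer-loop delay rules above can be sketched as a single function. This is a minimal sketch, not the code in `run_agent.py`: the name `backoff_delay`, the 0-based attempt index, Retry-After taking precedence over the computed delay, and applying the cap after jitter are all assumptions.

```python
import random
from typing import Optional

RETRY_AFTER_CAP_S = 300.0  # Retry-After is capped at 5 min
RATE_LIMIT_BASE_S = 5.0    # rate limits: 5s * 2^n
RATE_LIMIT_CAP_S = 300.0   # rate-limit delay cap, 5 min
GENERIC_CAP_S = 60.0       # other errors: 2^n, cap 60s
JITTER = 0.20              # +/-20% jitter on rate-limit delays


def backoff_delay(attempt: int, *, rate_limited: bool = False,
                  retry_after: Optional[float] = None,
                  rng: Optional[random.Random] = None) -> float:
    """Seconds to wait before the next retry; `attempt` is 0-based (assumed)."""
    if retry_after is not None:
        # A server-provided hint wins (assumed precedence), capped at 5 min.
        return min(max(retry_after, 0.0), RETRY_AFTER_CAP_S)
    if rate_limited:
        base = RATE_LIMIT_BASE_S * (2 ** attempt)
        factor = 1.0 + (rng or random).uniform(-JITTER, JITTER)
        # Cap after jitter so the delay never exceeds 5 min (assumed order).
        return min(base * factor, RATE_LIMIT_CAP_S)
    return min(float(2 ** attempt), GENERIC_CAP_S)
```

Under these assumptions, the nominal rate-limit delays before jitter are 5, 10, 20, 40, 80 and 160 seconds, after which the 5-minute cap starts to apply. Other errors wait 1, 2, 4, ... seconds up to 60.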

Inner loop (max_stream_retries, default 2):

  • Retries streaming request with fresh connection on ReadTimeout/connection drops
  • Rebuilds the primary OpenAI client to purge dead pool connections
  • Previously: fixed at 2, adjustable only through the HERMES_STREAM_RETRIES env var
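The inner loop described above could look roughly like the following. It is a sketch under stated assumptions: `TransientStreamError`, `start_stream` and `rebuild_client` are hypothetical stand-ins for the real ReadTimeout/connection errors, the streaming request, and the OpenAI client rebuild. They are not names from `run_agent.py`.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientStreamError(Exception):
    """Stand-in for ReadTimeout / dropped-connection errors (hypothetical)."""


def stream_with_retries(start_stream: Callable[[], T],
                        rebuild_client: Callable[[], None],
                        max_stream_retries: int = 2) -> T:
    """Run start_stream, retrying transient failures on a fresh client."""
    # N retries means N + 1 attempts, so the default of 2 gives 3 attempts.
    attempts = max(max_stream_retries, 0) + 1
    for attempt in range(attempts):
        try:
            return start_stream()
        except TransientStreamError:
            if attempt == attempts - 1:
                raise  # exhausted: propagate so the outer loop can handle it
            # Rebuild the client so dead pooled connections are not reused.
            rebuild_client()
    raise AssertionError("unreachable")
```

Passing the rebuild step in as a callable keeps the sketch independent of any particular client library.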

How it works

The two layers work together: stream retries run first and handle transient connection issues. If they are exhausted, the outer loop catches the error and retries the entire API call with backoff.
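A compact sketch of how the outer layer could wrap the inner one is shown below. The names `call_with_two_layers`, `StreamRetriesExhausted`, `inner_call` and `delay_for` are hypothetical. The sketch treats `max_api_retries` as the total number of outer attempts, which is an assumption the PR text does not confirm.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class StreamRetriesExhausted(Exception):
    """Hypothetical error raised when the inner stream loop gives up."""


def call_with_two_layers(inner_call: Callable[[], T],
                         delay_for: Callable[[int], float],
                         max_api_retries: int = 3,
                         sleep: Callable[[float], None] = time.sleep) -> T:
    """Outer loop: re-run the whole call (inner retries included) with backoff."""
    for attempt in range(max_api_retries):
        try:
            # inner_call is expected to run its own stream retries first.
            return inner_call()
        except StreamRetriesExhausted:
            if attempt == max_api_retries - 1:
                raise  # outer budget spent as well: surface the error
            sleep(delay_for(attempt))  # back off before the next full attempt
    raise RuntimeError("max_api_retries must be at least 1")
```

Injecting `sleep` and `delay_for` makes the loop easy to exercise in tests without real waiting.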

Files changed

  • `cli.py`: pass `max_stream_retries` from config to AIAgent
  • `run_agent.py`: add `max_stream_retries` param, replace env var with instance attribute

iRonin added a commit to iRonin/hermes-agent-nous that referenced this pull request Apr 10, 2026
iRonin added a commit to iRonin/hermes-agent-nous that referenced this pull request Apr 10, 2026
iRonin force-pushed the fix/configurable-api-retries branch from 787c43f to 3871bf1 April 11, 2026 18:49
iRonin changed the title from "feat(retries): configurable max_api_retries with smarter rate-limit backoff" to "feat(retries): configurable max_api_retries + max_stream_retries with smarter backoff" Apr 11, 2026
iRonin added 3 commits April 11, 2026 16:31
agent.max_api_retries in config.yaml (default 3, user set to 10).

Backoff improvements:
- Respects Retry-After header from API response (capped at 5 min)
- Rate limits: exponential 5s*2^n with ±20% jitter, cap 5 min
- Other errors: exponential 2^n, cap 60s
- Was: fixed min(2**n, 60) for all cases, ignored Retry-After

Usage:
  agent:
    max_api_retries: 10  # in ~/.hermes/config.yaml
…ection errors

agent.max_stream_retries in config.yaml (default 2, means 3 attempts).
Controls inner stream retry loop for ReadTimeout/connection drops.
Works alongside max_api_retries (outer loop) for two-layer retry strategy.

Usage:
  agent:
    max_api_retries: 10     # outer: full API call retries
    max_stream_retries: 5   # inner: stream/connection retries
iRonin force-pushed the fix/configurable-api-retries branch from 8b905cb to cae15cf April 11, 2026 20:37
iRonin added a commit to iRonin/hermes-agent-nous that referenced this pull request Apr 12, 2026
alt-glitch added labels May 1, 2026: type/feature (New feature or request), P3 (Low: cosmetic, nice to have), comp/agent (Core agent loop, run_agent.py, prompt builder), area/config (Config system, migrations, profiles)
@alt-glitch

Collaborator

Likely duplicate of #8486: same configurable retry feature. #8486 is more recent and has a CI-fix fork in #13519.
