feat(providers): add per-provider and per-model request_timeout_seconds config by teknium1 · Pull Request #12652 · NousResearch/hermes-agent

teknium1 · 2026-04-19T18:22:41Z

Summary

Users can set providers.<id>.request_timeout_seconds and providers.<id>.models.<model>.timeout_seconds in config.yaml to control LLM request timeouts per provider/model. Enables local-model cold-start tolerance (Ollama, K2/opus extended thinking) and fast-fail cloud retries. Zero behavior change when unset.
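As a rough illustration of how the two keys relate, the lookup can be sketched as follows. This is a minimal sketch, not the code in hermes_cli/timeouts.py: the function name `resolve_request_timeout`, the helper `_as_timeout`, the sample provider and model names, and the rule that non-positive or non-numeric values count as unset are all assumptions made for the example. The per-model over provider precedence is the one listed under Changes.

```python
from typing import Any, Optional

# Stand-in for a parsed config.yaml; provider and model names are examples only.
CONFIG = {
    "providers": {
        "ollama": {
            "request_timeout_seconds": 600,
            "models": {"big-local-model": {"timeout_seconds": 1200}},
        },
        "openrouter": {"request_timeout_seconds": 30},
    }
}


def _as_timeout(value: Any) -> Optional[float]:
    """Return a positive float, or None for unset/invalid values (assumed policy)."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return float(value) if value > 0 else None


def resolve_request_timeout(
    config: dict, provider_id: str, model: Optional[str] = None
) -> Optional[float]:
    """Hypothetical resolver: per-model value, then provider value, then None."""
    providers = config.get("providers")
    provider = providers.get(provider_id) if isinstance(providers, dict) else None
    if not isinstance(provider, dict):
        return None
    models = provider.get("models")
    model_cfg = models.get(model) if isinstance(models, dict) and model else None
    if isinstance(model_cfg, dict):
        per_model = _as_timeout(model_cfg.get("timeout_seconds"))
        if per_model is not None:
            return per_model
    return _as_timeout(provider.get("request_timeout_seconds"))
```

Returning None rather than a number for the unset case is what lets the caller fall through to the SDK default, which matches the "zero behavior change when unset" claim.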

Salvage of #12415. The original PR wired the knob only into the OpenAI-wire init path; live testing with timeout_seconds: 0.5 on claude-sonnet-4.6 showed the config was being silently overridden by hardcoded per-call timeout kwargs further down the stack. This PR extends the wiring to every client-construction and per-call-override site so the knob is actually enforced.

Credit: @mvanhorn authored the original resolver + init hook (commit 1). Follow-up commits extend coverage.

Changes

- hermes_cli/timeouts.py: resolver with per-model > provider > None precedence (@mvanhorn)
- run_agent.py: wires resolved timeout into init (explicit + implicit auth), Anthropic native init + 3 credential-refresh/rebuild sites, switch_model, _restore_primary_runtime, _try_activate_fallback (OpenAI + Anthropic) with immediate client rebuild so the first fallback request carries the new timeout
- run_agent.py: new _resolved_api_call_timeout() helper (config > HERMES_API_TIMEOUT env > 1800s default) wired into the non-streaming api_kwargs["timeout"] site and the streaming httpx.Timeout(connect, read, write, pool) so the per-provider config wins over the hardcoded env default
- agent/anthropic_adapter.py: build_anthropic_client() gains optional timeout= kwarg (keeps 900s default on unset/invalid); connect stays at 10s
- cli-config.yaml.example + website/docs/user-guide/configuration.md: documented, including Bedrock-not-covered caveat
- tests/hermes_cli/test_timeouts.py: 4 resolver tests (@mvanhorn) + test_anthropic_adapter_honors_timeout_kwarg + test_resolved_api_call_timeout_priority covering all 3 precedence cases
- Widened 4 existing test mock signatures (build_anthropic_client fakes) to accept the new timeout= kwarg
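The config > HERMES_API_TIMEOUT env > 1800s precedence listed for _resolved_api_call_timeout() could look roughly like the sketch below. It is written as a free function for illustration, while the PR adds a method on the agent; the `env` parameter and the choice to ignore unparsable or non-positive env values are assumptions of this example, not something the PR states.

```python
import os
from typing import Mapping, Optional

DEFAULT_API_TIMEOUT_SECONDS = 1800.0  # default named in the PR description


def resolved_api_call_timeout(
    configured: Optional[float],
    env: Mapping[str, str] = os.environ,
) -> float:
    """Hypothetical helper: config value, then HERMES_API_TIMEOUT, then 1800s."""
    # 1. An explicit per-provider/per-model config value wins.
    if configured is not None and configured > 0:
        return float(configured)
    # 2. Otherwise fall back to the environment variable, if it parses.
    raw = env.get("HERMES_API_TIMEOUT")
    if raw:
        try:
            value = float(raw)
        except ValueError:
            value = 0.0  # assumed: an unparsable value is treated as unset
        if value > 0:
            return value
    # 3. Finally use the hardcoded default.
    return DEFAULT_API_TIMEOUT_SECONDS
```

Passing the environment in as a mapping keeps the three precedence cases easy to exercise in a unit test without mutating os.environ.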

Coverage

| api_mode | Covered | How |
| --- | --- | --- |
| `chat_completions` (OpenRouter, Copilot, Nous, Kimi, Ollama, Alibaba, Qwen, MiniMax OpenAI-compat, xAI chat, custom) | ✓ | Client-level + per-call override on streaming and non-streaming |
| `anthropic_messages` (native Anthropic, MiniMax Claude, OpenCode, Z.AI) | ✓ | Client-level (SDK honors on `messages.create` + `messages.stream`) |
| `codex_responses` (OpenAI Codex, GPT-5 on OpenRouter/Copilot, xAI Responses) | ✓ | Client-level (no competing per-call override) |
| `bedrock_converse` + AnthropicBedrock SDK | ✗ | Documented; boto3 has its own timeout config; follow-up if needed |

Validation

Live E2E against real OpenRouter:

| Test | Before | After |
| --- | --- | --- |
| `timeout_seconds: 60` + gpt-4o-mini + "reply PONG" | Not actually enforced | Responds PONG in 1.3s |
| `timeout_seconds: 0.5` + claude-sonnet-4.6 thinking essay | Silently ignored; succeeded in 29-47s | `APITimeoutError` at ~3s/retry, exhausts in 15s |
| Unconfigured provider | SDK default | SDK default (unchanged) |
| Anthropic native (`_anthropic_client.timeout.read`) | 900s hardcoded | Per-config value, else 900s |
| Fallback chain (primary → fallback) | No timeout carried to fallback | Resolved timeout applied + client rebuilt immediately |

Unit tests: scripts/run_tests.sh tests/hermes_cli/test_timeouts.py → 6/6 pass.

Full run: scripts/run_tests.sh tests/run_agent/ tests/hermes_cli/ → 3105 pass, same 6 pre-existing failures as origin/main (verified by stashing; no new regressions).

Closes #12415.

…ds config

Adds optional providers.<id>.request_timeout_seconds and providers.<id>.models.<model>.timeout_seconds config, resolved via a new hermes_cli/timeouts.py helper and applied where client_kwargs is built in run_agent.py. Zero default behavior change: when both keys are unset, the openai SDK default takes over.

Mirrors the existing _get_task_timeout pattern in agent/auxiliary_client.py for auxiliary tasks; the primary turn path just never got the equivalent knob.

Cross-project demand: openclaw/openclaw#43946 (17 reactions) asks for exactly this config and specifically calls out Ollama cold-start hanging the client.

Follow-up on top of mvanhorn's cherry-picked commit. Original PR only wired request_timeout_seconds into the explicit-creds OpenAI branch at run_agent.py init; router-based implicit auth, native Anthropic, and the fallback chain were still hardcoded to SDK defaults.

- agent/anthropic_adapter.py: build_anthropic_client() accepts an optional timeout kwarg (default 900s preserved when unset/invalid).
- run_agent.py: resolve per-provider/per-model timeout once at init; apply to Anthropic native init + post-refresh rebuild + stale/interrupt rebuilds + switch_model + _restore_primary_runtime + the OpenAI implicit-auth path + _try_activate_fallback (with immediate client rebuild so the first fallback request carries the configured timeout).
- tests: cover anthropic adapter kwarg honoring; widen mock signatures to accept the new timeout kwarg.
- docs/example: clarify that the knob now applies to every transport, the fallback chain, and rebuilds after credential rotation.
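The adapter behavior described here (optional timeout, 900s kept when the value is unset or invalid, connect fixed at 10s) can be sketched without the SDK as below. `TimeoutSpec` and `anthropic_timeout_spec` are hypothetical names for this example; the real build_anthropic_client() constructs an SDK client rather than returning a plain dataclass, and what counts as "invalid" here (non-numeric or non-positive) is an assumption.

```python
from dataclasses import dataclass
from typing import Any

DEFAULT_READ_SECONDS = 900.0  # default the PR says is preserved
CONNECT_SECONDS = 10.0        # connect timeout the PR says is unchanged


@dataclass(frozen=True)
class TimeoutSpec:
    """Hypothetical container for the values a client would be built with."""
    read: float
    connect: float = CONNECT_SECONDS


def anthropic_timeout_spec(timeout: Any = None) -> TimeoutSpec:
    """Use the configured read timeout when valid, else keep the 900s default."""
    valid = (
        isinstance(timeout, (int, float))
        and not isinstance(timeout, bool)
        and timeout > 0
    )
    return TimeoutSpec(read=float(timeout) if valid else DEFAULT_READ_SECONDS)
```

Every rebuild site listed above (credential refresh, switch_model, fallback activation) would need to pass the same resolved value again, which is why the timeout is resolved once at init and reused.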

…ry calls

Live test with timeout_seconds: 0.5 on claude-sonnet-4.6 proved the initial wiring was insufficient: run_agent.py was overriding the client-level timeout on every call via hardcoded per-request kwargs.

Root cause: run_agent.py had two sites that pass an explicit timeout= kwarg into chat.completions.create(): api_kwargs['timeout'] at line 7075 (HERMES_API_TIMEOUT=1800s default) and the streaming path's _httpx.Timeout(..., read=HERMES_STREAM_READ_TIMEOUT=120s, ...) at line 5760. Both override the per-provider config value the client was constructed with, so a 0.5s config timeout was silently not enforced.

This commit:

- Adds AIAgent._resolved_api_call_timeout(): config > HERMES_API_TIMEOUT env > 1800s default.
- Uses it for the non-streaming api_kwargs['timeout'] field.
- Uses it for the streaming path's httpx.Timeout(connect, read, write, pool) so both connect and read respect the configured value when set. Local-provider auto-bump (Ollama/vLLM cold-start) only applies when no explicit config value is set.
- New test: test_resolved_api_call_timeout_priority covers all three precedence cases (config, env, default).

Live verified: 0.5s config on claude-sonnet-4.6 now triggers APITimeoutError at ~3s per retry, exhausts 3 retries in ~15s total (was: 29-47s success with timeout ignored). Positive case (60s config + gpt-4o-mini) still succeeds at 1.3s.
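The root cause is easier to see with a stub. The sketch below assumes only what the commit message states, namely that a per-call timeout= kwarg takes precedence over the timeout the client was constructed with. `StubCompletions` and both `effective_timeout_*` functions are hypothetical and stand in for the SDK client and the call sites in run_agent.py; the env override is left out for brevity.

```python
from typing import Optional

HARDCODED_DEFAULT = 1800.0  # HERMES_API_TIMEOUT default named in the commit message


class StubCompletions:
    """Hypothetical stand-in for an SDK client with a client-level timeout."""

    def __init__(self, client_timeout: float) -> None:
        self.client_timeout = client_timeout

    def create(self, *, timeout: Optional[float] = None) -> float:
        # Return the timeout that would govern this request:
        # a per-call value, when given, beats the client-level one.
        return self.client_timeout if timeout is None else timeout


def effective_timeout_before(client: StubCompletions) -> float:
    """Old call site: always passes the hardcoded default per call."""
    return client.create(timeout=HARDCODED_DEFAULT)


def effective_timeout_after(
    client: StubCompletions, configured: Optional[float]
) -> float:
    """Fixed call site: the per-call value is derived from the config first."""
    per_call = configured if configured is not None else HARDCODED_DEFAULT
    return client.create(timeout=per_call)


# Client constructed with the 0.5s config value, as in the live test.
client = StubCompletions(client_timeout=0.5)
```

With this stub, the old call site reports 1800.0 even though the client was built with 0.5, while the fixed call site reports 0.5 when a config value is supplied and 1800.0 when it is not.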

…econds

AWS Bedrock paths (bedrock_converse + AnthropicBedrock SDK) use boto3 with its own timeout config and are not wired to the per-provider knob. Documented in cli-config.yaml.example and website configuration.md so users don't expect it to take effect there.

mvanhorn and others added 4 commits April 19, 2026 05:32

teknium1 merged commit 6116574 into main Apr 19, 2026
5 of 8 checks passed

teknium1 deleted the hermes/hermes-6f7b0cec branch April 19, 2026 18:23

teknium1 mentioned this pull request Apr 19, 2026

feat(providers): add per-provider and per-model request_timeout_seconds config #12415

Closed

5 tasks

github-actions Bot mentioned this pull request Apr 24, 2026

chore: bump NousResearch/hermes-agent version from v2026.4.16 to v2026.4.23 Docker-Hub-sirmark/docker-hermes-agent#3

Merged

subinium mentioned this pull request Apr 27, 2026

feat(providers+gateway): per-provider per-model request_timeout_seconds subinium/CrowClaw#98

Closed

briandevans mentioned this pull request Apr 28, 2026

fix(config): accept request_timeout_seconds / stale_timeout_seconds in providers entries (#16779) #16786

Closed

3 tasks

antifragileer mentioned this pull request Jun 11, 2026

fix(agent): compress long-session context on APITimeoutError recovery (#44285) #44489

Open

7 tasks
