Skip to content

feat(agent): configurable timeouts for auxiliary LLM calls via config.yaml (salvage #3406)#3597

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-f4f0292e
Mar 28, 2026
Merged

feat(agent): configurable timeouts for auxiliary LLM calls via config.yaml (salvage #3406)#3597
teknium1 merged 1 commit into
mainfrom
hermes/hermes-f4f0292e

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

Adds per-task configurable timeouts for auxiliary LLM calls via config.yaml under auxiliary.{task}.timeout. Users with slow local models (Ollama, llama.cpp, oMLX) can now increase timeouts to prevent failures when auxiliary requests queue behind main generation.

Based on #3406 by @alanfwilliams — converted from env vars to config.yaml per project conventions, with authorship preserved.

Changes

hermes_cli/config.py

Added timeout to all auxiliary config sections:

  • auxiliary.compression.timeout: 120 (was hardcoded 45s — compression summarizes large contexts)
  • auxiliary.vision.timeout: 30 (already existed)
  • All other aux tasks: 30 (web_extract, session_search, skills_hub, approval, mcp, flush_memories)

agent/auxiliary_client.py

  • Added _get_task_timeout(task, default) helper — reads auxiliary.{task}.timeout from config
  • call_llm() and async_call_llm() now default timeout=None — when not explicitly passed, auto-resolves from config via the helper
  • Callers can still pass an explicit timeout=X to override

agent/context_compressor.py

  • Removed hardcoded timeout: 45.0 — now resolved from auxiliary.compression.timeout config (default 120s)

agent/title_generator.py

  • Bumped default from 15s to 30s (generous for local model queuing, still reasonable for a 30-token request)

Config example

auxiliary:
  compression:
    timeout: 180    # slow local model? increase this
  vision:
    timeout: 60

Testing

  • 1082 agent + hermes_cli tests pass
  • No config version bump needed (deep_merge carries new keys automatically)

….yaml

Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml
instead of hardcoded values. Users with slow local models (Ollama, llama.cpp)
can now increase timeouts for compression, vision, session search, etc.

Defaults:
  - auxiliary.compression.timeout: 120s (was hardcoded 45s)
  - auxiliary.vision.timeout: 30s (unchanged)
  - all other aux tasks: 30s (was hardcoded 30s)
  - title_generator: 30s (was hardcoded 15s)

call_llm/async_call_llm now auto-resolve timeout from config when not
explicitly passed. Callers can still override with an explicit timeout arg.

Based on PR #3406 by alanfwilliams. Converted from env vars to config.yaml
per project conventions.
@teknium1 teknium1 merged commit 839d9d7 into main Mar 28, 2026
4 checks passed
teknium1 added a commit that referenced this pull request Mar 28, 2026
… pages

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
teknium1 added a commit that referenced this pull request Mar 28, 2026
… pages (#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
….yaml (NousResearch#3597)

Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml
instead of hardcoded values. Users with slow local models (Ollama, llama.cpp)
can now increase timeouts for compression, vision, session search, etc.

Defaults:
  - auxiliary.compression.timeout: 120s (was hardcoded 45s)
  - auxiliary.vision.timeout: 30s (unchanged)
  - all other aux tasks: 30s (was hardcoded 30s)
  - title_generator: 30s (was hardcoded 15s)

call_llm/async_call_llm now auto-resolve timeout from config when not
explicitly passed. Callers can still override with an explicit timeout arg.

Based on PR NousResearch#3406 by alanfwilliams. Converted from env vars to config.yaml
per project conventions.

Co-authored-by: alanfwilliams <alanfwilliams@users.noreply.github.com>
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
….yaml (NousResearch#3597)

Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml
instead of hardcoded values. Users with slow local models (Ollama, llama.cpp)
can now increase timeouts for compression, vision, session search, etc.

Defaults:
  - auxiliary.compression.timeout: 120s (was hardcoded 45s)
  - auxiliary.vision.timeout: 30s (unchanged)
  - all other aux tasks: 30s (was hardcoded 30s)
  - title_generator: 30s (was hardcoded 15s)

call_llm/async_call_llm now auto-resolve timeout from config when not
explicitly passed. Callers can still override with an explicit timeout arg.

Based on PR NousResearch#3406 by alanfwilliams. Converted from env vars to config.yaml
per project conventions.

Co-authored-by: alanfwilliams <alanfwilliams@users.noreply.github.com>
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
….yaml (NousResearch#3597)

Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml
instead of hardcoded values. Users with slow local models (Ollama, llama.cpp)
can now increase timeouts for compression, vision, session search, etc.

Defaults:
  - auxiliary.compression.timeout: 120s (was hardcoded 45s)
  - auxiliary.vision.timeout: 30s (unchanged)
  - all other aux tasks: 30s (was hardcoded 30s)
  - title_generator: 30s (was hardcoded 15s)

call_llm/async_call_llm now auto-resolve timeout from config when not
explicitly passed. Callers can still override with an explicit timeout arg.

Based on PR NousResearch#3406 by alanfwilliams. Converted from env vars to config.yaml
per project conventions.

Co-authored-by: alanfwilliams <alanfwilliams@users.noreply.github.com>
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
….yaml (NousResearch#3597)

Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml
instead of hardcoded values. Users with slow local models (Ollama, llama.cpp)
can now increase timeouts for compression, vision, session search, etc.

Defaults:
  - auxiliary.compression.timeout: 120s (was hardcoded 45s)
  - auxiliary.vision.timeout: 30s (unchanged)
  - all other aux tasks: 30s (was hardcoded 30s)
  - title_generator: 30s (was hardcoded 15s)

call_llm/async_call_llm now auto-resolve timeout from config when not
explicitly passed. Callers can still override with an explicit timeout arg.

Based on PR NousResearch#3406 by alanfwilliams. Converted from env vars to config.yaml
per project conventions.

Co-authored-by: alanfwilliams <alanfwilliams@users.noreply.github.com>
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
….yaml (NousResearch#3597)

Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml
instead of hardcoded values. Users with slow local models (Ollama, llama.cpp)
can now increase timeouts for compression, vision, session search, etc.

Defaults:
  - auxiliary.compression.timeout: 120s (was hardcoded 45s)
  - auxiliary.vision.timeout: 30s (unchanged)
  - all other aux tasks: 30s (was hardcoded 30s)
  - title_generator: 30s (was hardcoded 15s)

call_llm/async_call_llm now auto-resolve timeout from config when not
explicitly passed. Callers can still override with an explicit timeout arg.

Based on PR NousResearch#3406 by alanfwilliams. Converted from env vars to config.yaml
per project conventions.

Co-authored-by: alanfwilliams <alanfwilliams@users.noreply.github.com>
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants