feat(agent): configurable timeouts for auxiliary LLM calls via config.yaml (salvage #3406) #3597
Merged
….yaml
Add per-task timeout settings under auxiliary.{task}.timeout in config.yaml
instead of hardcoded values. Users with slow local models (Ollama, llama.cpp)
can now increase timeouts for compression, vision, session search, etc.
Defaults:
- auxiliary.compression.timeout: 120s (was hardcoded 45s)
- auxiliary.vision.timeout: 30s (unchanged)
- all other aux tasks: 30s (was hardcoded 30s)
- title_generator: 30s (was hardcoded 15s)
call_llm/async_call_llm now auto-resolve timeout from config when not
explicitly passed. Callers can still override with an explicit timeout arg.
Based on PR #3406 by alanfwilliams. Converted from env vars to config.yaml
per project conventions.
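
The resolution order described above (an explicit timeout argument wins, otherwise the value comes from auxiliary.{task}.timeout, otherwise a default applies) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: CONFIG, get_task_timeout, and resolve_timeout are hypothetical names, and the config is shown as an already-parsed mapping rather than loaded from config.yaml.

```python
from typing import Any, Mapping, Optional

# Hypothetical stand-in for the parsed config.yaml (assumption for illustration).
CONFIG: Mapping[str, Any] = {
    "auxiliary": {
        "compression": {"timeout": 120},
        "vision": {"timeout": 30},
    }
}


def get_task_timeout(
    task: str,
    default: float = 30.0,
    config: Mapping[str, Any] = CONFIG,
) -> float:
    """Return auxiliary.{task}.timeout, or default if it is unset or invalid."""
    aux = config.get("auxiliary")
    if not isinstance(aux, Mapping):
        return default
    section = aux.get(task)
    if not isinstance(section, Mapping):
        return default
    try:
        timeout = float(section.get("timeout"))
    except (TypeError, ValueError):
        # Missing key (None) or a non-numeric string falls back to the default.
        return default
    return timeout if timeout > 0 else default


def resolve_timeout(task: str, timeout: Optional[float] = None) -> float:
    """Explicit argument wins; otherwise fall back to the config lookup."""
    if timeout is not None:
        return float(timeout)
    return get_task_timeout(task)
```

With this sketch, resolve_timeout("compression") picks up the configured 120 seconds, while resolve_timeout("compression", timeout=5) keeps the caller's value. A task with no config entry falls back to the 30 second default.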
teknium1 added a commit that referenced this pull request on Mar 28, 2026
… pages (#3618)
Fixes found by auditing docs against recent PRs/commits:
Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior: block verdicts now go through approval flow instead of hard-blocking (#3428). Add pkill/killall self-termination guard and gateway-run backgrounding patterns (#3593).
New feature docs:
- configuration.md: Add tool_use_enforcement section with value table (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section, CORS details (Max-Age, SSE headers, Idempotency-Key) from #3572/#3573/#3576/#3580/#3530.
Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets: matrix, mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
Summary
Adds per-task configurable timeouts for auxiliary LLM calls via config.yaml under auxiliary.{task}.timeout. Users with slow local models (Ollama, llama.cpp, oMLX) can now increase timeouts to prevent failures when auxiliary requests queue behind main generation.

Based on #3406 by @alanfwilliams, converted from env vars to config.yaml per project conventions, with authorship preserved.
Changes
hermes_cli/config.py
Added timeout to all auxiliary config sections:
- auxiliary.compression.timeout: 120 (was hardcoded 45s; compression summarizes large contexts)
- auxiliary.vision.timeout: 30 (already existed)
- All other aux tasks: 30 (web_extract, session_search, skills_hub, approval, mcp, flush_memories)

agent/auxiliary_client.py
- _get_task_timeout(task, default) helper: reads auxiliary.{task}.timeout from config
- call_llm() and async_call_llm() now default to timeout=None; when not explicitly passed, the timeout auto-resolves from config via the helper
- Callers can still pass timeout=X to override

agent/context_compressor.py
- The hardcoded timeout: 45.0 is now resolved from auxiliary.compression.timeout config (default 120s)

agent/title_generator.py

Config example
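
A minimal sketch of how the defaults listed above might combine with a user's overrides, written as the Python mapping that the auxiliary section of config.yaml would parse to. The names DEFAULT_AUXILIARY and merge_auxiliary, the merge-over-defaults behavior, and the 300s and 90s override values are assumptions for illustration, not taken from the project.

```python
import copy
from typing import Any, Dict

# Defaults as listed in this PR, in seconds.
DEFAULT_AUXILIARY: Dict[str, Dict[str, Any]] = {
    "compression": {"timeout": 120},
    "vision": {"timeout": 30},
    "web_extract": {"timeout": 30},
    "session_search": {"timeout": 30},
    "skills_hub": {"timeout": 30},
    "approval": {"timeout": 30},
    "mcp": {"timeout": 30},
    "flush_memories": {"timeout": 30},
}


def merge_auxiliary(user: Dict[str, Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Overlay per-task user settings on a copy of the defaults."""
    merged = copy.deepcopy(DEFAULT_AUXILIARY)
    for task, settings in user.items():
        merged.setdefault(task, {}).update(settings)
    return merged


# Hypothetical overrides for a slow local model; unlisted tasks keep their defaults.
user_auxiliary = {
    "compression": {"timeout": 300},
    "session_search": {"timeout": 90},
}
effective = merge_auxiliary(user_auxiliary)
```

Here only compression and session_search change. Every other task keeps its listed default, and DEFAULT_AUXILIARY itself is left unmodified because the merge works on a deep copy.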
Testing