fix(run_agent): disable stale timeout for local providers (#5889) #6123
Closed
Archerouyang wants to merge 4 commits into
added 4 commits
April 8, 2026 17:40
…ct CAMOFOX_PROFILE_DIR docs
- Add missing import for get_hermes_home in hindsight plugin
- Remove incorrect CAMOFOX_PROFILE_DIR documentation (not a real Camofox env var)
Fixes NousResearch#6098, NousResearch#6087

Remove skill file uploads from Daytona and Modal environments. Skills are loaded on the host side via skill_view(), build_skills_system_prompt(), and _load_skill_payload(); the synced files were never read inside sandboxes.
Impact:
- Daytona: saves ~275 seconds per session start (445 files × 2 SDK calls)
- Modal: reduces sandbox creation overhead significantly
Fixes NousResearch#6035

Replace Python 3.10+ union syntax (X | Y) with Optional[X] for a core module that may be imported in various environments.

…ch#5889) Local providers like oMLX and Ollama may have legitimately long prefill times (300s+ for large contexts). Disable the 180s stale stream timeout for detected local providers.
- Add _is_local_provider() to detect localhost/127.0.0.1/ollama URLs
- Skip stale detection when timeout is infinity
- Respect HERMES_STREAM_STALE_TIMEOUT if explicitly set
Fixes NousResearch#5889
Closed in favor of PR #6368, which fixes the same issue (#5889) using the existing |
Fix #5889: Local Provider Timeout Implementation Plan
Goal: Fix the 180s stale stream timeout for local providers (oMLX, Ollama, etc.) so that long-running local inference isn't killed prematurely.
Architecture: Detect local providers by URL pattern (localhost/127.0.0.1) and disable/extend stale stream timeout for them. This is a minimal, non-breaking change that respects the existing timeout mechanism for cloud providers while allowing local models to run uninterrupted.
Tech Stack: Python, existing Hermes agent streaming infrastructure
Problem Analysis
Current behavior in run_agent.py:4705:
- HERMES_STREAM_STALE_TIMEOUT defaults to 180s
For local providers (oMLX, Ollama, llama-cpp):
Solution Design
Detect local providers and adjust stale timeout:
- Treat a provider as local when base_url contains localhost, 127.0.0.1, or is empty (default local)
- Use _stream_stale_timeout = float('inf') or a very large value for local providers
- Respect HERMES_STREAM_STALE_TIMEOUT if explicitly set
Task 1: Create Helper Function for Local Provider Detection
Files:
Modify: run_agent.py (find the _stream_stale_timeout calculation section)
Step 1: Add local provider detection function
Task 2: Modify Stale Timeout Logic for Local Providers
Files:
Modify: run_agent.py:4705-4718 (stale timeout calculation)
Step 3: Add local provider timeout override
Find this section (around line 4705):
Replace with:
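Neither the original section nor its replacement is shown here. The sketch below only illustrates the override logic described in the plan and commit message, under stated assumptions: resolve_stale_timeout and is_stale are hypothetical helper names, the env parameter exists only to make the example testable, and the real code in run_agent.py may be structured quite differently.

```python
import math
import os
from typing import Mapping, Optional


def resolve_stale_timeout(is_local: bool,
                          env: Optional[Mapping[str, str]] = None) -> float:
    """Pick the stale stream timeout in seconds (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = env.get("HERMES_STREAM_STALE_TIMEOUT")
    if raw:
        # An explicitly set value wins, for local and cloud providers alike.
        return float(raw)
    # Local providers get no stale timeout; cloud keeps the 180s default.
    return math.inf if is_local else 180.0


def is_stale(seconds_since_last_chunk: float, timeout: float) -> bool:
    """Skip stale detection entirely when the timeout is infinite."""
    if math.isinf(timeout):
        return False
    return seconds_since_last_chunk > timeout
```

Passing the local/cloud decision in as a boolean keeps the timeout logic independent of how the provider is detected.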