fix(agent): add exponential backoff to inner streaming retry loop #7431
Open
Tranquil-Flow wants to merge 2 commits into
8c7ff68 to fd39696
The inner streaming retry loop retried immediately after timeout/connection errors, causing an infinite fast-retry loop when a local LLM's prefill time exceeds the stale-stream timeout. Add jittered exponential backoff (5s, 10s, 20s, capped at 30s) between attempts so the backend gets time to complete prompt processing.

Uses chunked sleep with interrupt checking to stay responsive, matching the pattern used by the outer retry loop.

Closes NousResearch#7069
fd39696 to ffd36ff
What does this PR do?
The inner streaming retry loop retried immediately after timeout/connection errors, causing an infinite fast-retry loop when a local LLM's prefill time exceeds the stale-stream timeout (default 180s). Adds jittered exponential backoff (5s, 10s, 20s, capped at 30s) between attempts so the backend gets time to complete prompt processing.
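For reviewers who want to sanity-check the schedule, here is a minimal sketch of how a delay sequence like the one described (5s, 10s, 20s, capped at 30s) could be computed. The function name `backoff_delay`, its parameters, and the jitter scheme (a random extra of up to 25% added on top of the capped value) are assumptions for illustration, not the code in `run_agent.py`.

```python
import random


def backoff_delay(attempt, base=5.0, cap=30.0, jitter_fraction=0.25, rng=random):
    """Return the wait in seconds before retry number `attempt` (1-based).

    The nominal schedule doubles from `base` and is clamped to `cap`:
    5, 10, 20, 30, 30, ... Jitter then adds a random extra of up to
    `jitter_fraction` of the nominal value (an assumed jitter scheme,
    so the returned delay can slightly exceed `cap`).
    """
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    exponent = min(attempt - 1, 16)
    nominal = min(base * (2 ** exponent), cap)
    return nominal + rng.uniform(0.0, jitter_fraction * nominal)


if __name__ == "__main__":
    for attempt in range(1, 6):
        print(attempt, round(backoff_delay(attempt), 2))
```

The jitter keeps several clients that hit the same stalled backend from retrying in lockstep; with `jitter_fraction=0.0` the function returns the nominal schedule exactly.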
Uses chunked sleep with interrupt checking to stay responsive, matching the pattern used by the outer retry loop.
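A chunked sleep of this kind might look roughly like the sketch below. The helper name `interruptible_sleep`, the 0.2s chunk size, and the `interrupted` callable are hypothetical; the actual interrupt flag and chunk length used by the agent are not shown in this description.

```python
import threading
import time


def interruptible_sleep(total_seconds, interrupted, chunk_seconds=0.2,
                        sleep=time.sleep, clock=time.monotonic):
    """Sleep up to `total_seconds`, waking every `chunk_seconds` to poll.

    `interrupted` is any zero-argument callable that returns True when the
    wait should be abandoned. Returns True if the full wait elapsed and
    False if it was cut short by an interrupt.
    """
    deadline = clock() + total_seconds
    while True:
        if interrupted():
            return False
        remaining = deadline - clock()
        if remaining <= 0:
            return True
        # Never sleep past the deadline, and never longer than one chunk.
        sleep(min(chunk_seconds, remaining))


if __name__ == "__main__":
    stop = threading.Event()
    threading.Timer(0.5, stop.set).start()  # simulate a user interrupt
    finished = interruptible_sleep(5.0, stop.is_set)
    print("finished full wait:", finished)
```

In a retry loop, the caller would check the return value and break out instead of retrying when the wait was interrupted. Polling in short chunks bounds the reaction time to roughly one chunk rather than the full backoff delay.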
Related Issue
Closes #7069
Type of Change
Changes Made
run_agent.py: added interruptible exponential backoff before the continue statement in the inner streaming retry loop
How to Test
Checklist
Code
… fix(scope):, feat(scope):, etc.)
… pytest tests/ -q and all tests pass
Documentation & Housekeeping
… docs/, docstrings), or N/A
… cli-config.yaml.example if I added/changed config keys, or N/A
… CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows, or N/A
Screenshots / Logs
N/A — see commit description and PR diff.