fix(gateway): staged inactivity warning before timeout escalation #6387
Merged
Introduce gateway_timeout_warning (default 900s) as a pre-timeout alert layer. When inactivity reaches the warning threshold, a single notification is sent to the user offering to wait or reset. If inactivity continues to the gateway_timeout (default 1800s), the full timeout fires as before. This gives users a chance to intervene before work is lost on slow API providers without disabling the safety timeout entirely. Config: agent.gateway_timeout_warning in config.yaml, or HERMES_AGENT_TIMEOUT_WARNING env var (0 = disable warning).
…rning
- Bind exception in warning send handler (was using stale _ne from outer scope)
- Calculate remaining time until timeout correctly: (timeout - warning) // 60 instead of warning // 60 (which equals elapsed time, not remaining)
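The remaining-time correction in the second item can be checked with a few lines of arithmetic. This is a minimal sketch, not the gateway's code: the function names `elapsed_minutes` and `remaining_minutes` are hypothetical, and only the two formulas come from the commit message.

```python
def elapsed_minutes(warning_s: int) -> int:
    """Original formula: minutes already elapsed when the warning fires."""
    return warning_s // 60


def remaining_minutes(timeout_s: int, warning_s: int) -> int:
    """Corrected formula: minutes left between the warning and the timeout."""
    return (timeout_s - warning_s) // 60


# Defaults (warning=900s, timeout=1800s): both formulas agree, hiding the bug.
print(elapsed_minutes(900), remaining_minutes(1800, 900))   # 15 15

# warning=600s, timeout=1800s: the original formula under-reports by 10 min.
print(elapsed_minutes(600), remaining_minutes(1800, 600))   # 10 20
```

The two values coincide only when the warning threshold is exactly half the timeout, which is why the default configuration did not expose the problem.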
Summary
Salvage of PR #6263 by @Helmi (cherry-picked onto current main) with two bug fixes.
Adds a pre-timeout warning notification to the gateway inactivity timeout. When the agent has been idle for gateway_timeout_warning seconds (default 900s / 15 min), a single notification is sent. If inactivity continues to gateway_timeout (default 1800s / 30 min), the full timeout fires as before.
This gives users on slow API providers a heads-up instead of a surprise timeout.
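The two-stage behaviour described above can be sketched as a small state holder. This is an illustration under stated assumptions, not the code from this PR: the class `InactivityWatch`, its `check` and `reset` methods, and the returned strings are all hypothetical. Only the two thresholds, the single-notification rule, and "0 disables the warning" come from the description.

```python
from dataclasses import dataclass


@dataclass
class InactivityWatch:
    """Hypothetical helper; not the gateway's actual class."""

    warning_s: int = 900    # gateway_timeout_warning default stated in the PR
    timeout_s: int = 1800   # gateway_timeout default stated in the PR
    warned: bool = False    # ensures the warning is sent only once

    def check(self, idle_s: float) -> str:
        """Return 'timeout', 'warn', or 'ok' for the current idle duration."""
        # The full timeout takes priority and behaves as it did before.
        if idle_s >= self.timeout_s:
            return "timeout"
        # warning_s == 0 disables the warning stage entirely.
        if self.warning_s > 0 and not self.warned and idle_s >= self.warning_s:
            self.warned = True
            return "warn"
        return "ok"

    def reset(self) -> None:
        """Assumed hook: call when activity resumes so a later idle period can warn again."""
        self.warned = False
```

Checking the timeout first means a misconfigured warning threshold at or above the timeout simply never fires, rather than delaying the timeout.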
Bug fixes on top of the original PR
① Unbound exception variable: the original had except Exception: followed by logger.debug('...%s', _ne). The _ne variable wasn't bound to this exception (it existed in an outer scope from a different try/except). Fixed: except Exception as _warn_err:
② Warning message remaining-time math: the original said "timed out in {_warn_mins} min" where _warn_mins = _agent_warning // 60. That's the elapsed time, not the remaining time. With defaults (warning=15min, timeout=30min) it happened to be correct, but with warning=10min / timeout=30min it would say "timed out in 10 min" when 20 minutes remain. Fixed: _remaining_mins = (_agent_timeout - _agent_warning) // 60
Changes
gateway_timeout_warning: 900 in DEFAULT_CONFIG
Config
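The PR states that the threshold can be set via agent.gateway_timeout_warning in config.yaml or the HERMES_AGENT_TIMEOUT_WARNING environment variable, with 0 disabling the warning. One way such a lookup could work is sketched below. The function name `resolve_timeout_warning`, the rule that the environment variable overrides the config file, and the fallback to the default on unparsable input are assumptions made for illustration; the PR text does not specify them.

```python
import os

DEFAULT_WARNING_S = 900  # default stated in the PR


def resolve_timeout_warning(config: dict, env=None) -> int:
    """Hypothetical resolver: env var first (assumed), then config, then default."""
    env = os.environ if env is None else env
    raw = env.get("HERMES_AGENT_TIMEOUT_WARNING")
    if raw is None or raw == "":
        # Tolerate a missing or empty "agent" section in the parsed YAML.
        agent_cfg = config.get("agent") or {}
        raw = agent_cfg.get("gateway_timeout_warning", DEFAULT_WARNING_S)
    try:
        value = int(float(raw))
    except (TypeError, ValueError):
        # Assumed behaviour: fall back to the default on unparsable input.
        return DEFAULT_WARNING_S
    # 0 means "warning disabled"; negative values are clamped to 0 here.
    return max(value, 0)
```

For example, `resolve_timeout_warning({"agent": {"gateway_timeout_warning": 600}}, env={})` yields 600, while setting the environment variable to "0" yields 0 and so turns the warning off in this sketch.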
Example message at 15 min idle:
Fixes #6260