fix: add circuit breaker for infinite tool-call loops#12632
Closed
gzsiang wants to merge 33 commits into
When the database contains non-UTF-8 binary data (e.g. from image attachments or binary tool outputs), FastAPI's jsonable_encoder fails with UnicodeDecodeError when serializing session data for the web UI. This adds a _sanitize_for_json() helper that recursively converts bytes to UTF-8 strings (or base64 as fallback) before JSON serialization, applied to all session-related API endpoints.
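The recursive conversion described above might look roughly like the following. This is a minimal sketch, not the PR's actual `_sanitize_for_json()`: the function name without the underscore, the handling of tuples and sets, and the choice to sanitize dict keys are all assumptions made for illustration.

```python
import base64
from typing import Any


def sanitize_for_json(value: Any) -> Any:
    """Recursively replace bytes with JSON-safe strings (illustrative only)."""
    if isinstance(value, (bytes, bytearray, memoryview)):
        raw = bytes(value)
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: fall back to base64 so the payload stays serializable.
            return base64.b64encode(raw).decode("ascii")
    if isinstance(value, dict):
        # Assumption: keys may also be bytes, so they are sanitized too.
        return {sanitize_for_json(k): sanitize_for_json(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        # Assumption: all sequence-like containers become JSON arrays.
        return [sanitize_for_json(item) for item in value]
    return value
```

Calling a helper like this on the session payload before it reaches the encoder means the encoder only ever sees `str`, numbers, lists, and dicts.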
…(optional via config)
- Translate severity levels (HIGH→高危, CRITICAL→致命, etc.)
- Translate common rule titles (Pipe to interpreter→管道传输到解释器, etc.)
- Translate 'Security scan' prefix to '安全扫描'
- Add rule_id-based lookup for comprehensive rule coverage
- Fallback to title string matching for rules not in the rule map
- Respects approvals.language config (zh/en)
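The lookup order in this commit (rule_id first, then title string matching, gated on the configured language) could be sketched as below. The dictionary names, function names, and the `pipe_to_interpreter` rule id are hypothetical; only the translation pairs themselves come from the commit message.

```python
from typing import Dict

# Translation pairs taken from the commit message; the dict names are assumptions.
SEVERITY_ZH: Dict[str, str] = {"HIGH": "高危", "CRITICAL": "致命"}
# Hypothetical rule id, used only to illustrate the rule_id-based lookup.
RULE_ID_ZH: Dict[str, str] = {"pipe_to_interpreter": "管道传输到解释器"}
TITLE_ZH: Dict[str, str] = {"Pipe to interpreter": "管道传输到解释器"}


def localize_severity(severity: str, language: str = "en") -> str:
    """Return the translated severity, or the original when not in zh mode."""
    if language != "zh":
        return severity
    return SEVERITY_ZH.get(severity.upper(), severity)


def localize_title(rule_id: str, title: str, language: str = "en") -> str:
    """Prefer the rule_id map; fall back to matching known English titles."""
    if language != "zh":
        return title
    if rule_id in RULE_ID_ZH:
        return RULE_ID_ZH[rule_id]
    for english, chinese in TITLE_ZH.items():
        if english.lower() in title.lower():
            return chinese
    return title  # unknown rule: leave the English title untouched
```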
Add a circuit breaker mechanism in _invoke_tool() and the sequential execution path to detect and break infinite tool-call loops. When the same tool with identical arguments is called 3 consecutive times, the circuit breaker triggers and returns an error, forcing the LLM to try a different strategy.

The breaker uses MD5 hashing of (tool_name, json_args) to create a stable signature, then checks if the last N calls all have the same signature. After triggering, the counter resets so future calls can succeed.

- Added _consecutive_tool_calls and _circuit_breaker_threshold to AIAgent.__init__
- Added circuit breaker logic to _invoke_tool() (concurrent path)
- Added circuit breaker logic to _execute_tool_calls_sequential() (sequential path)
- All 22 existing tests pass
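For readers who want to see the mechanism in isolation, here is a minimal standalone sketch of a breaker that hashes (tool_name, json_args) and trips when the last N signatures match. The `CircuitBreaker` class, its method names, and the error wording are assumptions for illustration; the PR itself keeps this state on `AIAgent` and its code may differ.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional


class CircuitBreaker:
    """Illustrative breaker for identical consecutive tool calls."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self._recent: List[str] = []

    @staticmethod
    def signature(tool_name: str, args: Dict[str, Any]) -> str:
        # sort_keys keeps the JSON, and therefore the hash, independent of key order.
        payload = json.dumps(args, sort_keys=True, default=str)
        digest = hashlib.md5(payload.encode("utf-8")).hexdigest()
        return f"{tool_name}:{digest}"

    def check(self, tool_name: str, args: Dict[str, Any]) -> Optional[str]:
        """Record a call; return an error string if the breaker trips, else None."""
        self._recent.append(self.signature(tool_name, args))
        self._recent = self._recent[-self.threshold:]
        if len(self._recent) == self.threshold and len(set(self._recent)) == 1:
            self._recent.clear()  # reset so later calls can succeed
            return (
                f"Circuit breaker: '{tool_name}' was called {self.threshold} times "
                "in a row with identical arguments. Try a different approach."
            )
        return None
```

MD5 is used here only as a cheap fingerprint, not for security. Any change to the arguments yields a different signature and so does not count toward the threshold.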
…shold
- Change error message from Chinese to English for upstream compatibility
- Increase default threshold from 3 to 5 to avoid false positives (legitimate retry scenarios exist, e.g. network jitter)
- Fix indentation bug in sequential execution path
- All 22 existing tests pass
…elp text
- Add _DESCRIPTION_ZH and _CATEGORY_ZH dicts in commands.py
- Add _desc_zh() and _cat_zh() helpers for runtime localization
- Localize COMMANDS, COMMANDS_BY_CATEGORY, gateway_help_lines(), telegram_bot_commands()
- Add HELP_ZH and HOTKEYS_ZH constants in core.ts for web UI
- Add BRAND_ZH and getBrand() in theme.ts for branding strings
- Language detection from config.yaml (approvals.language or language key)
- Backward compatible: English defaults when language != 'zh'
- banner.py: Chinese translations for welcome banner text (tools, skills, MCP servers, update messages)
- skin_engine.py: new 'chinese' skin with gold theme, auto-select based on approvals.language config
- tips.py: full Chinese tips corpus (TIPS_ZH) with language detection
Prevents the LLM from retrying different variations of a failing tool call. When a tool returns errors/empty results N times consecutively (regardless of argument changes), the counter triggers and forces a strategy change.

- Added _tool_failure_count dict and _tool_failure_threshold (5) in __init__
- Added _check_tool_failure() method to detect failures and update counter
- Added check calls in both concurrent and sequential execution paths
- Each tool has independent counter; success resets counter to 0
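A per-tool failure counter of this kind could be sketched as follows. The `ToolFailureTracker` class, the rule for what counts as a failure (empty output or text starting with "error"), and the reset after triggering are assumptions; the commit only states that errors or empty results increment the counter and that a success resets it to 0.

```python
from collections import defaultdict
from typing import DefaultDict, Optional


class ToolFailureTracker:
    """Illustrative per-tool counter of consecutive failures."""

    def __init__(self, threshold: int = 5) -> None:
        self.threshold = threshold
        self._failures: DefaultDict[str, int] = defaultdict(int)

    @staticmethod
    def looks_failed(result: Optional[str]) -> bool:
        # Assumption: empty output, or output starting with "error", is a failure.
        if result is None or not result.strip():
            return True
        return result.strip().lower().startswith("error")

    def record(self, tool_name: str, result: Optional[str]) -> Optional[str]:
        """Update the counter for one tool; return a message when it trips."""
        if not self.looks_failed(result):
            self._failures[tool_name] = 0  # success resets this tool's counter
            return None
        self._failures[tool_name] += 1
        if self._failures[tool_name] >= self.threshold:
            self._failures[tool_name] = 0  # assumption: start counting afresh
            return (
                f"Tool '{tool_name}' failed {self.threshold} times in a row. "
                "Change strategy instead of retrying with new arguments."
            )
        return None
```

Because the counter is keyed on the tool name alone, varying the arguments does not reset it, which is what separates this check from the identical-call circuit breaker.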
- Remove erroneous _check_tool_failure call in _invoke_tool that referenced undefined variable function_result
- Add temperature field to auxiliary.vision config (default: 0.1)
- Fixes 8 failing tests in concurrent tool execution and camofox vision
- Move _check_tool_failure calls to AFTER each tool execution
- Adds failure retry counter check for all tool types (todo, session_search, memory, memory manager tools, clarify, delegate_task, and generic tools)
- Prevents LLM from retrying different parameter variations of failing tools
- Complements the circuit breaker which detects identical tool+args loops
- Add read_timestamps to _read_tracker initialization (file_tools.py)
- Fix camofox vision test mock path (test_browser_camofox.py)
- Catch ModuleNotFoundError in browser_tool camofox import
- Fix _CAMEL_ALIASES apiKey mapping in config.py
- Use os.path.join for hermes_home/.env path (file_safety.py)

Fixes 8 failing tests across 6 test files.
- Fix mock paths in test_zombie_process_cleanup.py (use run_agent.* instead of tools.*)
- Fix mock path in test_agent_cache.py (mock run_agent.cleanup_vm instead of _tt.cleanup_vm)
- Use get_hermes_home() in build_write_denied_paths for profile-aware path resolution
- Add ModuleNotFoundError handling in browser_tool camofox and auxiliary_client imports

Fixes 6 failing tests:
- test_zombie_process_cleanup.py: test_close_calls_cleanup_functions, test_close_survives_partial_failures
- test_agent_cache.py: test_close_vs_release_full_teardown_difference
- test_write_deny.py: test_hermes_env
- test_approval_heartbeat.py: test_wait_returns_immediately_on_user_response, test_heartbeat_import_failure_does_not_break_wait
- Fix build_write_denied_paths to use home/.hermes/.env instead of get_hermes_home()
- Fix mock paths in test_zombie_process_cleanup.py (use tools.process_registry.process_registry)

Fixes 6 failing tests across 4 test files.
pytest's caplog fixture doesn't automatically capture warnings from hermes_cli.config logger because the logger's level is not set. This conftest.py sets the logger level to DEBUG for all tests in tests/hermes_cli/, ensuring caplog captures WARNING and above messages. Fixes 2 failing tests in test_provider_config_validation.py:
- test_minimax_provider: add _fallback_chain init in test stub
- test_tips: shorten EN Tip 105 to 131 chars (was 160)
- test_concurrent_interrupt: fix polling_tool sig + re-apply after _make_agent
…_breaker_threshold from 5 to 3
- Reduces _tool_failure_threshold from 5 to 3
- Improves circuit breaker error message with specific suggestions
- Helps stop tool-loop failures earlier
When a tool call fails repeatedly, the compression model is asked for a suggestion. This gives the main model a fresh perspective to break out of tool loops.

- Added _get_tool_suggestion() for retry threshold failures
- Added _get_consecutive_suggestion() for identical-call circuit breaker
- Suggestions are appended to the error message with a 💡 emoji
- Gracefully falls back if compression model is unavailable
When a tool call fails repeatedly, suggestions are provided even without a compression model configured.

- Added _heuristic_suggestion(): pattern matching on error messages
- Added _heuristic_consecutive_suggestion(): 20+ tool-specific suggestions
- Compression model is tried first, then falls back to heuristics
- Works zero-config: no LLM needed for basic suggestions
Replaced complex pattern-matching heuristics with simple generic prompts.

- Removed _heuristic_suggestion() (8 pattern rules)
- Removed _heuristic_consecutive_suggestion() (20+ tool mappings)
- Fallback is now just a direct text suggestion
- Compression model still tries first for smarter suggestions
- Net: -90 lines of dead pattern logic
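The resulting "model first, generic text as fallback" flow might be sketched like this. The `get_suggestion` function, the `ask_model` callable standing in for the compression model, the prompt wording, and the fallback text are all hypothetical and not taken from the PR.

```python
from typing import Callable, Optional

# Hypothetical fallback text; the PR's actual wording is not shown in this thread.
GENERIC_SUGGESTION = "Try a different tool or change your approach."


def get_suggestion(
    tool_name: str,
    error: str,
    ask_model: Optional[Callable[[str], str]] = None,
) -> str:
    """Ask an auxiliary model for advice; fall back to a fixed generic hint."""
    if ask_model is not None:
        prompt = (
            f"The tool '{tool_name}' keeps failing with: {error}\n"
            "Suggest one different strategy in a single sentence."
        )
        try:
            reply = ask_model(prompt)
            if reply and reply.strip():
                return reply.strip()
        except Exception:
            # Model unavailable or misconfigured: fall through to the generic hint.
            pass
    return GENERIC_SUGGESTION


def format_error(error: str, suggestion: str) -> str:
    # The commit history mentions suggestions being appended with a 💡 emoji.
    return f"{error}\n💡 {suggestion}"
```

Keeping the fallback as a constant string avoids maintaining per-tool pattern tables while still guaranteeing that some hint is always attached.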
Added README explaining:
- This is a fork of NousResearch/hermes-agent
- What Hermes Agent is
- Custom modifications in this fork (circuit breaker improvements, test fixes)
The integration tests call write_file_tool which goes through _get_file_ops() that creates a terminal environment. Without explicit TERMINAL_ENV=local, the CI environment may default to modal which fails due to missing credentials. Fixes: test_net_new_file_no_warning, test_sibling_agent_write_surfaces_warning_through_handler
The test_terminal_and_file_toolsets_resolve_all_tools and
test_terminal_tool_present tests call get_tool_definitions() which
filters tools based on check_terminal_requirements(). Without
explicit TERMINAL_ENV=local, the CI environment's TERMINAL_ENV=modal
(without credentials) causes check_terminal_requirements() to return
False, dropping the terminal tool from the schema.
Fixes: test_terminal_and_file_toolsets_resolve_all_tools,
test_terminal_tool_present
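One way to pin TERMINAL_ENV for the duration of a test, without leaking the change into other tests, is sketched below using only the standard library. The `local_terminal_env` helper is hypothetical; the PR may instead set the variable through a pytest fixture or directly in the test setup.

```python
import os
from contextlib import contextmanager
from typing import Iterator
from unittest import mock


@contextmanager
def local_terminal_env() -> Iterator[None]:
    """Force TERMINAL_ENV=local inside the block, then restore the old value."""
    with mock.patch.dict(os.environ, {"TERMINAL_ENV": "local"}):
        yield


def current_terminal_env() -> str:
    # Stand-in for code that reads the variable; the real check lives in the project.
    return os.environ.get("TERMINAL_ENV", "local")
```

Wrapping the body of an affected test in `with local_terminal_env():` makes it independent of whatever value the CI runner exports.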
Author
Apologies, this PR inadvertently includes i18n/Chinese localization commits that were not intended to be part of the circuit breaker submission. Those changes were mixed into my fork's main branch by mistake. I will open a clean PR with only the circuit breaker changes shortly. The i18n work is tracked separately in #13625. I'll submit it as its own PR if the maintainers are interested. Closing this one to keep things clean.
Author
Update: The clean PR with only circuit breaker changes is now available at #14059.
Summary
Add a circuit breaker mechanism to detect and break infinite tool-call loops in the agent framework.
Problem
When the LLM gets stuck in a retry loop calling the same tool with identical arguments repeatedly, it wastes tokens and never makes progress. This can happen with any tool (terminal, execute_code, etc.).
Solution
Added circuit breaker logic in both execution paths:
- _invoke_tool(): concurrent execution path
- _execute_tool_calls_sequential(): sequential execution path

How it works:
- Call signature: tool_name:md5(json_args)

Default threshold: 5 consecutive identical calls (configurable via _circuit_breaker_threshold)
Testing
Changes
- run_agent.py: Added _consecutive_tool_calls and _circuit_breaker_threshold state
- run_agent.py: Added circuit breaker logic to _invoke_tool() and _execute_tool_calls_sequential()