Describe the bug
When using Qwen or DeepSeek models (e.g., qwen-plus, qwen-max), the agent's tool usage degrades severely: it often fails to execute tools at all, claims to be executing without calling any tool ("I'm running the command..."), or ignores existing code/memory.
Root Cause Analysis
I traced this back to agent/prompt_builder.py line ~271:
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok", "glm")
This list is missing qwen and deepseek.
When tool_use_enforcement is set to auto (default), the system checks if the model name matches this tuple. Since qwen3.6-plus does not match, the critical tool-use enforcement system prompt is never injected.
Without this enforcement prompt, capable models that are not on the list tend to default to "chatty" behavior:
- They describe intended actions instead of calling the tools.
- They ignore existing project context/memory (hallucinating new code).
- They silently fail or stop mid-execution.
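The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the actual Hermes code: the tuple is copied from this report, but `matches_enforcement_list` is a hypothetical helper, and the case-insensitive substring match is an assumption about how the check works.

```python
# Tuple as quoted in this report from agent/prompt_builder.py.
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok", "glm")


def matches_enforcement_list(model_name: str) -> bool:
    """Hypothetical helper: assumes a case-insensitive substring match."""
    name = model_name.lower()
    return any(marker in name for marker in TOOL_USE_ENFORCEMENT_MODELS)


# Under that assumption, only the first model would get the enforcement prompt.
for model in ("gpt-4o", "qwen3.6-plus", "deepseek-chat"):
    print(f"{model}: {matches_enforcement_list(model)}")
```

If the real check uses prefix matching or a regex instead, the outcome for qwen3.6-plus would be the same, since none of the listed markers appear anywhere in that name.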
Proposed Fix
- Immediate Fix: Add qwen, deepseek, and potentially mistral/llama to the TOOL_USE_ENFORCEMENT_MODELS tuple.
- Robust Fix: Change the default behavior of tool_use_enforcement to true for all models, rather than relying on a hardcoded allowlist.
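Both fixes could look roughly like the sketch below. It is illustrative only: `EXTENDED_MODELS` and `should_enforce_tool_use` are hypothetical names, the accepted setting values (`True`, `False`, `"auto"`) are inferred from this report, and the substring match is an assumption rather than the confirmed behavior of prompt_builder.py.

```python
from typing import Union

# Immediate fix: the existing tuple plus the missing families.
# mistral/llama are left out here because the report only lists them as "potentially".
EXTENDED_MODELS = (
    "gpt", "codex", "gemini", "gemma", "grok", "glm",
    "qwen", "deepseek",
)


def should_enforce_tool_use(
    model_name: str,
    setting: Union[bool, str] = True,  # Robust fix: enforce by default.
) -> bool:
    """Hypothetical resolver for the tool_use_enforcement setting."""
    if isinstance(setting, bool):
        return setting
    if setting == "auto":
        # Assumed matching rule: case-insensitive substring match.
        name = model_name.lower()
        return any(marker in name for marker in EXTENDED_MODELS)
    raise ValueError(f"unsupported tool_use_enforcement value: {setting!r}")
```

With a default of `True`, a new model family gets the enforcement prompt without a code change, and `"auto"` stays available for anyone who wants the allowlist behavior.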
Environment
- Hermes Agent: v0.13.0
- Model: qwen3.6-plus (Alibaba Cloud)
- Symptom: "Lazy" tool usage, hallucination of execution, ignoring existing code.