Summary
When Hermes-4-405B (or 70B) runs behind the Telegram gateway, the session has 30 tools loaded but the model never invokes any of them. Instead, it writes hallucinated tool outputs as natural-language text. Direct API calls to the same endpoint with the same model and a tools parameter work correctly.
Environment
- Hermes Agent: v0.4.0 (2026.3.23) — fresh install (not migrated)
- Model: Hermes-4-405B via inference-api.nousresearch.com/v1
- Platform: Telegram gateway (systemd user service)
- Server: OCI ARM (aarch64), Ubuntu 22.04, 24GB RAM
- SOUL.md: Minimal (official guide structure — Identity/Style/Avoid/Defaults, ~30 lines)
- Config: toolsets: [all], reasoning_effort: high
Evidence
1. Tools are loaded correctly
Session JSON shows 30 tools including cronjob, terminal, read_file, etc.:
Tools loaded: 30
- cronjob, terminal, read_file, write_file, search_files, execute_code,
delegate_task, browser_*, web_search, web_extract, memory, todo, ...
2. Direct API call — tool-calling works
curl -X POST "https://inference-api.nousresearch.com/v1/chat/completions" \
-H "Authorization: Bearer $API_KEY" \
-d '{
"model": "Hermes-4-405B",
"messages": [
{"role": "system", "content": "You are on Telegram. You are ЯATƧKЯA, a sharp AI assistant. Korean default."},
{"role": "user", "content": "매일 아침 9시에 URL 체크하는 cron job 만들어줘"}
],
"tools": [{"type": "function", "function": {"name": "cronjob", ...}}],
"max_tokens": 200
}'
Result: tool_calls: [cronjob(create, "0 9 * * *")], finish_reason: tool_calls ✓
3. Via Telegram gateway — zero tool calls
Same model, same API endpoint, same tools, but through the gateway:
Total messages in session: 14
Messages with tool_calls: 0
The model generates fake outputs like fabricated JSON, fake job IDs, and hallucinated cron lists instead of actually calling the cronjob tool.
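The two counts above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the session JSON has a top-level "messages" list and that assistant entries carry a non-empty "tool_calls" list when a tool was actually invoked. The function name and that layout are assumptions for illustration, not the gateway's documented format.

```python
from typing import Any


def count_tool_call_messages(session: dict[str, Any]) -> tuple[int, int]:
    """Return (total_messages, messages_with_tool_calls) for a session dict.

    Assumption: the session stores its history under a top-level "messages"
    key, and a message that invoked a tool has a non-empty "tool_calls" list.
    """
    messages = session.get("messages", [])
    # An absent key and an empty list both count as "no tool call".
    with_calls = sum(1 for message in messages if message.get("tool_calls"))
    return len(messages), with_calls
```

To run it against a real session, load the file with json.load and pass the resulting dict in. A result of (14, 0) would match the numbers reported here.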
4. Cron scheduler — tools work fine
When the same model runs via the cron scheduler (not Telegram), it successfully calls read_file, search_files, and execute_code.
Root Cause Analysis
The gateway embeds tool schemas into the system prompt as XML text (run_agent.py:1618-1629):
system_msg = (
"You are a function calling AI model. You are provided with function signatures "
"within <tools> </tools> XML tags..."
f"<tools>\n{self._format_tools_for_system_message()}\n</tools>\n"
"Each function call should be enclosed within <tool_call> </tool_call> XML tags."
)
With 30 tools, this XML block becomes very large. Combined with SOUL.md, AGENTS.md, platform context, memory, and skills guidance, the system prompt likely exceeds the point where the model reliably generates <tool_call> XML responses.
The model falls back to generating natural language that describes tool usage rather than invoking tools. This manifests as convincing but completely fabricated tool outputs.
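To put a number on "very large", the size of the embedded block can be estimated from the tool schemas in the session. This is a rough sketch under stated assumptions: it serialises each schema as one JSON line between the tags, which may differ from what _format_tools_for_system_message() really emits, and it uses a crude characters-per-token ratio rather than the model's tokenizer. The helper name is hypothetical.

```python
import json


def estimate_tools_block(tools: list[dict], chars_per_token: float = 4.0) -> dict:
    """Rough size of a <tools> block built from OpenAI-style tool schemas.

    Assumptions: one JSON line per schema inside the tags, and a fixed
    chars-per-token ratio as a stand-in for a real tokenizer.
    """
    body = "\n".join(json.dumps(tool, ensure_ascii=False) for tool in tools)
    block = f"<tools>\n{body}\n</tools>\n"
    return {
        "tools": len(tools),
        "chars": len(block),
        "approx_tokens": round(len(block) / chars_per_token),
    }
```

Comparing the estimate for all 30 tools with the estimate for a small subset would show how much of the system prompt the tool block occupies, before SOUL.md, memory, and skills guidance are added.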
Suspected Cause
The XML-based tool-calling approach in the system prompt appears to break down when all of the following are loaded into the same prompt:
- 30 tool schemas are embedded as XML
- SOUL.md personality is loaded
- Telegram platform context is added
- Memory/user context is injected
- Skills guidance is present
In contrast, direct API calls with the native OpenAI tools parameter work correctly, presumably because the model then processes tool schemas through its trained function-calling pathway rather than by parsing them out of the system prompt.
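The difference between the two paths is easiest to see in the request bodies. This is an illustrative sketch, not the gateway's code: both function names are hypothetical, and the embedded variant only approximates the prompt shown in run_agent.py. The tool passed in would be any OpenAI-style function schema.

```python
import json


def native_request(model: str, system: str, user: str, tools: list[dict]) -> dict:
    """Tool schemas travel in the structured `tools` field of the request."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "tools": tools,
    }


def embedded_request(model: str, system: str, user: str, tools: list[dict]) -> dict:
    """Tool schemas are serialised into the system prompt; no `tools` field.

    Assumption: one JSON line per schema inside <tools> tags, appended to
    the rest of the system prompt.
    """
    tools_text = "\n".join(json.dumps(tool, ensure_ascii=False) for tool in tools)
    prompt = f"{system}\n<tools>\n{tools_text}\n</tools>\n"
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": prompt},
            {"role": "user", "content": user},
        ],
    }
```

In the native shape the system message stays the same length no matter how many tools are registered. In the embedded shape every added tool lengthens the prompt that the model must read before it decides whether to emit a tool call.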
Steps to Reproduce
- Fresh install Hermes Agent v0.4.0
- Configure Telegram gateway with Hermes-4-405B (or 70B)
- Set minimal SOUL.md
- Send any message requesting tool use (e.g., "cron job 만들어줘", Korean for "make me a cron job")
- Observe: model responds with text, no actual tool invocation
- Verify in session JSON: tool_calls count = 0
Expected Behavior
Model should invoke the cronjob tool (or any requested tool) via <tool_call> XML or native function calling, not generate fabricated text responses.
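A reply that follows the XML convention can be told apart from a fabricated one mechanically. The sketch below assumes each <tool_call> tag wraps a single JSON object (for example with "name" and "arguments" keys). The quoted system prompt only states that calls are enclosed in the tags, so the JSON body is an assumption. The helper name is hypothetical.

```python
import json
import re

# Non-greedy match so several <tool_call> blocks in one reply stay separate.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text: str) -> list[dict]:
    """Return the JSON objects found inside <tool_call> tags in `text`.

    Assumption: each tag wraps one JSON object. Bodies that are not valid
    JSON objects are skipped rather than raising.
    """
    calls = []
    for body in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            continue
        if isinstance(payload, dict):
            calls.append(payload)
    return calls
```

An empty result for a reply that claims to have created a job is the signature of the behavior reported here: text describing a tool result with no call behind it.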
Workaround
- Use CLI for tool-dependent operations (hermes cron create ...)
- Cron scheduler executes tools correctly
- Telegram bot works for general conversation (no tools needed)
Additional Notes
- Tested with both Hermes-4-70B and Hermes-4-405B — same behavior
- /cron slash command is cli_only=True, so Telegram users must rely on natural-language tool invocation, which fails due to this issue
- TELEGRAM_HOME_CHANNEL env var is also undocumented but required for CLI-created cron delivery (separate minor issue)