Skip to content

Hermes-4 does not invoke tools via Telegram gateway despite tools being loaded #2936

@0xarkstar

Description

@0xarkstar

Summary

Hermes-4-405B (and 70B) loaded via Telegram gateway has 30 tools available in the session but never invokes any of them. Instead, the model generates hallucinated tool outputs as natural language text. Direct API calls to the same endpoint with the same model and tools parameter work correctly.

Environment

  • Hermes Agent: v0.4.0 (2026.3.23) — fresh install (not migrated)
  • Model: Hermes-4-405B via inference-api.nousresearch.com/v1
  • Platform: Telegram gateway (systemd user service)
  • Server: OCI ARM (aarch64), Ubuntu 22.04, 24GB RAM
  • SOUL.md: Minimal (official guide structure — Identity/Style/Avoid/Defaults, ~30 lines)
  • Config: toolsets: [all], reasoning_effort: high

Evidence

1. Tools are loaded correctly

Session JSON shows 30 tools including cronjob, terminal, read_file, etc.:

Tools loaded: 30
  - cronjob, terminal, read_file, write_file, search_files, execute_code,
    delegate_task, browser_*, web_search, web_extract, memory, todo, ...

2. Direct API call — tool-calling works

curl -X POST "https://inference-api.nousresearch.com/v1/chat/completions" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "Hermes-4-405B",
    "messages": [
      {"role": "system", "content": "You are on Telegram. You are ЯATƧKЯA, a sharp AI assistant. Korean default."},
      {"role": "user", "content": "매일 아침 9시에 URL 체크하는 cron job 만들어줘"}
    ],
    "tools": [{"type": "function", "function": {"name": "cronjob", ...}}],
    "max_tokens": 200
  }'

Result: tool_calls: [cronjob(create, "0 9 * * *")], finish_reason: tool_calls

3. Via Telegram gateway — zero tool calls

Same model, same API endpoint, same tools, but through the gateway:

Total messages in session: 14
Messages with tool_calls: 0

The model generates fake outputs like fabricated JSON, fake job IDs, and hallucinated cron lists instead of actually calling the cronjob tool.

4. Cron scheduler — tools work fine

When the same model runs via the cron scheduler (not Telegram), it successfully calls read_file, search_files, and execute_code.

Root Cause Analysis

The gateway embeds tool schemas into the system prompt as XML text (run_agent.py:1618-1629):

system_msg = (
    "You are a function calling AI model. You are provided with function signatures "
    "within <tools> </tools> XML tags..."
    f"<tools>\n{self._format_tools_for_system_message()}\n</tools>\n"
    "Each function call should be enclosed within <tool_call> </tool_call> XML tags."
)

With 30 tools, this XML block becomes very large. Combined with SOUL.md, AGENTS.md, platform context, memory, and skills guidance, the system prompt likely exceeds the point where the model reliably generates <tool_call> XML responses.

The model falls back to generating natural language that describes tool usage rather than invoking tools. This manifests as convincing but completely fabricated tool outputs.

Suspected Cause

The XML-based tool-calling approach in the system prompt is overwhelmed when:

  • 30 tool schemas are embedded as XML
  • SOUL.md personality is loaded
  • Telegram platform context is added
  • Memory/user context is injected
  • Skills guidance is present

In contrast, direct API calls with the native OpenAI tools parameter work perfectly because the model processes tool schemas through its trained function-calling pathway, not through system prompt parsing.

Steps to Reproduce

  1. Fresh install Hermes Agent v0.4.0
  2. Configure Telegram gateway with Hermes-4-405B (or 70B)
  3. Set minimal SOUL.md
  4. Send any message requesting tool use (e.g., "cron job 만들어줘")
  5. Observe: model responds with text, no actual tool invocation
  6. Verify in session JSON: tool_calls count = 0

Expected Behavior

Model should invoke the cronjob tool (or any requested tool) via <tool_call> XML or native function calling, not generate fabricated text responses.

Workaround

  • Use CLI for tool-dependent operations (hermes cron create ...)
  • Cron scheduler executes tools correctly
  • Telegram bot works for general conversation (no tools needed)

Additional Notes

  • Tested with both Hermes-4-70B and Hermes-4-405B — same behavior
  • /cron slash command is cli_only=True, so Telegram users must rely on natural language tool invocation which fails due to this issue
  • TELEGRAM_HOME_CHANNEL env var is also undocumented but required for CLI-created cron delivery (separate minor issue)

Metadata

Metadata

Assignees

No one assigned

    Labels

    P1High — major feature broken, no workaroundcomp/agentCore agent loop, run_agent.py, prompt buildercomp/gatewayGateway runner, session dispatch, deliveryplatform/telegramTelegram bot adapterprovider/nousNous Research API (OAuth)type/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions