Date observed: 2026-04-09
Hermes version: v0.7.0 (2026.4.3)
Model in use: minimax-m2.7:cloud via http://localhost:11434/v1 (Ollama custom provider)
Platform where bug occurs: Telegram (gateway mode)
Not reproducible via: CLI (hermes chat -q "...") — works fine there
Symptoms
- Hermes replies with the literal text
(empty) on Telegram.
- Happens mid-conversation, not on the first message.
- Can happen multiple times in a row for the same message.
Root Cause
The Ollama cloud endpoint has a parallel request limit of 3 concurrent model calls. When multiple agents (Hermes + OpenClaw) are running simultaneously against the same endpoint, the API returns HTTP 200 with an empty choices[0].message.content — no tool calls, no reasoning.
When this happens, run_agent.py (around line 8896) substitutes the literal string "(empty)" as the response, which gets forwarded to the user on Telegram verbatim.
The existing retry logic (_try_activate_fallback) only triggers on HTTP-level failures (null response, missing choices), not on a successful HTTP 200 response where the model returned empty content.
Relevant code path (run_agent.py):
line ~8835: final_response = assistant_message.content or ""
line ~8840: if not self._has_content_after_think_block(final_response):
...
# After exhausting prefill retries:
line ~8896: assistant_msg["content"] = "(empty)"
line ~8905: final_response = "(empty)"
break # ← sent to user as-is
Proposed Fix
Part 1 — Exponential backoff retry (in run_agent.py, around line 9121)
After incrementing _empty_content_retries, retry with backoff before giving up:
if _truly_empty and not _has_structured and self._empty_content_retries < 3:
self._empty_content_retries += 1
backoff = min(5 * (2 ** (self._empty_content_retries - 1)), 20) # 2s, 4s, 8s, max 20s
self._vprint(
f"{self.log_prefix}↻ Empty response — retrying ({self._empty_content_retries}/3) "
f"after {backoff}s backoff..."
)
time.sleep(backoff)
continue
Part 2 — User-friendly final message
If all retries are exhausted, replace (empty) with a user-friendly message:
# Instead of: final_response = "(empty)"
# Use:
final_response = "⚠️ The model returned an empty response. Please try again in a moment."
Part 3 — Status callback (optional enhancement)
On the first empty retry, signal the gateway so the user sees a typing indicator:
if self._empty_content_retries == 1 and self.status_callback:
try:
self.status_callback("typing") # or ("warning", "Model overloaded, retrying...")
except Exception:
pass
Testing
To reproduce: Run both Hermes (gateway) and OpenClaw simultaneously with minimax-m2.7:cloud against the same Ollama endpoint that has a 3-request parallel limit. Any message after both are active will likely trigger the empty response.
Proposed test cases (new file tests/test_empty_response.py):
- Mock API client to return empty content once, then valid response → verify 1 retry with backoff succeeds
- Mock API client to return empty content 3 times, then valid response → verify 3 retries then graceful failure
- Mock API client to return reasoning-only (no visible content) → verify falls back to reasoning text
- Verify: after 3 exhausted retries, response is user-friendly message NOT literal
"(empty)"
Workaround (current)
- Avoid running OpenClaw and Hermes simultaneously to stay under the 3-parallel-request limit.
- Configure a
fallback_model in ~/.hermes/config.yaml pointing to a second model/provider that can absorb overflow.
Evidence
- Session file:
~/.hermes/sessions/session_20260409_074150_69a8f1d3.json — multiple assistant messages with "content": "(empty)" at messages 56, 58, 111, 113, 115, 142, 144.
- Agent log entries at time of occurrence (
~/.hermes/logs/agent.log):
10:13:28 response ready: platform=telegram chat=11****9 time=7.3s api_calls=1 response=7 chars
(7 chars = "(empty)")
Date observed: 2026-04-09
Hermes version: v0.7.0 (2026.4.3)
Model in use:
minimax-m2.7:cloudviahttp://localhost:11434/v1(Ollama custom provider)Platform where bug occurs: Telegram (gateway mode)
Not reproducible via: CLI (
hermes chat -q "...") — works fine thereSymptoms
(empty)on Telegram.Root Cause
The Ollama cloud endpoint has a parallel request limit of 3 concurrent model calls. When multiple agents (Hermes + OpenClaw) are running simultaneously against the same endpoint, the API returns HTTP 200 with an empty
choices[0].message.content— no tool calls, no reasoning.When this happens,
run_agent.py(around line 8896) substitutes the literal string"(empty)"as the response, which gets forwarded to the user on Telegram verbatim.The existing retry logic (
_try_activate_fallback) only triggers on HTTP-level failures (null response, missingchoices), not on a successful HTTP 200 response where the model returned empty content.Relevant code path (
run_agent.py):Proposed Fix
Part 1 — Exponential backoff retry (in
run_agent.py, around line 9121)After incrementing
_empty_content_retries, retry with backoff before giving up:Part 2 — User-friendly final message
If all retries are exhausted, replace
(empty)with a user-friendly message:Part 3 — Status callback (optional enhancement)
On the first empty retry, signal the gateway so the user sees a typing indicator:
Testing
To reproduce: Run both Hermes (gateway) and OpenClaw simultaneously with
minimax-m2.7:cloudagainst the same Ollama endpoint that has a 3-request parallel limit. Any message after both are active will likely trigger the empty response.Proposed test cases (new file
tests/test_empty_response.py):"(empty)"Workaround (current)
fallback_modelin~/.hermes/config.yamlpointing to a second model/provider that can absorb overflow.Evidence
~/.hermes/sessions/session_20260409_074150_69a8f1d3.json— multiple assistant messages with"content": "(empty)"at messages 56, 58, 111, 113, 115, 142, 144.~/.hermes/logs/agent.log):