Bug Description
ROOT CAUSE ANALYSIS: Indefinite Stuck Behavior (April 9-10, 2026)
EXECUTIVE SUMMARY
Two distinct stuck patterns were identified across 4 sessions. Both trace back to the same root cause: the GLM model family (specifically GLM-4.5-Air via Z.AI API) returning empty API responses with no content and no tool calls, causing Hermes' retry loop to exhaust silently and emit "(empty)" responses that appear to the user as a hung/frozen session.
AFFECTED SESSIONS
| Session | Model | Empty Responses | Nature |
|---------|-------|----------------|--------|
| 20260409_184803 | GLM-4.5-Air | 42 | Kanboard research + install. Terminal stuck at end. |
| 20260410_004640 | GLM-4.5-Air | 37 | Kanboard Docker install. Worked through it. |
| 20260409_152415 | glm-5.1 | 19 | Dev team playbook design. Mostly tool-only (normal). |
| 20260410_021022 | glm-5 | 9 | WORK_ITEMS.md migration. Minor stuck spots. |
ROOT CAUSE #1: GLM-4.5-Air Empty API Responses (CRITICAL)
What happened: The GLM-4.5-Air model via Z.AI API consistently returned responses with NO text content and NO tool calls. This happened 42 times in one session and 37 times in another.
The mechanism:
- Model returns an API response with `content=""` and no `tool_calls`
- Hermes detects this as "truly empty" and retries up to 3 times (see `run_agent.py` line ~9336)
- After 3 retries, it falls through to the `"(empty)"` terminal path (line ~9348)
- The `(empty)` string is written as the assistant message
- The conversation loop continues, but the model keeps returning empty responses
- From the user's perspective, the agent is "stuck" -- nothing is happening, no output is appearing
Why it looks like a hang: The retry loop runs silently (no output to user). When retries exhaust, the "(empty)" message produces no visible output. The loop continues requesting more from the API, which keeps returning empty, creating an infinite-looking stuck state.
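The mechanism above can be sketched roughly as follows. This is not the code in `run_agent.py`; `ModelResponse`, `is_truly_empty`, and `next_assistant_message` are hypothetical names, and the sketch assumes "retries up to 3 times" means one initial call plus three retries.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_EMPTY_RETRIES = 3  # retry count taken from the description above


@dataclass
class ModelResponse:
    """Hypothetical stand-in for one API response."""
    content: str = ""
    tool_calls: List[dict] = field(default_factory=list)


def is_truly_empty(resp: ModelResponse) -> bool:
    """True when the response has neither text nor a tool call."""
    return not resp.content.strip() and not resp.tool_calls


def next_assistant_message(call_model: Callable[[], ModelResponse]) -> ModelResponse:
    """Make one initial call plus up to MAX_EMPTY_RETRIES silent retries."""
    for _attempt in range(1 + MAX_EMPTY_RETRIES):
        resp = call_model()
        if not is_truly_empty(resp):
            return resp
        # Nothing is reported to the user here, so the retries are invisible.
    # Retries exhausted: fall through to the "(empty)" placeholder message.
    return ModelResponse(content="(empty)")
```

With a model that always returns nothing, every user turn costs four API calls and still ends in `(empty)`, which matches the frozen-looking session described above.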
Evidence from the Kanboard install session (184803):
[92] tool: (50K unzip output)
[93] user: "What's the status"
[94] assistant: "(empty)" <-- STUCK: model returned nothing
[95] user: "Is postgresql installed?"
[96] assistant: "(empty)" <-- STUCK AGAIN: model returned nothing again
The session had to be Ctrl+C'd at this point.
ROOT CAUSE #2: Token Expiration (401 Auth Errors) Breaking Sessions
What happened: Multiple sessions hit 401 - "token expired or incorrect" errors from the Z.AI API.
Evidence from request dumps:
session_20260409_190617: 401 "token expired or incorrect" (GLM-4.5-Air)
session_20260409_231931: 401 "token expired or incorrect" (glm-5.1)
session_20260410_003732: 401 "token expired or incorrect" (GLM-4.5-Air)
session_20260410_005455: 401 "token expired or incorrect" (GLM-4.5-Air)
These errors terminated entire sessions, which then had to be restarted. That is why there are duplicate sessions such as 152415 and 231931: they contain the same conversation, repeated after a restart.
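A minimal sketch of recovering from a 401 without restarting the session, assuming the provider offers some way to obtain a fresh credential. `AuthExpiredError`, `send_request`, and `refresh_token` are hypothetical placeholders, not Hermes or Z.AI APIs.

```python
from typing import Callable


class AuthExpiredError(Exception):
    """Hypothetical error raised when the API answers 401."""


def call_with_token_refresh(
    send_request: Callable[[str], str],
    refresh_token: Callable[[], str],
    token: str,
    max_refreshes: int = 1,
) -> str:
    """Send a request, refreshing the credential on a 401 at most max_refreshes times."""
    for attempt in range(max_refreshes + 1):
        try:
            return send_request(token)
        except AuthExpiredError:
            if attempt == max_refreshes:
                raise  # surface the error instead of looping forever
            token = refresh_token()
    raise ValueError("max_refreshes must be >= 0")
```

Bounding the number of refreshes matters here: a key that is simply incorrect would otherwise produce a second kind of silent loop.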
ROOT CAUSE #3: Benign Empty Responses (NOT a bug)
Many of the "empty" assistant messages in the dev-team session (152415) are actually tool-call-only responses -- the model returned a tool call with no explanatory text. This is NORMAL LLM behavior and was not a stuck issue. The session logging stores these as content="" but they have associated tool calls.
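To separate benign tool-call-only messages from truly empty ones when counting, something like the sketch below could be run over a session log. The message shape (dicts with `role`, `content`, `tool_calls`) and the treatment of the literal `(empty)` placeholder as empty are assumptions about the log format, not confirmed details.

```python
from typing import Dict, Iterable, Mapping


def classify_assistant_message(message: Mapping) -> str:
    """Label a message as 'text', 'tool_only', or 'truly_empty'."""
    text = (message.get("content") or "").strip()
    has_text = bool(text) and text != "(empty)"  # placeholder counts as empty
    has_tools = bool(message.get("tool_calls"))
    if has_text:
        return "text"
    return "tool_only" if has_tools else "truly_empty"


def count_by_kind(messages: Iterable[Mapping]) -> Dict[str, int]:
    """Count assistant messages per label; other roles are ignored."""
    counts = {"text": 0, "tool_only": 0, "truly_empty": 0}
    for msg in messages:
        if msg.get("role") == "assistant":
            counts[classify_assistant_message(msg)] += 1
    return counts
```

Only the `truly_empty` count indicates the stuck behavior; `tool_only` messages are expected in normal operation.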
CONTRIBUTING FACTORS
- Large tool outputs as context poison: The session that died (184803) had just received a 50K-character unzip output (message 92) right before getting stuck. Large, noisy tool outputs in the context may have caused the model to fail to generate meaningful responses.
- Long context windows: Session 184803 had 97 messages (42 empty). The growing context likely degraded GLM-4.5-Air's ability to respond.
- No fallback activation: When GLM-4.5-Air kept returning empty, Hermes did not fail over to a different model/provider. The empty-response retry path does not trigger fallback logic (only rate-limit and invalid-response paths do).
RECOMMENDATIONS
- Empty response should trigger fallback -- After exhausting empty retries, Hermes should attempt `_try_activate_fallback()` before giving up with "(empty)". Currently, only rate-limit and invalid-response paths trigger fallback.
- User-visible feedback during retries -- The retry loop for empty content runs silently. Adding a spinner or status indicator would prevent the "stuck" perception.
- Auto-compact on repeated empties -- If the model returns empty 3+ times in sequence, trigger context compression before retrying. The large context may be the cause.
- Token refresh handling -- The 401 auth errors suggest Z.AI tokens expire mid-session. Hermes should detect 401s and refresh the API key automatically (if the provider supports token refresh) rather than requiring a manual session restart.
- Model-specific handling for GLM-4.5-Air -- This model has a significantly higher empty-response rate than glm-5 or glm-5.1. Consider treating GLM-4.5-Air as a less reliable model and adding extra retry or fallback logic specifically for it.
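The fallback and auto-compact recommendations could be combined into one escalation path, sketched below. The signature of `_try_activate_fallback()` is not known from this report, so the sketch takes `try_activate_fallback` and `compress_context` as hypothetical callables; the order of escalation is also an assumption.

```python
from typing import Callable, Optional


def recover_from_empty(
    call_model: Callable[[], Optional[str]],
    compress_context: Callable[[], None],
    try_activate_fallback: Callable[[], bool],
    max_retries: int = 3,
) -> str:
    """Escalate: plain retries, then compaction, then fallback, then '(empty)'."""

    def attempt(times: int) -> Optional[str]:
        # Call the model up to `times` times and return the first non-empty text.
        for _ in range(times):
            text = call_model()
            if text:
                return text
        return None

    text = attempt(1 + max_retries)
    if text is None:
        compress_context()  # auto-compact after repeated empties
        text = attempt(1)
    if text is None and try_activate_fallback():
        text = attempt(1 + max_retries)  # retry against the fallback model
    return text if text is not None else "(empty)"
```

The `"(empty)"` placeholder is kept as the last resort, so existing behavior is unchanged when no fallback is configured.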
Steps to Reproduce
Hard to reproduce.
Expected Behavior
When the LLM keeps returning empty responses and the agent loops on them, there should be a configurable, time-bound fallback. Given the recently released async wake-up PR, there should be more graceful ways to handle this 'stuck' situation.
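As one illustration of a configurable time bound, the sketch below gives the empty-response loop a time budget and hands control to a fallback handler once the budget is spent. All names (`respond_within`, `budget_seconds`, `on_timeout`) are hypothetical, and the sketch does not use the async wake-up feature.

```python
import time
from typing import Callable, Optional


def respond_within(
    call_model: Callable[[], Optional[str]],
    on_timeout: Callable[[], str],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
    pause_seconds: float = 0.0,
) -> str:
    """Retry empty responses until the time budget is spent, then fall back."""
    deadline = clock() + budget_seconds
    while clock() < deadline:
        text = call_model()
        if text:
            return text
        if pause_seconds:
            time.sleep(pause_seconds)  # optional back-off between attempts
    # Budget exhausted: tell the user or switch provider instead of looping.
    return on_timeout()
```

This only bounds the time spent between calls. A single request that never returns would still need its own request timeout.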
Actual Behavior
The agent appears to be stuck. Nothing happens on screen. The last visible text is a partial execution trace, frozen mid-task with no completion.
Affected Component
Agent Core (conversation loop, context compression, memory)
Messaging Platform (if gateway-related)
No response
Operating System
Ubuntu 24.04
Python Version
3.14.2
Hermes Version
v0.7.0 (2026.4.3)
Relevant Logs / Traceback
## ROOT CAUSE ANALYSIS: Indefinite Stuck Behavior (April 9-10, 2026)
### EXECUTIVE SUMMARY
Two distinct stuck patterns were identified across 4 sessions. Both trace back to the same root cause: **the GLM model family (specifically GLM-4.5-Air via Z.AI API) returning empty API responses** with no content and no tool calls, causing Hermes' retry loop to exhaust silently and emit "(empty)" responses that appear to the user as a hung/frozen session.
---
### AFFECTED SESSIONS
| Session | Model | Empty Responses | Nature |
|---------|-------|----------------|--------|
| 20260409_184803 | GLM-4.5-Air | 42 | Kanboard research + install. Terminal stuck at end. |
| 20260410_004640 | GLM-4.5-Air | 37 | Kanboard Docker install. Worked through it. |
| 20260409_152415 | glm-5.1 | 19 | Dev team playbook design. Mostly tool-only (normal). |
| 20260410_021022 | glm-5 | 9 | WORK_ITEMS.md migration. Minor stuck spots. |
---
### ROOT CAUSE #1: GLM-4.5-Air Empty API Responses (CRITICAL)
**What happened:** The GLM-4.5-Air model via Z.AI API consistently returned responses with NO text content and NO tool calls. This happened 42 times in one session and 37 times in another.
**The mechanism:**
1. Model returns an API response with `content=""` and no `tool_calls`
2. Hermes detects this as "truly empty" and retries up to 3 times (see `run_agent.py` line ~9336)
3. After 3 retries, it falls through to the `"(empty)"` terminal path (line ~9348)
4. The `(empty)` string is written as the assistant message
5. The conversation loop continues, but the model keeps returning empty responses
6. From the user's perspective, the agent is "stuck" -- nothing is happening, no output is appearing
**Why it looks like a hang:** The retry loop runs silently (no output to user). When retries exhaust, the "(empty)" message produces no visible output. The loop continues requesting more from the API, which keeps returning empty, creating an infinite-looking stuck state.
**Evidence from the Kanboard install session (184803):**
[92] tool: (50K unzip output)
[93] user: "What's the status"
[94] assistant: "(empty)" <-- STUCK: model returned nothing
[95] user: "Is postgresql installed?"
[96] assistant: "(empty)" <-- STUCK AGAIN: model returned nothing again
The session had to be Ctrl+C'd at this point.
---
### ROOT CAUSE #2: Token Expiration (401 Auth Errors) Breaking Sessions
**What happened:** Multiple sessions hit `401 - "token expired or incorrect"` errors from the Z.AI API.
**Evidence from request dumps:**
session_20260409_190617: 401 "token expired or incorrect" (GLM-4.5-Air)
session_20260409_231931: 401 "token expired or incorrect" (glm-5.1)
session_20260410_003732: 401 "token expired or incorrect" (GLM-4.5-Air)
session_20260410_005455: 401 "token expired or incorrect" (GLM-4.5-Air)
These errors terminated entire sessions, which then had to be restarted. That is why there are duplicate sessions such as `152415` and `231931`: they contain the same conversation, repeated after a restart.
---
### ROOT CAUSE #3: Benign Empty Responses (NOT a bug)
Many of the "empty" assistant messages in the dev-team session (152415) are actually **tool-call-only responses** -- the model returned a tool call with no explanatory text. This is NORMAL LLM behavior and was not a stuck issue. The session logging stores these as `content=""` but they have associated tool calls.
---
### CONTRIBUTING FACTORS
1. **Large tool outputs as context poison:** The session that died (184803) had just received a 50K-character unzip output (message 92) right before getting stuck. Large, noisy tool outputs in the context may have caused the model to fail to generate meaningful responses.
2. **Long context windows:** Session 184803 had 97 messages (42 empty). The growing context likely degraded GLM-4.5-Air's ability to respond.
3. **No fallback activation:** When GLM-4.5-Air kept returning empty, Hermes did not failover to a different model/provider. The empty-response retry path does not trigger fallback logic (only rate-limit and invalid-response paths do).
---
### RECOMMENDATIONS
1. **Empty response should trigger fallback** -- After exhausting empty retries, Hermes should attempt `_try_activate_fallback()` before giving up with "(empty)". Currently, only rate-limit and invalid-response paths trigger fallback.
2. **User-visible feedback during retries** -- The retry loop for empty content runs silently. Adding a spinner or status indicator would prevent the "stuck" perception.
3. **Auto-compact on repeated empties** -- If the model returns empty 3+ times in sequence, trigger context compression before retrying. The large context may be the cause.
4. **Token refresh handling** -- The 401 auth errors suggest Z.AI tokens expire mid-session. Hermes should detect 401s and refresh the API key automatically (if the provider supports token refresh) rather than requiring a manual session restart.
5. **Model-specific handling for GLM-4.5-Air** -- This model has a significantly higher empty-response rate than glm-5 or glm-5.1. Consider treating GLM-4.5-Air as a less reliable model and adding extra retry or fallback logic specifically for it.
Root Cause Analysis (optional)
No response
Proposed Fix (optional)
No response
Are you willing to submit a PR for this?