Bug: Empty response after tool calls silently abandons incomplete multi-step tasks #9400

## Description

When executing multi-step tasks (e.g., fetch a webpage → extract content → save to file), Hermes Agent sometimes silently abandons the task partway through: after a tool call returns, the model produces an empty response and the agent treats the run as finished. The status line shows:

> ↻ Empty response after tool calls — using earlier content as final answer

The remaining steps are never executed. This is especially frequent when using non-Claude models (e.g., GLM-5), but the root cause is in the agent loop design, not the model.

## Reproduction

1. Give Hermes a 3+ step task that requires sequential tool calls, e.g., "fetch this URL, extract the article, and save it to a file"
2. The first 2 tool calls execute successfully
3. On the 3rd turn, the model returns an empty response (no text, no tool_calls)
4. Hermes falls back to using earlier content as the final answer and **breaks out of the loop**
5. The remaining steps of the task (step 3 onward) are never executed

## Root Cause Analysis

The agent loop in `run_agent.py` (lines ~10141-10164) handles empty responses with a fallback chain:

```
Empty response detected
  ├─ Partial stream content? → use it, break
  ├─ _last_content_with_tools exists? → reuse it, break    ← this path fires
  ├─ Thinking-only content? → continue loop
  └─ Nothing? → retry (max 3)
```

When the model emitted text alongside a tool call in a previous turn (e.g., "OK, fetching the page" + `browser_navigate`), that text is stored in `_last_content_with_tools`. On a subsequent empty response, the fallback reuses this old text as the "final answer" and **exits the conversation loop entirely**.

The critical issue: **the fallback was designed for graceful degradation ("at least give the user something"), but it causes a worse outcome — silently abandoning incomplete tasks.**

Once `run_conversation()` returns, the todo list, skill instructions, and pending steps are all lost. There is no mechanism to detect that tasks remain unfinished.
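The fallback chain described above can be modeled in a few lines. This is a minimal sketch reconstructed from the diagram, not the code in `run_agent.py`: the `Turn` dataclass, `handle_empty_response`, and the returned action strings are all hypothetical names, and the behavior after the retry limit is exhausted is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Turn:
    """One model response. All field names are illustrative."""
    thinking: str = ""
    partial_stream: str = ""


def handle_empty_response(turn: Turn,
                          last_content_with_tools: Optional[str],
                          retries: int,
                          max_retries: int = 3) -> Tuple[str, Optional[str]]:
    """Return (action, final_answer), following the chain in the diagram."""
    if turn.partial_stream:
        return "break", turn.partial_stream
    if last_content_with_tools:
        # The path this issue is about: text from an earlier turn
        # becomes the "final answer" and the loop exits.
        return "break", last_content_with_tools
    if turn.thinking:
        return "continue", None
    if retries < max_retries:
        return "retry", None
    return "break", None  # assumption: give up once retries run out


# An earlier turn said this while calling a tool, so it was remembered.
remembered = "OK, fetching the page"
action, answer = handle_empty_response(Turn(), remembered, retries=0)
print(action, answer)
```

Note that the second branch never consults any task state, so the outcome is the same whether zero or ten steps remain.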

### Why Claude Code handles this correctly

Claude Code faces the same empty-response problem but avoids task loss through **cross-session task persistence**:

1. **Immutable JSON task manifest** — A read-only task checklist is generated at the start. The execution agent can only flip `status` fields, never delete or rewrite tasks.
2. **Forced 3-step wake-up ritual** — Every new session (including those restarted after empty responses) runs `pwd` → `git log` → `read progress.txt` before doing anything else.
3. **Context Reset** — Rather than compressing overflowing context, Claude Code wipes it entirely and boots a fresh agent with a structured handoff file.

| Dimension | Hermes Agent | Claude Code |
|-----------|-------------|-------------|
| Task state storage | Volatile (in-message todo list) | Persistent (JSON + progress.txt on disk) |
| After empty response | Fallback → break → loop exits | New session reads progress file → resumes |
| State tamper resistance | Model can forget/skip tasks | JSON "physical lock" — model only changes status |
| Recovery granularity | Entire conversation lost | Per-step precise recovery |
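The "model only changes status" row can be illustrated with a small data structure. This is a toy sketch of the idea only, not Claude Code's implementation; `TaskManifest`, its methods, and the status names are hypothetical, and Python enforces the restriction by convention rather than by a hard lock.

```python
from typing import List, Tuple

ALLOWED_STATUSES = ("pending", "in_progress", "completed")


class TaskManifest:
    """Fixed task list: descriptions are frozen, only statuses may change."""

    def __init__(self, descriptions: Tuple[str, ...]) -> None:
        self._descriptions = tuple(descriptions)  # tuples cannot be mutated
        self._statuses: List[str] = ["pending"] * len(self._descriptions)

    def set_status(self, index: int, status: str) -> None:
        """The only mutation offered: flip one task's status."""
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        if not 0 <= index < len(self._statuses):
            raise IndexError(f"no task at index {index}")
        self._statuses[index] = status

    def pending(self) -> List[str]:
        """Descriptions of every task not yet completed."""
        return [d for d, s in zip(self._descriptions, self._statuses)
                if s != "completed"]


manifest = TaskManifest(("fetch the URL", "extract the article", "save it to a file"))
manifest.set_status(0, "completed")
manifest.set_status(1, "completed")
print(manifest.pending())
```

Because there is no method to add, remove, or rewrite a task, a check such as `manifest.pending()` stays reliable even if the model stops mentioning the remaining steps.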

## Suggested Solutions

### Option A (Source-level fix): Detect pending tasks before exiting

Modify the fallback logic to check for unfinished tasks before breaking out of the loop. If pending work exists, inject a continuation prompt instead of exiting:

```python
# Pseudocode
if fallback and has_pending_todos(messages):
    messages.append({"role": "user", "content": "Please continue with the remaining steps."})
    continue  # stay in the loop
else:
    break
```

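This addresses the problem at its source rather than working around it. Infinite loops are avoided as long as each injected continuation prompt counts against a retry limit, such as the existing empty-response limit (max 3).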
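A runnable toy version of Option A might look like the following. It is a sketch under stated assumptions, not a patch for `run_agent.py`: `run_loop`, `complete_next`, and `max_nudges` are hypothetical, an empty string stands in for an empty response, and `has_pending_todos` inspects an explicit todo list rather than `messages` because the real todo format is not shown here.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
Todo = Dict[str, str]
CONTINUE_PROMPT = "Please continue with the remaining steps."


def has_pending_todos(todos: List[Todo]) -> bool:
    """True if any todo is not marked completed (hypothetical schema)."""
    return any(t["status"] != "completed" for t in todos)


def complete_next(todos: List[Todo]) -> None:
    """Toy bookkeeping: each non-empty reply finishes the next pending todo."""
    for todo in todos:
        if todo["status"] != "completed":
            todo["status"] = "completed"
            return


def run_loop(call_model: Callable[[List[Message]], str],
             todos: List[Todo],
             max_nudges: int = 3) -> List[Message]:
    messages: List[Message] = []
    nudges = 0
    while True:
        reply = call_model(messages)
        if reply:
            messages.append({"role": "assistant", "content": reply})
            complete_next(todos)
            if has_pending_todos(todos):
                continue
            break
        # Empty response: nudge only if work remains and the cap allows it.
        if has_pending_todos(todos) and nudges < max_nudges:
            messages.append({"role": "user", "content": CONTINUE_PROMPT})
            nudges += 1
            continue
        break
    return messages


scripted = iter(["fetched", "", "extracted", "saved"])
todos = [{"step": s, "status": "pending"} for s in ("fetch", "extract", "save")]
history = run_loop(lambda _msgs: next(scripted), todos)
print([m["role"] for m in history])
```

In this run the scripted model returns one empty response after the first step; the loop injects a single continuation prompt and all three todos end up completed. A model that only ever returns empty responses is cut off after `max_nudges` prompts.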

### Option B: Persistent task state file

Add an optional mechanism to write task state to disk (similar to Claude Code's progress.txt). If the loop exits with pending tasks, the next user turn can detect and resume automatically.
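One possible shape for Option B is sketched below. The file name, the JSON schema (`step` and `status` keys), and the helpers `save_tasks` and `load_pending` are assumptions made for illustration; nothing here reflects an existing Hermes API.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

Task = Dict[str, str]


def save_tasks(path: Path, tasks: List[Task]) -> None:
    """Write to a temp file first, then swap it in, so a crash
    mid-write does not leave a truncated state file behind."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(tasks, indent=2), encoding="utf-8")
    tmp.replace(path)


def load_pending(path: Path) -> List[Task]:
    """Return tasks left unfinished by a previous run, if any."""
    if not path.exists():
        return []
    tasks = json.loads(path.read_text(encoding="utf-8"))
    return [t for t in tasks if t["status"] != "completed"]


with tempfile.TemporaryDirectory() as tmp_dir:
    state_file = Path(tmp_dir) / "hermes_tasks.json"
    # State at the moment the loop exited early: two steps done, one left.
    save_tasks(state_file, [
        {"step": "fetch", "status": "completed"},
        {"step": "extract", "status": "completed"},
        {"step": "save", "status": "pending"},
    ])
    # What the next user turn would see on startup.
    resumed = load_pending(state_file)
    print([t["step"] for t in resumed])
```

The agent would call something like `save_tasks` after every status change, and check `load_pending` at the start of each turn to decide whether to offer a resume.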

### Option C: Configurable fallback behavior

Add a config option like `fallback_on_empty: "continue" | "exit"` so users can choose whether empty responses should retry with a prompt or exit gracefully.
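Parsing and validating such an option could be as simple as the sketch below. The key name `fallback_on_empty` comes from the proposal above; the `FallbackOnEmpty` enum, the `parse_fallback_on_empty` helper, and the choice of `"exit"` as the default (to preserve current behavior) are assumptions.

```python
from enum import Enum
from typing import Any, Mapping


class FallbackOnEmpty(str, Enum):
    CONTINUE = "continue"  # inject a continuation prompt and keep looping
    EXIT = "exit"          # current behavior: reuse earlier content and stop


def parse_fallback_on_empty(config: Mapping[str, Any]) -> FallbackOnEmpty:
    """Read the option from a config mapping, rejecting unknown values."""
    raw = config.get("fallback_on_empty", FallbackOnEmpty.EXIT.value)
    try:
        return FallbackOnEmpty(raw)
    except ValueError:
        allowed = ", ".join(m.value for m in FallbackOnEmpty)
        raise ValueError(
            f"fallback_on_empty must be one of: {allowed}; got {raw!r}"
        ) from None


mode = parse_fallback_on_empty({"fallback_on_empty": "continue"})
print(mode is FallbackOnEmpty.CONTINUE)
```

Failing loudly on an unrecognized value avoids a typo in the config silently falling back to the exit behavior this issue is about.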

## Environment

- Hermes Agent version: latest (from repo)
- Model: GLM-5 (via custom provider), but the issue affects any model prone to empty responses
- Platform: CLI + WebUI

## Related

This aligns with the Harness Engineering insight from Anthropic's "Effective harnesses for long-running agents" — every harness component encodes an assumption about what the model cannot do. The current fallback assumes "empty response = task complete," which is frequently incorrect for non-Claude models.
