[Bug]: _last_content_with_tools fallback bypasses empty-response retries, causing silent agent loop termination mid-task

## Bug Description

When an assistant turn contains **both text content and tool calls**, the content is captured in `_last_content_with_tools` (L8874). If the **next** model response after tool execution returns empty content with no tool calls, the fallback at L8997-9014 **immediately uses the captured content as the final response and breaks out of the loop** — bypassing the standard 3x empty-response retry mechanism (L9048-9058) and the 2x thinking-prefill retry (L9028-9041).

This causes the agent to **silently terminate mid-task** when the model happens to return empty content at an intermediate point. The user receives the captured text (which is often a progress update like "Task 4 done! Now starting task 5:") as the final response, with no indication that the agent loop has stopped.

## Reproduction Scenario

1. Give the agent a long multi-step task (e.g., 5 subtasks dispatched via tool calls)
2. After completing subtask 4, the model generates: `"Tests passed! Now dispatching task 5 (UI):"` + `[tool_call: terminal(read sidebar.tsx)]`
3. The content is saved to `_last_content_with_tools`
4. Tool results return successfully
5. The model is called again but returns **empty content, no tool calls** (transient, would succeed on retry)
6. Instead of retrying (as would happen without `_last_content_with_tools`), the fallback triggers immediately → loop exits → user receives "Tests passed! Now dispatching task 5:" as final response
7. Task 5 is never executed. No warning or error is emitted.

### Real-world impact

In a gateway (Discord) session, this caused a **5.5-hour silent stall**. The agent completed 4/5 subtasks in 19 minutes, then the fallback triggered at iteration 52. The user received a message saying "now dispatching task 5" and assumed work was in progress. The agent was actually idle. The user only discovered the stall 5.5 hours later.

## Root Cause

`run_agent.py` L8992-9014:

```python
if not self._has_content_after_think_block(final_response):
    fallback = getattr(self, '_last_content_with_tools', None)
    if fallback:
        logger.debug("Empty follow-up after tool calls — using prior turn content as final response")
        self._last_content_with_tools = None
        self._empty_content_retries = 0          # ← resets retry counter!
        # ... rewrites the original assistant msg content ...
        final_response = self._strip_think_blocks(fallback).strip()
        self._response_was_previewed = True
        break                                     # ← exits loop immediately
```

The fallback:
- **Bypasses** the 3x empty-response retry (L9048-9058) that would otherwise trigger
- **Resets** `_empty_content_retries` to 0, ensuring retries can never accumulate
- **Breaks immediately** — no chance for the model to recover with a retry
- Emits only a `logger.debug()` — invisible to users and gateway logs

The design intent ("model already said what it needed to say, e.g. 'You're welcome!' + memory save") is valid for **housekeeping tool calls**, but does not account for **substantive tool calls** (terminal, read_file, search_files, etc.) where the content is a progress update, not a final answer.

## Relationship to Existing Issues

- **#6067** — Same `_last_content_with_tools` mechanism, but describes the content-dropping problem on Telegram. This issue describes the more severe **loop termination** consequence.
- **#5732** / **#5736** — Empty output from Codex streams triggering fallback paths. Same class of "empty response → wrong fallback" bugs.
- **#3956** — The inverse problem: empty responses retried too aggressively. Our bug is that the `_last_content_with_tools` fallback **prevents** retries entirely.
- **#4815** — Gateway timeout kills long tasks. Our bug is different: the agent loop terminates itself *voluntarily* due to fallback logic, not timeout.
- **#5114** — Autoresearch skill with watchdog cron. Currently the only watchdog pattern in the codebase, but it is skill-level, not framework-level.

## Existing Retry Mechanisms (for context)

| Failure mode | Retry mechanism | Status |
|---|---|---|
| API call failure | 3x with exponential backoff | ✅ Works |
| Stream transport error | `HERMES_STREAM_RETRIES` (default 2) | ✅ Works |
| Empty response (no content, no reasoning) | 3x silent retry | ✅ Works |
| Thinking-only (reasoning but no text) | 2x prefill continuation | ✅ Works |
| 429 rate limit | retry-after + credential rotation | ✅ Works |
| **Empty response WITH `_last_content_with_tools`** | **0x retry, immediate exit** | ❌ **This bug** |

## Suggested Direction

The `_last_content_with_tools` fallback should not short-circuit the retry pipeline. A possible approach:

1. When `_last_content_with_tools` exists but the model returns empty, still attempt the standard 3x empty-response retries first
2. Only fall back to `_last_content_with_tools` after retries are exhausted
3. Consider distinguishing **housekeeping-only** tool-call turns (memory, todo, skill_manage — safe to use content as final response) from **substantive** tool-call turns (terminal, read_file, search_files, delegate_task — content is likely a progress update)
4. Note: the code at L8876-8882 already has `_HOUSEKEEPING_TOOLS` detection for muting output — this signal could be reused

Note: I am not prescribing a specific fix. The interaction between `_last_content_with_tools`, empty-response retries, thinking-prefill retries, and the Codex/Anthropic API modes is complex. The maintainers are best positioned to design the right solution.

## Environment

- hermes-agent gateway (Discord platform)
- Model: claude-opus-4.6-1m via custom provider
- 52 API calls / 19 minutes before silent termination
- `HERMES_MAX_ITERATIONS=90` (well within budget)
- No errors in gateway.log, errors.log, or gateway.error.log

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug]: _last_content_with_tools fallback bypasses empty-response retries, causing silent agent loop termination mid-task #7968

Bug Description

Reproduction Scenario

Real-world impact

Root Cause

Relationship to Existing Issues

Existing Retry Mechanisms (for context)

Suggested Direction

Environment

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Failure mode	Retry mechanism	Status
API call failure	3x with exponential backoff	✅ Works
Stream transport error	`HERMES_STREAM_RETRIES` (default 2)	✅ Works
Empty response (no content, no reasoning)	3x silent retry	✅ Works
Thinking-only (reasoning but no text)	2x prefill continuation	✅ Works
429 rate limit	retry-after + credential rotation	✅ Works
Empty response WITH `_last_content_with_tools`	0x retry, immediate exit	❌ This bug

[Bug]: _last_content_with_tools fallback bypasses empty-response retries, causing silent agent loop termination mid-task #7968

Description

Bug Description

Reproduction Scenario

Real-world impact

Root Cause

Relationship to Existing Issues

Existing Retry Mechanisms (for context)

Suggested Direction

Environment

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions