conversation_loop empty-content gate ignores tool_calls — recovery loop on codestral content_len=0 + tool_calls_count>=1 responses #133

## Symptom

Per gist [sandbox-finding.md](https://gist.github.com/PowerCreek/f5706eeff6fc140ccf5519d0d6d2115d): on canonical (post-v0.18.6), sandbox dispatch loops between codestral and mistral-large on the synthetic "Your previous response was empty" recovery prompt, even though codestral emits `tool_calls_count=1`. v0.18.6's tool-execution dispatch diagnostics (`[tool-dispatch]` etc., closes #130) do NOT fire, which means the empty-content gate intercepts the response BEFORE tool execution. The file is never written.

## Code path

`agent/conversation_loop.py` ~ line 3782-3870 (read on v0.18.6 HEAD: `38bcb1e82`):

```python
# Line 3782-3784
_truly_empty = not agent._strip_think_blocks(
    final_response
).strip()
```

`_truly_empty` is computed from `final_response` (the assistant message's content string only). **It does not check `assistant_message.tool_calls`.** A response with `content=""` (or empty after `<think>` stripping) + populated `tool_calls` evaluates to `_truly_empty = True`.
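A minimal sketch of that evaluation, runnable outside hermes. `strip_think_blocks` below is a hypothetical regex-based stand-in for `agent._strip_think_blocks` (the real helper may behave differently), and the message object is a plain `SimpleNamespace`, not the real assistant message type.

```python
import re
from types import SimpleNamespace


def strip_think_blocks(text):
    """Hypothetical stand-in for agent._strip_think_blocks: drops <think>...</think> spans."""
    return re.sub(r"<think>.*?</think>", "", text or "", flags=re.DOTALL)


def truly_empty(final_response):
    # Mirrors the quoted expression: only the content string is inspected.
    return not strip_think_blocks(final_response).strip()


# Shape reported in the gist: empty content plus one tool call.
assistant_message = SimpleNamespace(
    content="",
    tool_calls=[
        {"type": "function", "function": {"name": "write_file", "arguments": "{}"}}
    ],
)

print(truly_empty(assistant_message.content))        # True, although tool_calls is populated
print(truly_empty("<think>plan the write</think>"))  # True, empty after stripping
print(truly_empty("Done."))                          # False
```

Nothing in `truly_empty` ever receives `tool_calls`, so no value of that field can change the result.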

```python
# Line 3842 — _structural_empty branch
_structural_empty = (
    _truly_empty
    and not _has_structured
    and finish_reason == "stop"      # ← correctly gated against tool_calls responses
    and not _prior_was_tool
    and _tools_attached
    and not getattr(agent, "_tools_empty_terminal_handled", False)
)
```

`_structural_empty` correctly gates on `finish_reason == "stop"` — so this branch does NOT fire for `finish_reason="tool_calls"` responses. ✓

```python
# Line 3870-ish — broader empty-content retry path
if _truly_empty and (not _has_structured or _prefill_exhausted) and agent._empty_content_retries < 3:
    agent._empty_content_retries += 1
    ...
```

**This branch does fire for `tool_calls` responses** because it has no tool_calls gate: `_truly_empty` is `True` (content is empty after stripping), `not _has_structured` is `True` (no reasoning fields), and the retry counter increments. The synthetic recovery prompt (the path around line 3856) then emits "Your previous response was empty".

The user sees the recovery loop: codestral → empty content + tool_calls → hermes treats as empty → re-prompts → mistral-large emits prose ("Creating /tmp/random_test.py...") + tool_call → hermes treats as empty again → loop until 3 retries exhaust.
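To make the retry arithmetic concrete, here is a small simulation of the quoted condition. It assumes every turn is classified as truly empty with no structured reasoning, which matches the behaviour described above but is a simplification. `retry_gate_fires` and `simulate` are illustrative names, not hermes functions.

```python
def retry_gate_fires(truly_empty, has_structured, prefill_exhausted, retries):
    # Mirrors the quoted condition. Note that tool_calls is not an input at all.
    return truly_empty and (not has_structured or prefill_exhausted) and retries < 3


def simulate(max_turns=10):
    """Count the recovery prompts sent when every turn looks 'empty' to the gate."""
    retries = 0
    prompts = 0
    for _ in range(max_turns):
        if not retry_gate_fires(True, False, False, retries):
            break  # retry budget exhausted
        retries += 1   # agent._empty_content_retries += 1
        prompts += 1   # "Your previous response was empty"
    return prompts


print(simulate())  # 3
```

Under these assumptions the gate fires for retries 0, 1 and 2, so three recovery prompts go out before the budget is exhausted, and no tool call is dispatched on any of those turns.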

## Why v0.18.4's recovery (#122) doesn't help

`agent/transports/chat_completions.py::normalize_response` correctly recovers SDK-dropped tool_calls and populates `NormalizedResponse.tool_calls`. That's verified. But `_truly_empty` in conversation_loop is computed from `final_response` (content string), not from `assistant_message.tool_calls`. Even with tool_calls fully populated, the gate doesn't know.

## Proposed fix (sketch)

Either of:

**(a) Tighten `_truly_empty`** to include tool_calls absence:

```python
_truly_empty = (
    not agent._strip_think_blocks(final_response).strip()
    and not getattr(assistant_message, "tool_calls", None)
)
```

**(b) Tighten the retry gate** to skip when tool_calls are present:

```python
if (
    _truly_empty
    and not getattr(assistant_message, "tool_calls", None)
    and (not _has_structured or _prefill_exhausted)
    and agent._empty_content_retries < 3
):
    ...
```

Either fix makes the broader empty-content retry path consistent with the existing `_structural_empty` gate (which already correctly excludes tool_calls responses via `finish_reason == "stop"`).

(a) is cleaner — `_truly_empty` becomes the single source of truth for "the response carries nothing useful". (b) is narrower — only the retry path changes.
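A rough regression-style sketch comparing the current retry gate with both variants across three response shapes. This is a simplified model, not the hermes code: think-block stripping is omitted, the messages are `SimpleNamespace` stand-ins, and the `gate_*` names are hypothetical.

```python
from types import SimpleNamespace


def content_empty(content):
    # Simplification: the real gate also strips <think> blocks first.
    return not (content or "").strip()


def gate_current(msg, has_structured=False, prefill_exhausted=False, retries=0):
    truly_empty = content_empty(msg.content)
    return bool(truly_empty and (not has_structured or prefill_exhausted) and retries < 3)


def gate_fix_a(msg, has_structured=False, prefill_exhausted=False, retries=0):
    # (a): tool_calls absence is folded into _truly_empty itself.
    truly_empty = content_empty(msg.content) and not getattr(msg, "tool_calls", None)
    return bool(truly_empty and (not has_structured or prefill_exhausted) and retries < 3)


def gate_fix_b(msg, has_structured=False, prefill_exhausted=False, retries=0):
    # (b): _truly_empty is unchanged; only the retry condition gains a guard.
    truly_empty = content_empty(msg.content)
    return bool(
        truly_empty
        and not getattr(msg, "tool_calls", None)
        and (not has_structured or prefill_exhausted)
        and retries < 3
    )


CASES = {
    "tool_call_only": SimpleNamespace(content="", tool_calls=[{"type": "function"}]),
    "fully_empty": SimpleNamespace(content="", tool_calls=None),
    "prose_only": SimpleNamespace(content="Done.", tool_calls=None),
}

for name, msg in CASES.items():
    print(name, gate_current(msg), gate_fix_a(msg), gate_fix_b(msg))
```

In this model both variants stop the retry for the tool-call-only shape and still retry a fully empty response. They would differ only in other consumers of `_truly_empty` (such as `_structural_empty`), which the sketch does not model.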

## Verified end-to-end

Earlier diagnostic + my in-process probes confirmed:
- mistral-shaped tool_calls (with #336 type-normalization) parse cleanly through hermes' OpenAI SDK
- hermes' `normalize_response` populates tool_calls correctly
- hermes' `write_file` handler accepts mistral's args + writes the file (verified by direct invocation in `ssh dev`)
- The whole chain works in isolation. Only the gate above blocks it in production.

## Companion devagentic-side observability

devagentic#337 / PR devagentic#338 adds a `[debug-response-shape]` log at the OAI shim exit that dumps the full wire shape (finish_reason, content type/nullness/length, tool_call_0 keys + type field). This helps hermes-maint compare what hermes RECEIVES against what the parser expects.
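For illustration only, a sketch of what such a shape summary could look like for an OpenAI-style chat-completions payload. This is not the devagentic implementation: `response_shape` and its output keys are hypothetical, chosen to cover the fields listed above.

```python
import json


def response_shape(response):
    """Summarise the wire shape of an OpenAI-style chat completion dict (hypothetical helper)."""
    choice = (response.get("choices") or [{}])[0]
    message = choice.get("message") or {}
    content = message.get("content")
    tool_calls = message.get("tool_calls") or []
    first = tool_calls[0] if tool_calls else {}
    return {
        "finish_reason": choice.get("finish_reason"),
        "content_type": type(content).__name__,
        "content_is_none": content is None,
        "content_len": len(content) if isinstance(content, str) else 0,
        "tool_calls_count": len(tool_calls),
        "tool_call_0_keys": sorted(first.keys()),
        "tool_call_0_type": first.get("type"),
    }


# Payload shaped like the failing case: empty content plus one tool call.
sample = {
    "choices": [
        {
            "finish_reason": "tool_calls",
            "message": {
                "content": "",
                "tool_calls": [
                    {
                        "id": "call_0",
                        "type": "function",
                        "function": {"name": "write_file", "arguments": "{}"},
                    }
                ],
            },
        }
    ]
}

print("[debug-response-shape]", json.dumps(response_shape(sample)))
```

Logging the shape rather than the full body keeps the line short while still distinguishing `content=None` from `content=""`, which matters for the gate discussed here.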

## Repro (from the gist)

```bash
docker exec -u duplex devagentic-duplex-claude tmux send-keys -t sandbox \
  "Write /tmp/random_test.py with: import random; print(random.randint(1,100)) - then execute it." Enter
sleep 30
docker exec -u duplex devagentic-duplex-claude tmux capture-pane -t sandbox -p -S -40
ssh dev "grep cascade-entry /tmp/service.log | tail -5"
```

Expected post-fix: `[tool-dispatch]` lines from v0.18.6 fire, file gets written.

## Related

- gist sandbox-finding.md (full diagnostic)
- hermes-agent#122 (v0.18.3 SDK recovery — works correctly)
- hermes-agent#125 (v0.18.4 drop finish_reason gate — broadened recovery)
- hermes-agent#128 (v0.18.5 loud invalid-tool diagnostic — confirmed names valid)
- hermes-agent#131 (v0.18.6 tool-execution diagnostic — doesn't fire because gate intercepts before)
- devagentic#337 / PR devagentic#338 — companion wire-shape log

🤖 Generated with [Claude Code](https://claude.com/claude-code)

