Bug
Hermes-cli surfaces "returned no content" + retries when an upstream OpenAI-compat response has message.content == "" AND message.tool_calls != []. The current handler appears to only check content; tool_calls is ignored. That's not the OpenAI-compat contract — a response with tool_calls populated means the model is asking to call those tools (not that there was no response).
Concrete repro (production, observed today)
Polynomial-explorer's session against devagentic's blackboxed OAI-compat surface:
Distinguisher: when tools are attached AND the prompt invites enumeration / structured output, mistral-large emits tool_calls instead of content. The response is structurally valid OpenAI shape; hermes-cli's read of it isn't.
Root cause
The hermes-cli response handler reads message.content; sees empty; flags "no content"; retries. Never inspects message.tool_calls. The OpenAI client contract is:
message.content is non-empty AND message.tool_calls is empty/null → text response
message.content is empty AND message.tool_calls is populated → model wants tools called; client should either invoke them or surface the call
- Both non-empty → both happened (model emitted some text AND wants tool calls)
- Both empty → THIS is the "no content" case (legitimately empty response, often from content filter)
The current behavior conflates case 2 with case 4.
Proposed fix shape
In hermes-cli's chat-completion response handler (wherever the empty-content check lives):
content = message.get("content") or ""
tool_calls = message.get("tool_calls") or []
if not content.strip() and not tool_calls:
# genuine no-content; existing retry / "returned no content" path
...
elif tool_calls:
# model wants tools called — execute, or surface the structured
# call to the user. Either path stops the "no content" retry loop.
...
else:
# content is present — existing happy path
...
Specific TUI behavior choice for tool_calls (when no automated tool-executor is wired into hermes-cli):
- Option 1: render the tool_calls as a structured block in the conversation ("the model wants to call tool X with args Y"). Lets the user see what the model intended.
- Option 2: silently retry once with
tools: [] stripped from the request so the model is forced to produce content instead. Loses the tool-use intent but keeps the user unblocked.
- Option 3: error explicitly "model wants to call tools, but no executor is wired" with the parsed tool_calls in the message. Operator-honest.
Lean Option 1 for the user-facing path (preserves model intent) + Option 3 as a clear error if rendering isn't possible.
Composition
- hermes#89 (predecessor) — same pattern surfaced earlier with
coding-groq / coding-gpt54; that one got fixed devagentic-side via the code-role fence (devagentic#224) which strips tools before the upstream call. This new manifestation is mistral-large + tool-bearing prompts; the right fix is client-side because:
mistral-large is the canonical non-fenced default per devagentic#284; we don't want to fence it
- The response shape is OpenAI-correct; the client should handle it
- devagentic#NN (sibling observation on the devagentic repo)
Acceptance
- Response with empty
content + populated tool_calls no longer surfaces "returned no content"
- Either: TUI renders the tool_calls structurally (Option 1) OR explicit error (Option 3)
- No regression on the existing "both empty" / "content-only" / "tool_calls-only-with-text-content" paths
- The polynomial-explorer enumeration prompt no longer produces 3-4 silent retries
Bug
Hermes-cli surfaces "returned no content" + retries when an upstream OpenAI-compat response has
message.content == ""ANDmessage.tool_calls != []. The current handler appears to only checkcontent;tool_callsis ignored. That's not the OpenAI-compat contract — a response withtool_callspopulated means the model is asking to call those tools (not that there was no response).Concrete repro (production, observed today)
Polynomial-explorer's session against devagentic's blackboxed OAI-compat surface:
"Describe Eisenstein's criterion","What is the relationship...") → routed tomistral-large(per devagentic#284 / feat(bridge): add Hermes JSONL bridge with native/synthetic streaming… NousResearch/hermes-agent#287 non-fenced routing) → 200 with clean content ✓"Name three polynomial transformations.") → routed tomistral-large→ 200 with empty content + populatedmessage.tool_calls→ hermes-cli reports "returned no content" + retries × 3-4 timesDistinguisher: when tools are attached AND the prompt invites enumeration / structured output, mistral-large emits
tool_callsinstead of content. The response is structurally valid OpenAI shape; hermes-cli's read of it isn't.Root cause
The hermes-cli response handler reads
message.content; sees empty; flags "no content"; retries. Never inspectsmessage.tool_calls. The OpenAI client contract is:message.contentis non-empty ANDmessage.tool_callsis empty/null → text responsemessage.contentis empty ANDmessage.tool_callsis populated → model wants tools called; client should either invoke them or surface the callThe current behavior conflates case 2 with case 4.
Proposed fix shape
In hermes-cli's chat-completion response handler (wherever the empty-content check lives):
Specific TUI behavior choice for tool_calls (when no automated tool-executor is wired into hermes-cli):
tools: []stripped from the request so the model is forced to produce content instead. Loses the tool-use intent but keeps the user unblocked.Lean Option 1 for the user-facing path (preserves model intent) + Option 3 as a clear error if rendering isn't possible.
Composition
coding-groq/coding-gpt54; that one got fixed devagentic-side via the code-role fence (devagentic#224) which strips tools before the upstream call. This new manifestation ismistral-large+ tool-bearing prompts; the right fix is client-side because:mistral-largeis the canonical non-fenced default perdevagentic#284; we don't want to fence itAcceptance
content+ populatedtool_callsno longer surfaces "returned no content"