Description
When using the glm-5.1:cloud model (via Ollama cloud routing) as the primary reasoning agent, tool calls with complex or lengthy argument values produce malformed JSON. This causes tool call parsing failures and can crash the agent loop.
Steps to Reproduce
- Configure Hermes with glm-5.1:cloud as the reasoning model (via Ollama ACP)
- Engage in a long conversation session (many turns, context approaching limits)
- Ask the agent to perform a complex task that requires a tool call with a large JSON argument (e.g., put_page with long markdown content)
- Observe malformed JSON in the tool call arguments, typically truncation, missing closing braces, or incorrectly escaped quotes that break the structure
Expected Behavior
Tool call arguments should always be valid JSON, regardless of context length or argument size.
Actual Behavior
The model's tool call JSON output becomes malformed when:
- The context window fills up (many turns of conversation)
- The tool call arguments are large (e.g., long strings, multi-line content)
- The JSON values contain special characters that the model fails to escape properly
Common malformations observed:
- Truncated JSON (missing closing })
- Double-escaped quotes (\\" instead of ")
- Raw newlines inside JSON string values (should be \\n)
- Missing commas between JSON object properties
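To make the failure modes above concrete, here is a minimal sketch that feeds one hand-written payload per malformation to Python's standard json parser. The payloads and the slug/content field names are illustrative assumptions, not captured model output.

```python
import json

# Hypothetical tool-call argument payloads, one per malformation listed above.
samples = {
    # Output stops before the closing brace.
    "truncated": '{"slug": "notes", "content": "# Title"',
    # Raw string: the payload contains \\" where \" was intended.
    "double_escaped_quote": r'{"content": "say \\"hi\\""}',
    # Non-raw string: the payload contains an actual newline character.
    "raw_newline": '{"content": "line one\nline two"}',
    # No comma between the two properties.
    "missing_comma": '{"slug": "notes" "content": "x"}',
}


def parse_error(raw):
    """Return the parser's error message, or None if the payload is valid."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        return exc.msg
    return None


errors = {name: parse_error(raw) for name, raw in samples.items()}
for name, msg in errors.items():
    print(f"{name}: {msg}")
```

Every sample is rejected by a strict parser, so without a repair step each one surfaces as a tool call parse failure.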
Impact
This makes the agent unreliable in long sessions when using glm-5.1:cloud. The agent may:
- Fail silently (tool call dropped)
- Retry the same malformed call repeatedly
- Eventually hit max retries and give up
Root Cause
This appears to be a model-level issue with GLM-5.1's structured output generation degrading in long contexts, not a Hermes parsing bug. The same model works fine for short-context tool calls.
Suggested Mitigations in Hermes
- JSON repair on tool call parse failure: Attempt to fix common JSON malformations (truncation, missing braces) before failing
- Model-specific context length limits: Enforce a lower effective context limit for models known to degrade, auto-compacting earlier
- Fallback routing: When a tool call parse fails N times, retry with a different model or with a simplified prompt
- Validation layer: Validate tool call JSON against the expected schema before dispatching
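The first mitigation could look roughly like the sketch below. It is not Hermes code: repair_json and parse_tool_args are hypothetical names, and the repair covers only two of the observed malformations (raw control characters inside strings, and truncation that leaves strings, arrays, or objects unclosed).

```python
import json


def repair_json(raw: str) -> str:
    """Best-effort repair: escape raw control characters inside strings,
    then close any unterminated string, arrays, and objects."""
    out = []
    stack = []  # closing delimiters still owed, innermost last
    in_string = False
    escaped = False
    for ch in raw:
        if in_string:
            if escaped:
                escaped = False
                out.append(ch)
            elif ch == "\\":
                escaped = True
                out.append(ch)
            elif ch == '"':
                in_string = False
                out.append(ch)
            elif ch == "\n":
                out.append("\\n")
            elif ch == "\r":
                out.append("\\r")
            elif ch == "\t":
                out.append("\\t")
            else:
                out.append(ch)
        else:
            if ch == '"':
                in_string = True
            elif ch == "{":
                stack.append("}")
            elif ch == "[":
                stack.append("]")
            elif ch in "}]" and stack and stack[-1] == ch:
                stack.pop()
            out.append(ch)
    if escaped:
        out.pop()  # drop a dangling backslash left by truncation
    if in_string:
        out.append('"')
    out.extend(reversed(stack))
    return "".join(out)


def parse_tool_args(raw: str):
    """Parse strictly first; attempt a repair only if that fails."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Still raises JSONDecodeError if the repair was not enough.
        return json.loads(repair_json(raw))
```

Parsing strictly first means valid arguments are never altered. Input truncated right after a comma or colon, missing commas, and double-escaped quotes are not handled here, so the second json.loads can still raise and the caller should treat that as a normal parse failure. A repaired payload may also be missing trailing content, which is one reason to pair this with the schema validation step.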
Workaround
- Use shorter sessions / clear context more frequently
- For complex tasks, use delegate_task with a model that handles long contexts better (e.g., Claude Code ACP)
- Break large tool call arguments into smaller chunks
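As one way to apply the last workaround, long markdown content can be split at paragraph boundaries before it is passed to a tool. The sketch below is an assumption about how that might be done: chunk_text and the max_chars default are hypothetical, and how the pieces are then written (for example, one put_page call per chunk) depends on the tool.

```python
def chunk_text(text: str, max_chars: int = 2000) -> list[str]:
    """Split text into chunks of at most max_chars characters,
    preferring blank-line paragraph boundaries."""
    chunks = []
    current = ""
    for para in text.split("\n\n"):
        candidate = para if not current else current + "\n\n" + para
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # Hard-split a single paragraph that is itself too long.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks
```

Smaller arguments keep each tool call's JSON short, which reduces exposure to the truncation and escaping failures described above, at the cost of more tool calls per task.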
Environment
- Hermes Agent v0.10.0
- Model: glm-5.1:cloud (via Ollama cloud)
- Transport: Ollama ACP
- OS: macOS