Bug: Ollama cloud model glm-5.1 produces malformed JSON tool calls in long contexts #13042

## Description

When using the `glm-5.1:cloud` model (via Ollama cloud routing) as the primary reasoning agent, tool calls with complex or lengthy argument values produce malformed JSON. This causes tool call parsing failures and can crash the agent loop.

## Steps to Reproduce

1. Configure Hermes with `glm-5.1:cloud` as the reasoning model (via Ollama ACP)
2. Engage in a long conversation session (many turns, context approaching limits)
3. Ask the agent to perform a complex task that requires a tool call with a large JSON argument (e.g., `put_page` with long markdown content)
4. Observe malformed JSON in the tool call arguments: typically truncated output, missing closing braces, or incorrectly escaped quotes that break the structure

## Expected Behavior

Tool call arguments should always be valid JSON, regardless of context length or argument size.

## Actual Behavior

The model's tool call JSON output becomes malformed when:
- The context window fills up (many turns of conversation)
- The tool call arguments are large (e.g., long strings, multi-line content)
- The JSON values contain special characters that the model fails to escape properly

Common malformations observed:
- Truncated JSON (missing closing `}`)
- Double-escaped quotes (`\\"` instead of `\"`)
- Raw newlines inside JSON string values (should be escaped as `\n`)
- Missing commas between JSON object properties
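
To make these failure modes concrete, here is a minimal sketch that feeds one hand-written sample of each malformation to Python's standard `json` parser and prints the decoder's message. The sample strings and the `parse_error` helper are illustrative assumptions, not captured model output or Hermes code.

```python
import json
from typing import Optional

# Hypothetical samples, one per malformation listed above (not real model output).
SAMPLES = {
    "truncated": '{"slug": "notes", "content": "# Title',
    "double_escaped_quotes": r'{"content": "say \\"hi\\""}',
    "raw_newline": '{"content": "line one\nline two"}',  # \n here is a real newline
    "missing_comma": '{"slug": "notes" "content": "x"}',
}


def parse_error(raw: str) -> Optional[str]:
    """Return the decoder's error message, or None if `raw` is valid JSON."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        return exc.msg
    return None


if __name__ == "__main__":
    for name, raw in SAMPLES.items():
        print(f"{name}: {parse_error(raw)}")
```

Every sample is rejected by a strict parser, so without a repair step each of these would surface as a tool call parse failure.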

## Impact

This makes the agent unreliable in long sessions when using `glm-5.1:cloud`. The agent may:
- Fail silently (tool call dropped)
- Retry the same malformed call repeatedly
- Eventually hit max retries and give up

## Root Cause

This appears to be a model-level issue with GLM-5.1's structured output generation degrading in long contexts, not a Hermes parsing bug. The same model works fine for short-context tool calls.

## Suggested Mitigations in Hermes

1. **JSON repair on tool call parse failure**: Attempt to fix common JSON malformations (truncation, missing braces) before failing
2. **Model-specific context length limits**: Enforce a lower effective context limit for models known to degrade, auto-compacting earlier
3. **Fallback routing**: When a tool call parse fails N times, retry with a different model or with a simplified prompt
4. **Validation layer**: Validate tool call JSON against the expected schema before dispatching
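
Mitigations 1 and 4 could be combined roughly as follows. This is a sketch under assumptions, not the Hermes implementation: `repair_json` and `parse_tool_args` are hypothetical names, the repair only escapes raw control characters inside strings and closes unterminated strings, objects, and arrays, and the "schema" check is reduced to a list of required keys.

```python
import json
from typing import Optional


def repair_json(raw: str) -> str:
    """Best-effort repair: escape raw control chars in strings, close what is left open."""
    out = []
    closers = []          # closing brackets still owed, innermost last
    in_string = False
    escaped = False
    control = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}
    for ch in raw:
        if in_string:
            if escaped:
                escaped = False
                out.append(ch)
            elif ch == "\\":
                escaped = True
                out.append(ch)
            elif ch == '"':
                in_string = False
                out.append(ch)
            else:
                out.append(control.get(ch, ch))
        else:
            if ch == '"':
                in_string = True
            elif ch == "{":
                closers.append("}")
            elif ch == "[":
                closers.append("]")
            elif ch in "}]" and closers and closers[-1] == ch:
                closers.pop()
            out.append(ch)
    if in_string:
        if escaped:
            out.pop()     # drop a dangling backslash left by truncation
        out.append('"')
    out.extend(reversed(closers))
    return "".join(out)


def parse_tool_args(raw: str, required: tuple = ()) -> Optional[dict]:
    """Try strict parsing first, then the repaired text; None means 'give up'."""
    for candidate in (raw, repair_json(raw)):
        try:
            args = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(args, dict) and all(key in args for key in required):
            return args
    return None
```

Note that repairing a truncated call yields a syntactically valid but shortened value, so a caller may prefer to treat a repaired truncation as a signal to re-request the call rather than dispatch it. Inputs the sketch cannot fix (for example, text cut off right after a key) still return `None` and would fall through to the retry or fallback path.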

## Workaround

1. Use shorter sessions / clear context more frequently
2. For complex tasks, use `delegate_task` with a model that handles long contexts better (e.g., Claude Code ACP)
3. Break large tool call arguments into smaller chunks
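
For the third workaround, one way to keep each tool call argument small is to split long markdown at paragraph boundaries before sending it. The sketch below assumes a hypothetical `chunk_markdown` helper and an arbitrary `max_chars` budget; how the chunks are then written (for example, one call per chunk) depends on the tools available and is not shown.

```python
def chunk_markdown(text: str, max_chars: int = 4000) -> list:
    """Split text into chunks of at most max_chars, preferring paragraph breaks."""
    chunks = []
    current = ""
    for para in text.split("\n\n"):
        candidate = para if not current else current + "\n\n" + para
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single paragraph over the budget is hard-split as a last resort.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks
```

Splitting on blank lines keeps headings and paragraphs intact in the common case, which makes each chunk a self-contained string that is shorter and therefore less likely to be truncated mid-value.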

## Environment

- Hermes Agent v0.10.0
- Model: glm-5.1:cloud (via Ollama cloud)
- Transport: Ollama ACP
- OS: macOS
