Bug: Ollama cloud model glm-5.1 produces malformed JSON tool calls in long contexts #13042

@lrawnsley

Description

When using the glm-5.1:cloud model (via Ollama cloud routing) as the primary reasoning agent, tool calls with complex or lengthy argument values produce malformed JSON. This causes tool call parsing failures and can crash the agent loop.

Steps to Reproduce

  1. Configure Hermes with glm-5.1:cloud as the reasoning model (via Ollama ACP)
  2. Engage in a long conversation session (many turns, context approaching limits)
  3. Ask the agent to perform a complex task that requires a tool call with a large JSON argument (e.g., put_page with long markdown content)
  4. Observe malformed JSON in the tool call arguments: typically truncation, missing closing braces, or incorrectly escaped quotes that break the structure

Expected Behavior

Tool call arguments should always be valid JSON, regardless of context length or argument size.

Actual Behavior

The model's tool call JSON output becomes malformed when:

  • The context window fills up (many turns of conversation)
  • The tool call arguments are large (e.g., long strings, multi-line content)
  • The JSON values contain special characters that the model fails to escape properly

Common malformations observed:

  • Truncated JSON (missing closing })
  • Double-escaped quotes (\\" instead of \")
  • Raw newlines inside JSON string values (should be \n)
  • Missing commas between JSON object properties
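
To illustrate why each malformation listed above breaks parsing, here is a minimal sketch that feeds one hand-written sample per category to Python's strict `json.loads`. The sample payloads and the `parse_error` helper are hypothetical and were constructed for illustration; they are not captured model output.

```python
import json
from typing import Optional

# Hypothetical argument payloads, one per malformation category.
SAMPLES = {
    "valid": r'{"path": "notes.md", "content": "say \"hi\"\nbye"}',
    "truncated": r'{"path": "notes.md", "content": "hello',
    "double_escaped_quotes": r'{"path": "notes.md", "content": "say \\"hi\\""}',
    # Not a raw string: the \n below is a real newline character in the payload.
    "raw_newline": '{"path": "notes.md", "content": "line1\nline2"}',
    "missing_comma": r'{"path": "notes.md" "content": "hello"}',
}


def parse_error(raw: str) -> Optional[str]:
    """Return None if raw is valid JSON, otherwise the parser's error message."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    for name, raw in SAMPLES.items():
        err = parse_error(raw)
        print(f"{name}: {'ok' if err is None else 'FAILED: ' + err}")
```

Only the `valid` sample parses. In the double-escaped case the `\\` is read as a literal backslash, so the following quote closes the string early and the rest of the value becomes a syntax error.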

Impact

This makes the agent unreliable in long sessions when using glm-5.1:cloud. The agent may:

  • Fail silently (tool call dropped)
  • Retry the same malformed call repeatedly
  • Eventually hit max retries and give up

Root Cause

This appears to be a model-level issue with GLM-5.1's structured output generation degrading in long contexts, not a Hermes parsing bug. The same model works fine for short-context tool calls.

Suggested Mitigations in Hermes

  1. JSON repair on tool call parse failure: Attempt to fix common JSON malformations (truncation, missing braces) before failing
  2. Model-specific context length limits: Enforce a lower effective context limit for models known to degrade, auto-compacting earlier
  3. Fallback routing: When a tool call parse fails N times, retry with a different model or with a simplified prompt
  4. Validation layer: Validate tool call JSON against the expected schema before dispatching
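
Mitigations 1 and 4 could be combined along the following lines. This is a rough sketch, not Hermes code: `repair_json`, `parse_tool_args`, and the `required` mapping are hypothetical names, and the repair covers only two of the observed malformations (raw control characters inside strings, and truncation that leaves strings, braces, or brackets unclosed). Missing commas and double-escaped quotes are not handled and still result in a rejected call.

```python
import json
from typing import Any, Optional


def repair_json(text: str) -> str:
    """Best-effort repair: escape raw control characters inside strings and
    close any string, object, or array left open by truncation."""
    out: list[str] = []
    closers: list[str] = []  # pending "}" / "]" for every open "{" / "["
    in_string = False
    escaped = False
    control = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            elif ch in control:
                out.append(control[ch])
                continue
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers and closers[-1] == ch:
            closers.pop()
        out.append(ch)
    if escaped:
        out.pop()  # drop a dangling backslash at the cut-off point
    if in_string:
        out.append('"')
    out.extend(reversed(closers))
    return "".join(out)


def parse_tool_args(raw: str, required: dict[str, type]) -> Optional[dict[str, Any]]:
    """Parse strictly, then retry on the repaired text; accept the result only
    if it is an object containing every required key with the expected type."""
    for candidate in (raw, repair_json(raw)):
        try:
            args = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(args, dict) and all(
            isinstance(args.get(key), typ) for key, typ in required.items()
        ):
            return args
        return None
    return None
```

One caveat with this approach: closing a truncated string makes the JSON valid but does not restore the lost content, so a repaired call such as put_page could write a partial page. A caller may prefer to flag repaired arguments and ask the model to confirm or resend rather than dispatch them silently.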

Workaround

  1. Use shorter sessions / clear context more frequently
  2. For complex tasks, use delegate_task with a model that handles long contexts better (e.g., Claude Code ACP)
  3. Break large tool call arguments into smaller chunks
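
For workaround 3, one way to keep each tool call argument small is to split long content at paragraph boundaries before sending it. The sketch below shows only the splitting step. `split_into_chunks` and the 2000-character default are illustrative assumptions, and whether a given tool (for example put_page) can accept content in several calls depends on that tool.

```python
def split_into_chunks(text: str, max_chars: int = 2000) -> list[str]:
    """Split text into chunks of at most max_chars characters, preferring
    blank-line paragraph boundaries and hard-splitting oversized paragraphs."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        candidate = para if not current else current + "\n\n" + para
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single paragraph longer than the limit is cut at fixed offsets.
        while len(para) > max_chars:
            chunks.append(para[:max_chars])
            para = para[max_chars:]
        current = para
    if current:
        chunks.append(current)
    return chunks
```

Smaller arguments reduce the amount of text the model has to escape correctly in a single call, though they do not address degradation caused by the overall context length.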

Environment

  • Hermes Agent v0.10.0
  • Model: glm-5.1:cloud (via Ollama cloud)
  • Transport: Ollama ACP
  • OS: macOS

Metadata

Assignees

No one assigned

Labels

  • P2 (Medium: degraded but workaround exists)
  • provider/ollama (Ollama / local models)
  • sweeper:implemented-on-main (Sweeper: behavior already present on current main)
  • type/bug (Something isn't working)
