Skip to content

[Bug]: NVIDIA Build API modle z-ai/glm4.7 returns model hit max output tokens #9372

@an-gith

Description

@an-gith

Bug Description

When usingNVIDIA Build provider with model z-ai/glm4.7 , the API returns model hit max output tokens error, The model ID works on OpenClaw, but Hermes cannot callit successfully.

Steps to Reproduce

1.Run hermes setup and configure custom with NVIDIA Build API key

2.Select model z-ai/glm4.7

3.Try to chat: hernes chat or via gateway

4.Error: ⚠️ Response truncated (finish_reason='length') - model hit max output tokens
⚠️ Response truncated (finish_reason='length') - model hit max output tokens
⚠️ Response truncated (finish_reason='length') - model hit max output tokens
─ ⚕ Hermes ─────────────────────────────────────────────────────────────

 Error: Response remained truncated after 3 continuation attempts   

5.return to Ollama(cloud) model minimax-m2.7 ,changing nothing,it works

Expected Behavior

⚠️ Response truncated (finish_reason='length') - model hit max output tokens
⚠️ Response truncated (finish_reason='length') - model hit max output tokens
⚠️ Response truncated (finish_reason='length') - model hit max output tokens
─ ⚕ Hermes ─────────────────────────────────────────────────────────────

 Error: Response remained truncated after 3 continuation attempts   

Actual Behavior

Configuration (configyaml, env, hermes setup)

Affected Component

Setup / Installation

Messaging Platform (if gateway-related)

No response

Operating System

docker in synology(DSM 7.2.2-72806) docker

Python Version

3.13.5

Hermes Version

Hermes Agent v0.9.0 (v2026.4.13)

Relevant Logs / Traceback

⚠️  Response truncated (finish_reason='length') - model hit max output tokens
 ⚠️  Response truncated (finish_reason='length') - model hit max output tokens
 ⚠️  Response truncated (finish_reason='length') - model hit max output tokens
   ─  ⚕ Hermes  ───────────────────────────────────────────────────────────── 
                                                                              
     Error: Response remained truncated after 3 continuation attempts

Root Cause Analysis (optional)

No response

Proposed Fix (optional)

No response

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Metadata

Assignees

No one assigned

    Labels

    sweeper:implemented-on-mainSweeper: behavior already present on current maintype/bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions