fix(prompt_builder): inject tool-use enforcement for GLM models #23702

Closed
ALIYILD wants to merge 1 commit into NousResearch:main from ALIYILD:fix/tool-use-enforcement-glm

ALIYILD (Contributor) commented on May 11, 2026

Summary

Add "glm" to TOOL_USE_ENFORCEMENT_MODELS so GLM-family models receive the same "you must call tools, not describe intentions" system-prompt injection that already protects gpt/codex/gemini/gemma/grok.

Motivation — real-world failure mode

Running a multi-profile DeepSynaps audit team where three engineer profiles (shipbackend, frontend-engineer, backend-engineer) defaulted to z-ai/glm-4.5-flash:free and z-ai/glm-4.5-air:free, I observed seven consecutive kanban tasks fail with the identical dispatcher error:

worker exited cleanly (rc=0) without calling kanban_complete or kanban_block — protocol violation

The pattern was consistent: the model produces its work output as text but never invokes the terminal kanban_complete / kanban_block tool call that the dispatcher requires. The kanban-worker skill (auto-loaded for every dispatched worker) and the KANBAN_GUIDANCE block (auto-injected from prompt_builder.py) both spell out the protocol, but GLM-family models don't reliably comply without the explicit "every response must contain a tool call or final result" enforcement that already exists for other tool-use-weak families.
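To make the failure concrete, here is a minimal sketch of the kind of check that would produce the error above. It is not the dispatcher's actual code: `WorkerRun`, `classify_run`, and the returned status strings are hypothetical names chosen for illustration. Only the two tool names and the rc=0 condition come from the error message.

```python
from dataclasses import dataclass, field

# The two terminal tool calls named in the dispatcher error message.
TERMINAL_TOOLS = frozenset({"kanban_complete", "kanban_block"})


@dataclass
class WorkerRun:
    """Hypothetical record of one worker execution (illustration only)."""
    return_code: int
    tool_calls: list[str] = field(default_factory=list)


def classify_run(run: WorkerRun) -> str:
    """Classify a finished worker run.

    A clean exit (rc=0) is not enough: the worker must also have made one
    of the terminal tool calls, otherwise the run is a protocol violation.
    """
    if run.return_code != 0:
        return "crashed"
    if TERMINAL_TOOLS.intersection(run.tool_calls):
        return "ok"
    return "protocol_violation"
```

Under these assumptions, a worker that only reads files and prints a prose summary exits with rc=0 yet is still classified as a protocol violation, which matches the behavior seen on the GLM profiles.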

Re-running the exact same tasks on anthropic/claude-haiku-4.5 resolved all seven immediately. The tasks themselves were trivially solvable — the failure was protocol compliance, not capability.

Diagnosis

TOOL_USE_ENFORCEMENT_GUIDANCE (lines 254-267) is purpose-built for this failure mode. It is gated by a substring match against TOOL_USE_ENFORCEMENT_MODELS, which today catches GPT, Codex, Gemini, Gemma, and Grok. GLM behaves like those families in this respect: even when the dispatcher injects KANBAN_GUIDANCE for a free GLM worker, the model still describes work in prose without making the closing tool call.

Adding "glm" to the tuple flips the enforcement injection on for any model whose name contains glm (substring match). This covers z-ai/glm-4.5-air, z-ai/glm-4.5-flash, z-ai/glm-4.6, ChatGLM variants, etc.
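The substring gate can be sketched roughly as follows. The tuple is the patched value from this PR. The helper name `needs_tool_use_enforcement` and the lowercasing of the model id are assumptions made for illustration, not a copy of the code in prompt_builder.py.

```python
# Patched tuple from this PR.
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok", "glm")


def needs_tool_use_enforcement(model_id: str) -> bool:
    """Return True if any family marker occurs anywhere in the model id.

    Assumption: the comparison is case-insensitive, so mixed-case ids
    such as "ChatGLM" match the lowercase "glm" marker.
    """
    lowered = model_id.lower()
    return any(marker in lowered for marker in TOOL_USE_ENFORCEMENT_MODELS)


if __name__ == "__main__":
    for model in ("z-ai/glm-4.5-flash:free", "anthropic/claude-haiku-4.5"):
        print(model, needs_tool_use_enforcement(model))
```

Because this is a plain substring test, any model id that happens to contain the letters glm would also be gated. That is the same trade-off the existing markers already make.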

Change

-TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok")
+TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok", "glm")

One line. No behavior change for non-GLM models.

Test plan

  • Verified locally: the same three engineer profiles, re-tiered to Claude Haiku, resolved all seven tasks with zero protocol violations.
  • Verified locally: with this patch applied, the GLM-A/B kimi-consent-engineer profile (testing kimi-k2.6 as a cheaper alternative) gets the enforcement injection if glm appears in its model ID.
  • Maintainer test: run any dispatched kanban task with --model z-ai/glm-4.5-flash:free (or any glm-named model) and confirm the TOOL_USE_ENFORCEMENT_GUIDANCE block appears in the system prompt.

Alternatives considered

  1. Per-profile agent.tool_use_enforcement: true override — works, but requires every operator to discover the bug independently. The hardcoded list exists precisely to default-on enforcement for known-weak families; GLM belongs in it.
  2. Bump GLM to a paid tier — solves it for one operator's workload, but doesn't help anyone else. The free tier is GLM's primary on-ramp.
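The trade-off in alternative 1 can be illustrated with a small sketch of how a per-profile override might interact with the hardcoded list. The resolution order shown here (an explicit override wins, otherwise fall back to the family list) is an assumption, and `enforcement_enabled` and its parameters are hypothetical names. The real handling of agent.tool_use_enforcement may differ.

```python
from typing import Optional, Sequence

# Family lists before and after this PR.
PRE_PATCH = ("gpt", "codex", "gemini", "gemma", "grok")
POST_PATCH = PRE_PATCH + ("glm",)


def enforcement_enabled(
    model_id: str,
    families: Sequence[str],
    override: Optional[bool] = None,
) -> bool:
    """Decide whether to inject the enforcement guidance.

    Assumed precedence: an explicit per-profile override (True/False)
    wins; None means "no override", so the family list decides.
    """
    if override is not None:
        return override
    lowered = model_id.lower()
    return any(marker in lowered for marker in families)


if __name__ == "__main__":
    glm = "z-ai/glm-4.5-air:free"
    print(enforcement_enabled(glm, PRE_PATCH))                 # operator must opt in
    print(enforcement_enabled(glm, PRE_PATCH, override=True))  # alternative 1
    print(enforcement_enabled(glm, POST_PATCH))                # this PR: on by default
```

In this sketch, the override fixes the problem only for operators who know to set it, whereas extending the list changes the default for every GLM-routed profile.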

🤖 Generated with Claude Code

GLM-family models (z-ai/glm-4.5-air, z-ai/glm-4.5-flash, etc.) exhibit
the same "describe-instead-of-call" failure mode that gpt/codex/gemini/
gemma/grok already trigger enforcement for. Without the injection,
free-tier GLM workers spawned by the kanban dispatcher routinely exit
cleanly (rc=0) without invoking kanban_complete or kanban_block,
producing the "protocol violation" error and triggering the dispatcher's
gave_up path.

Observed in real workloads: seven consecutive kanban tasks across three
GLM-tier profiles (shipbackend, frontend-engineer, backend-engineer) all
failed with the identical message:

    worker exited cleanly (rc=0) without calling kanban_complete or
    kanban_block — protocol violation

Re-running the same tasks on Claude Haiku immediately resolved them.
Adding "glm" to TOOL_USE_ENFORCEMENT_MODELS closes the gap so future
GLM-routed work receives the explicit "every response must contain a
tool call or final result" steering that already protects the other
enforcement-gated model families.

One-line change; no behavior change for non-GLM models.
alt-glitch added the type/bug, P3, comp/agent, and provider/zai labels on May 11, 2026
teknium1 (Contributor) commented:

Merged via PR #24715 (cherry-picked onto current main with your authorship preserved). Thanks for the contribution!

teknium1 closed this on May 13, 2026