fix(prompt_builder): inject tool-use enforcement for GLM models #23702
Closed
ALIYILD wants to merge 1 commit into
GLM-family models (z-ai/glm-4.5-air, z-ai/glm-4.5-flash, etc.) exhibit
the same "describe-instead-of-call" failure mode that gpt/codex/gemini/
gemma/grok already trigger enforcement for. Without the injection,
free-tier GLM workers spawned by the kanban dispatcher routinely exit
cleanly (rc=0) without invoking kanban_complete or kanban_block,
producing the "protocol violation" error and triggering the dispatcher's
gave_up path.
Observed in real workloads: seven consecutive kanban tasks across three
GLM-tier profiles (shipbackend, frontend-engineer, backend-engineer) all
failed with the identical message:
worker exited cleanly (rc=0) without calling kanban_complete or
kanban_block — protocol violation
Re-running the same tasks on Claude Haiku immediately resolved them.
Adding "glm" to TOOL_USE_ENFORCEMENT_MODELS closes the gap so future
GLM-routed work receives the explicit "every response must contain a
tool call or final result" steering that already protects the other
enforcement-gated model families.
One-line change; no behavior change for non-GLM models.
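The failure described in this commit message can be pictured with a small sketch of the dispatcher-side check. Only the tool names kanban_complete and kanban_block and the rc=0 condition come from the text above; WorkerRun, classify_run, and the returned labels are hypothetical names used for illustration, not the dispatcher's actual code.

```python
from dataclasses import dataclass, field

# Terminal tool names are taken from the commit message; the rest is assumed.
TERMINAL_TOOLS = frozenset({"kanban_complete", "kanban_block"})


@dataclass
class WorkerRun:
    """Hypothetical record of one finished worker process."""
    return_code: int
    tool_calls: list[str] = field(default_factory=list)


def classify_run(run: WorkerRun) -> str:
    """Label a finished run as 'ok', 'crashed', or 'protocol_violation'."""
    if run.return_code != 0:
        return "crashed"
    # A clean exit only counts if the worker made a terminal tool call.
    if TERMINAL_TOOLS.intersection(run.tool_calls):
        return "ok"
    return "protocol_violation"
```

Under these assumptions, a worker that writes its result as prose and exits with rc=0 is labelled a protocol violation, while one that ends with kanban_complete or kanban_block is accepted.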
Contributor
Merged via PR #24715 (cherry-picked onto current main with your authorship preserved). Thanks for the contribution!
Summary
Add "glm" to TOOL_USE_ENFORCEMENT_MODELS so GLM-family models receive the same "you must call tools, not describe intentions" system-prompt injection that already protects gpt/codex/gemini/gemma/grok.
Motivation: real-world failure mode
Running a multi-profile DeepSynaps audit team where three engineer profiles (shipbackend, frontend-engineer, backend-engineer) defaulted to z-ai/glm-4.5-flash:free and z-ai/glm-4.5-air:free, I observed seven consecutive kanban tasks fail with the identical dispatcher error quoted in the commit message above (worker exited cleanly with rc=0 without calling kanban_complete or kanban_block).
The pattern was unmistakable: the model produces work output as text but never invokes the terminal kanban_complete/kanban_block tool call that the dispatcher requires. The kanban-worker skill (auto-loaded for every dispatched worker) and the KANBAN_GUIDANCE block (auto-injected from prompt_builder.py) both spell out the protocol, but GLM-family models don't reliably comply without the same explicit "every response must contain a tool call or final result" enforcement that already exists for other tool-use-weak families.
Re-running the exact same tasks on anthropic/claude-haiku-4.5 resolved all seven immediately. The tasks themselves were trivially solvable; the failure was protocol compliance, not capability.
Diagnosis
TOOL_USE_ENFORCEMENT_GUIDANCE (lines 254-267) is purpose-built for this failure mode. It's gated by substring match against TOOL_USE_ENFORCEMENT_MODELS, which today catches GPT, Codex, Gemini, Gemma, and Grok. GLM behaves like those families in this respect: when the dispatcher injects KANBAN_GUIDANCE for a free GLM worker, the model still describes work in prose without making the closing tool call.
Adding "glm" to the tuple turns the enforcement injection on for any model whose name contains glm (substring match). This covers z-ai/glm-4.5-air, z-ai/glm-4.5-flash, z-ai/glm-4.6, ChatGLM variants, etc.
Change
One line. No behavior change for non-GLM models.
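The diff itself did not survive on this page, so what follows is only a sketch of the substring gating described under Diagnosis. The tuple contents are inferred from the families named in this PR; the helper name needs_tool_use_enforcement and the lowercasing step are assumptions, and the real definition in prompt_builder.py may differ.

```python
# Assumed shape of the tuple after the change; order and exact entries in
# prompt_builder.py may differ.
TOOL_USE_ENFORCEMENT_MODELS = ("gpt", "codex", "gemini", "gemma", "grok", "glm")


def needs_tool_use_enforcement(model_id: str) -> bool:
    """Hypothetical helper: substring match against the enforcement tuple."""
    name = model_id.lower()  # case normalization is an assumption
    return any(marker in name for marker in TOOL_USE_ENFORCEMENT_MODELS)
```

With this sketch, needs_tool_use_enforcement("z-ai/glm-4.5-flash:free") is true, while "anthropic/claude-haiku-4.5" and "kimi-k2.6" match no entry and are left unchanged, which is consistent with the "no behavior change for non-GLM models" claim.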
Test plan
kimi-consent-engineer profile (testing kimi-k2.6 as cheaper alternative) gets enforcement injection if glm appears in its model id.
--model z-ai/glm-4.5-flash:free (or any glm-named model) and confirm the TOOL_USE_ENFORCEMENT_GUIDANCE block appears in the system prompt.
Alternatives considered
agent.tool_use_enforcement: true override: works, but requires every operator to discover the bug independently. The hardcoded list exists precisely to default-on enforcement for known-weak families; GLM belongs in it.
🤖 Generated with Claude Code