feat: add grok to TOOL_USE_ENFORCEMENT_MODELS for direct xAI usage#5595
Merged
Grok models (x-ai/grok-4.20-beta, grok-code-fast-1) now receive tool-use enforcement guidance, steering them to actually call tools instead of describing intended actions. Matches both OpenRouter (x-ai/grok-*) and direct xAI API usage.
Julientalbot pushed a commit to Julientalbot/hermes-agent that referenced this pull request on Apr 10, 2026
Grok reasoning models have a failure mode where they describe planned
actions in text ("I will check X", "Je vais lancer Y") without
actually calling the corresponding tools. The existing
TOOL_USE_ENFORCEMENT_GUIDANCE mitigates the "action reflex" trait
(NousResearch#5595) but doesn't address the narration-vs-execution split that is
specific to reasoning architectures.
Add GROK_EXECUTION_GUIDANCE — a targeted system prompt block injected
alongside TOOL_USE_ENFORCEMENT_GUIDANCE when the model name contains
"grok". Three XML-tagged sections:
- <no_intent_phrases>: explicit list of forbidden phrases in English
and French ("I will...", "Let me...", "Je vais...", etc.) with
the rule: if you need to act, call the tool now; do not narrate
the intent.
- <execute_first>: mandate that the first response to any work-implying
request contain a tool call, not a plan. Chain multiple tool calls
in the same turn without intermediate prose.
- <no_analysis_hallucination>: forbid structured analyses, diagnosis
lists, or recommendations produced from pure reasoning without tool
calls to verify the claims.
Injected in run_agent.py next to the existing provider-specific guidance
blocks (OPENAI_MODEL_EXECUTION_GUIDANCE, GOOGLE_MODEL_OPERATIONAL_GUIDANCE).
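A minimal sketch of the injection described above, under stated assumptions. Only the constant names and the three XML tag names come from the commit message; the guidance wording, the `build_system_prompt` helper, its signature, and the case-insensitive match are hypothetical stand-ins, not the code in run_agent.py.

```python
# Hypothetical stand-in text: the real guidance strings live in the
# repository and are not reproduced here.
TOOL_USE_ENFORCEMENT_GUIDANCE = "Call tools to act; do not only describe actions."

GROK_EXECUTION_GUIDANCE = """\
<no_intent_phrases>
Do not write "I will...", "Let me...", or "Je vais...".
If you need to act, call the tool now; do not narrate the intent.
</no_intent_phrases>
<execute_first>
The first response to a work-implying request must contain a tool call, not a plan.
</execute_first>
<no_analysis_hallucination>
Do not produce analyses or recommendations without tool calls that verify the claims.
</no_analysis_hallucination>
"""


def build_system_prompt(base_prompt: str, model_name: str) -> str:
    """Assemble a system prompt (hypothetical helper).

    Simplification: the general guidance is always appended here. The Grok
    block is appended only when the model name contains "grok"; lowercasing
    the name first is an assumption of this sketch.
    """
    parts = [base_prompt, TOOL_USE_ENFORCEMENT_GUIDANCE]
    if "grok" in model_name.lower():
        parts.append(GROK_EXECUTION_GUIDANCE)
    return "\n\n".join(parts)
```

With this sketch, `build_system_prompt("You are an agent.", "grok-code-fast-1")` ends with the three tagged sections, while a model name without "grok" receives only the general guidance.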
Tests (6 new in TestGrokExecutionGuidance):
- Verifies XML tag structure
- Asserts intent-phrase examples are present in both English and French
- Asserts the execute-first mandate is documented
- Asserts the no-analysis-hallucination rule is present
- Size and type checks
124 passed, 1 skipped in tests/agent/test_prompt_builder.py (no regression).
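The checks listed above might look roughly like the following sketch. The class name suffix, the stand-in guidance text, and the size bound are assumptions made for illustration; the actual TestGrokExecutionGuidance tests are not shown in this commit message.

```python
# Hypothetical stand-in for the real constant, kept short for illustration.
GROK_EXECUTION_GUIDANCE = (
    "<no_intent_phrases>\n"
    'Never write "I will...", "Let me...", or "Je vais...". Call the tool now.\n'
    "</no_intent_phrases>\n"
    "<execute_first>\n"
    "The first response must contain a tool call, not a plan.\n"
    "</execute_first>\n"
    "<no_analysis_hallucination>\n"
    "No analyses or recommendations without verifying tool calls.\n"
    "</no_analysis_hallucination>\n"
)

EXPECTED_TAGS = ("no_intent_phrases", "execute_first", "no_analysis_hallucination")


class TestGrokExecutionGuidanceSketch:
    def test_xml_tags_are_paired(self):
        # Each section must open and close exactly once.
        for tag in EXPECTED_TAGS:
            assert GROK_EXECUTION_GUIDANCE.count(f"<{tag}>") == 1
            assert GROK_EXECUTION_GUIDANCE.count(f"</{tag}>") == 1

    def test_intent_phrases_in_both_languages(self):
        assert "I will" in GROK_EXECUTION_GUIDANCE  # English example
        assert "Je vais" in GROK_EXECUTION_GUIDANCE  # French example

    def test_type_and_size(self):
        assert isinstance(GROK_EXECUTION_GUIDANCE, str)
        # The upper bound is an arbitrary value chosen for this sketch.
        assert 0 < len(GROK_EXECUTION_GUIDANCE) < 4000
```

The class uses plain assertions, so pytest would collect it in the same way as the existing tests in tests/agent/test_prompt_builder.py.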
NOT YET PUSHED as a PR. To be dogfooded on the author's production
instance on xAI before upstream submission, given the precedent of
'behavioral' patches being classified as prostheses in prior work.
Tommyeds pushed a commit to Tommyeds/hermes-agent that referenced this pull request on Apr 12, 2026
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request on Apr 27, 2026
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request on May 14, 2026
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request on May 16, 2026
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request on Jun 2, 2026
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request on Jun 10, 2026
Summary
Adds "grok" to the TOOL_USE_ENFORCEMENT_MODELS tuple so Grok models receive tool-use enforcement guidance in the system prompt.
Closes #5531
What changed
- agent/prompt_builder.py: Added "grok" to TOOL_USE_ENFORCEMENT_MODELS
- tests/agent/test_prompt_builder.py: Added assertion test for grok inclusion
Why
Grok models (x-ai/grok-4.20-beta, grok-code-fast-1) accessed via OpenRouter or direct xAI API were not getting the tool-use enforcement guidance that steers models to actually call tools instead of describing intended actions. The substring match on "grok" covers both routing paths.
Test plan
python -m pytest tests/agent/test_prompt_builder.py -n0 -q (119 passed)
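The substring match described under Why can be illustrated with a short sketch. Only the tuple name and the "grok" entry are confirmed by this PR; the placeholder entry, the `needs_tool_use_enforcement` helper, and the lowercasing step are hypothetical and stand in for whatever agent/prompt_builder.py actually does.

```python
# Only "grok" is confirmed by this PR; "example-model" is a placeholder
# for whatever other entries the real tuple contains.
TOOL_USE_ENFORCEMENT_MODELS = ("example-model", "grok")


def needs_tool_use_enforcement(model_name: str) -> bool:
    """Return True if any configured fragment occurs in the model name.

    Hypothetical helper: lowercasing before matching is an assumption.
    """
    name = model_name.lower()
    return any(fragment in name for fragment in TOOL_USE_ENFORCEMENT_MODELS)


# Both routing paths contain the substring "grok".
print(needs_tool_use_enforcement("x-ai/grok-4.20-beta"))  # OpenRouter-style name
print(needs_tool_use_enforcement("grok-code-fast-1"))     # direct xAI name
```

Both calls print True because a plain substring check does not depend on the provider prefix, which is why a single "grok" entry covers OpenRouter and direct xAI usage.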