fix(agent): add qwen and deepseek to TOOL_USE_ENFORCEMENT_MODELS (#28195) #28348

Merged: teknium1 merged 1 commit into main from hermes/hermes-3ad7d98a on May 19, 2026

@teknium1

Contributor

Salvage of #28195 by @briandevans (squashed 2-commit stack: fix + test improvement). Supersedes already-closed #28081.

What: Without enforcement steering, Qwen3.x and DeepSeek-V3.x default to chatty, hallucinatory tool use: agents narrate "calling tool X" without actually emitting a tool call, or run only partial loops.

How: Add qwen and deepseek substrings to TOOL_USE_ENFORCEMENT_MODELS in agent/prompt_builder.py. Two new unit tests in test_prompt_builder.py and two integration tests in test_run_agent.py verify auto-mode injection for each model family.
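
For readers unfamiliar with the mechanism, here is a minimal sketch of substring-based matching against TOOL_USE_ENFORCEMENT_MODELS. The tuple shape, the helper name `needs_tool_use_enforcement`, and the case-insensitive comparison are assumptions for illustration; the actual code in agent/prompt_builder.py may be structured differently.

```python
# Hypothetical sketch; the real definitions live in agent/prompt_builder.py
# and may differ in shape and naming.
TOOL_USE_ENFORCEMENT_MODELS = (
    "gpt", "codex", "gemini", "gemma", "grok", "glm",  # entries named in this PR as pre-existing
    "qwen", "deepseek",  # entries added by this PR
)


def needs_tool_use_enforcement(model_name: str) -> bool:
    """Return True if any enforcement substring occurs in the model name."""
    lowered = model_name.lower()  # assumption: matching is case-insensitive
    return any(marker in lowered for marker in TOOL_USE_ENFORCEMENT_MODELS)
```

Under these assumptions, `needs_tool_use_enforcement("qwen-plus")` is true and a name containing none of the substrings is false. Because the match is a plain substring test, any identifier that merely contains one of the markers would also match.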

Original PR: #28195
Fixes #28079.

Qwen3.x and DeepSeek-V3.x default to chatty/hallucinatory tool use without
enforcement steering — agents narrate "calling tool X" without actually
emitting a tool call, or run partial loops. Both model families fit the
same failure pattern TOOL_USE_ENFORCEMENT_GUIDANCE was already injected
for (gpt, codex, gemini, gemma, grok, glm).

Co-authored-by: briandevans <252620095+briandevans@users.noreply.github.com>

Squashed salvage of:
- 403e567 fix(agent): add qwen and deepseek to TOOL_USE_ENFORCEMENT_MODELS
- 9433eab test(agent): use realistic qwen-plus identifier in enforcement test

Fixes #28079.
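
A rough sketch of what an auto-mode injection test along these lines might look like. Everything here is a stand-in: `build_system_prompt`, the placeholder guidance string, and the `deepseek-chat` identifier are hypothetical and only illustrate the shape of the check; `qwen-plus` is the identifier named in the second commit. The real tests are in test_prompt_builder.py and test_run_agent.py.

```python
# Hypothetical stand-ins; the real constants and builder live in
# agent/prompt_builder.py and are not reproduced here.
TOOL_USE_ENFORCEMENT_MODELS = ("qwen", "deepseek")  # only the two new entries
TOOL_USE_ENFORCEMENT_GUIDANCE = "[tool-use enforcement guidance placeholder]"


def build_system_prompt(base_prompt: str, model_name: str) -> str:
    """Auto mode: append the guidance only for matching model names."""
    lowered = model_name.lower()  # assumption: case-insensitive match
    if any(marker in lowered for marker in TOOL_USE_ENFORCEMENT_MODELS):
        return f"{base_prompt}\n\n{TOOL_USE_ENFORCEMENT_GUIDANCE}"
    return base_prompt


def test_auto_mode_injects_guidance_for_new_families() -> None:
    # One realistic-looking identifier per newly added family.
    for model_name in ("qwen-plus", "deepseek-chat"):
        prompt = build_system_prompt("You are an agent.", model_name)
        assert TOOL_USE_ENFORCEMENT_GUIDANCE in prompt


def test_auto_mode_skips_unlisted_models() -> None:
    # A name with no listed substring should leave the prompt untouched.
    prompt = build_system_prompt("You are an agent.", "some-other-model")
    assert prompt == "You are an agent."
```

Using a production-style identifier such as `qwen-plus` rather than a bare family name checks that the substring match works on the names users actually configure.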
@teknium1 teknium1 merged commit 7569007 into main May 19, 2026
4 checks passed
@teknium1 teknium1 deleted the hermes/hermes-3ad7d98a branch May 19, 2026 03:06
@github-actions

Contributor

🔎 Lint report: hermes/hermes-3ad7d98a vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8790 on HEAD, 8790 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4626 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

Development

Successfully merging this pull request may close these issues.

[Bug/Regression] tool_use_enforcement auto-mode excludes Qwen/DeepSeek causing hallucination

2 participants