Skip to content

feat: make tool-use enforcement configurable (agent.tool_use_enforcement)#3551

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-f5a4cd91
Mar 28, 2026
Merged

feat: make tool-use enforcement configurable (agent.tool_use_enforcement)#3551
teknium1 merged 1 commit into
mainfrom
hermes/hermes-f5a4cd91

Conversation

@teknium1

Copy link
Copy Markdown
Contributor

Summary

Makes the tool-use enforcement prompt injection (from #3528) configurable via config.yaml instead of being hardcoded to GPT/Codex models only.

New config key: agent.tool_use_enforcement

Value Behavior
"auto" (default) Matches gpt/codex model names — existing behavior
true Inject for all models
false Never inject
["gpt", "deepseek", "qwen"] Custom list of model-name substrings

Example config.yaml

agent:
  tool_use_enforcement: true  # force for all models
  # or: ["gpt", "codex", "deepseek", "gemini"]  # specific models

Handles YAML bool coercion (true/false/on/off/yes/no all work). No config version bump needed — deep merge provides the default for existing installs.

Changes

  • hermes_cli/config.py: Add tool_use_enforcement: "auto" to DEFAULT_CONFIG["agent"]
  • run_agent.py: Read config in __init__, use it in _build_system_prompt() instead of hardcoded model check
  • tests/test_run_agent.py: 12 new tests covering all config modes (auto/true/false/list/case-insensitive/no-tools)

…cement

The TOOL_USE_ENFORCEMENT_GUIDANCE injection (added in #3528) was
hardcoded to only match gpt/codex model names. This makes it a
config option so users can turn it on for any model family.

New config key: agent.tool_use_enforcement
  - "auto" (default): matches gpt/codex (existing behavior)
  - true: inject for all models
  - false: never inject
  - list of strings: custom model-name substrings to match
    e.g. ["gpt", "codex", "deepseek", "qwen"]

No version bump needed — deep merge provides the default
automatically for existing installs.

12 new tests covering all config modes.
@github-actions

Copy link
Copy Markdown
Contributor

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Install hook files modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

@teknium1 teknium1 merged commit 901494d into main Mar 28, 2026
4 checks passed
teknium1 added a commit that referenced this pull request Mar 28, 2026
… pages

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
teknium1 added a commit that referenced this pull request Mar 28, 2026
… pages (#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from #3551/#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from #3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  #3572/#3573/#3576/#3580/#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (#3583).
- skills.md: List default GitHub taps including garrytan/gstack (#3605).
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
…cement (NousResearch#3551)

The TOOL_USE_ENFORCEMENT_GUIDANCE injection (added in NousResearch#3528) was
hardcoded to only match gpt/codex model names. This makes it a
config option so users can turn it on for any model family.

New config key: agent.tool_use_enforcement
  - "auto" (default): matches gpt/codex (existing behavior)
  - true: inject for all models
  - false: never inject
  - list of strings: custom model-name substrings to match
    e.g. ["gpt", "codex", "deepseek", "qwen"]

No version bump needed — deep merge provides the default
automatically for existing installs.

12 new tests covering all config modes.
angelburgosrosado pushed a commit to angelburgosrosado/hermes-agent that referenced this pull request Apr 27, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
…cement (NousResearch#3551)

The TOOL_USE_ENFORCEMENT_GUIDANCE injection (added in NousResearch#3528) was
hardcoded to only match gpt/codex model names. This makes it a
config option so users can turn it on for any model family.

New config key: agent.tool_use_enforcement
  - "auto" (default): matches gpt/codex (existing behavior)
  - true: inject for all models
  - false: never inject
  - list of strings: custom model-name substrings to match
    e.g. ["gpt", "codex", "deepseek", "qwen"]

No version bump needed — deep merge provides the default
automatically for existing installs.

12 new tests covering all config modes.
02356abc pushed a commit to 02356abc/hermes-agent that referenced this pull request May 14, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
…cement (NousResearch#3551)

The TOOL_USE_ENFORCEMENT_GUIDANCE injection (added in NousResearch#3528) was
hardcoded to only match gpt/codex model names. This makes it a
config option so users can turn it on for any model family.

New config key: agent.tool_use_enforcement
  - "auto" (default): matches gpt/codex (existing behavior)
  - true: inject for all models
  - false: never inject
  - list of strings: custom model-name substrings to match
    e.g. ["gpt", "codex", "deepseek", "qwen"]

No version bump needed — deep merge provides the default
automatically for existing installs.

12 new tests covering all config modes.
olympus-terminal pushed a commit to olympus-terminal/hermes-agent that referenced this pull request May 16, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…cement (NousResearch#3551)

The TOOL_USE_ENFORCEMENT_GUIDANCE injection (added in NousResearch#3528) was
hardcoded to only match gpt/codex model names. This makes it a
config option so users can turn it on for any model family.

New config key: agent.tool_use_enforcement
  - "auto" (default): matches gpt/codex (existing behavior)
  - true: inject for all models
  - false: never inject
  - list of strings: custom model-name substrings to match
    e.g. ["gpt", "codex", "deepseek", "qwen"]

No version bump needed — deep merge provides the default
automatically for existing installs.

12 new tests covering all config modes.
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
…cement (NousResearch#3551)

The TOOL_USE_ENFORCEMENT_GUIDANCE injection (added in NousResearch#3528) was
hardcoded to only match gpt/codex model names. This makes it a
config option so users can turn it on for any model family.

New config key: agent.tool_use_enforcement
  - "auto" (default): matches gpt/codex (existing behavior)
  - true: inject for all models
  - false: never inject
  - list of strings: custom model-name substrings to match
    e.g. ["gpt", "codex", "deepseek", "qwen"]

No version bump needed — deep merge provides the default
automatically for existing installs.

12 new tests covering all config modes.
Egavasyug pushed a commit to Egavasyug/hermes-agent that referenced this pull request Jun 10, 2026
… pages (NousResearch#3618)

Fixes found by auditing docs against recent PRs/commits:

Critical (misleading):
- hooks.md: Remove stale 'planned — not yet wired' markers for 4 hooks
  that are now active (NousResearch#3542). Add correct callback signatures.
- security.md: Update tirith verdict behavior — block verdicts now go
  through approval flow instead of hard-blocking (NousResearch#3428). Add pkill/killall
  self-termination guard and gateway-run backgrounding patterns (NousResearch#3593).

New feature docs:
- configuration.md: Add tool_use_enforcement section with value table
  (auto/true/false/list) from NousResearch#3551/NousResearch#3528.
- configuration.md: Expand auxiliary config with per-task timeouts
  (compression 120s, web_extract 30s, approval 30s) from NousResearch#3597.
- api-server.md: Add /v1/health alias, Security Headers section,
  CORS details (Max-Age, SSE headers, Idempotency-Key) from
  NousResearch#3572/NousResearch#3573/NousResearch#3576/NousResearch#3580/NousResearch#3530.

Stale/incomplete:
- configuration.md: Fix Alibaba model name qwen-plus -> qwen3.5-plus (NousResearch#3484).
- environment-variables.md: Specify actual DashScope default URL.
- cli-commands.md: Add alibaba to --provider list.
- fallback-providers.md: Add Alibaba/DashScope to provider table.
- email.md: Document noreply/automated sender filtering (NousResearch#3606).
- toolsets-reference.md: Add 4 missing platform toolsets — matrix,
  mattermost, dingtalk, api-server (NousResearch#3583).
- skills.md: List default GitHub taps including garrytan/gstack (NousResearch#3605).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant