fix(skills-guard): let agent-created skills bypass dangerous-verdict confirmation by teknium1 · Pull Request #14538 · NousResearch/hermes-agent

teknium1 · 2026-04-23T12:18:35Z

Summary

Agent-authored skills (via skill_manage) no longer trigger the dangerous-verdict confirmation gate. The gate was meant for external skills pulled from GitHub via hermes skills install, where trust-level + verdict policies genuinely protect users. Agent-created skills run in the same process as the agent that wrote them — the agent could already execute the same code via terminal(), so the gate adds friction without meaningful security.

Root cause

INSTALL_POLICY in tools/skills_guard.py mapped agent-created + dangerous → ask, which tools/skill_manager_tool.py::_security_scan_skill treated as a block (returning an error string to the agent).
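The interaction can be sketched as follows. This is a minimal illustration, not the repository's code: the table shape, the `POLICY_BEFORE`/`POLICY_AFTER` names, and the `security_scan_skill` helper are assumptions made for the example, and only the one policy cell named in this PR is filled in.

```python
from typing import Dict, Optional, Tuple

# Hypothetical policy tables keyed by (trust_level, verdict). The real
# INSTALL_POLICY in tools/skills_guard.py may be shaped differently.
Policy = Dict[Tuple[str, str], str]

POLICY_BEFORE: Policy = {("agent-created", "dangerous"): "ask"}
POLICY_AFTER: Policy = {("agent-created", "dangerous"): "allow"}


def security_scan_skill(policy: Policy, trust_level: str, verdict: str) -> Optional[str]:
    """Return an error string when skill creation is refused, else None.

    Models the behaviour described above: any decision other than "allow",
    including "ask", is surfaced to the agent as an error.
    """
    decision = policy[(trust_level, verdict)]
    if decision == "allow":
        return None
    return f"Security scan refused skill (decision={decision})"


before = security_scan_skill(POLICY_BEFORE, "agent-created", "dangerous")
after = security_scan_skill(POLICY_AFTER, "agent-created", "dangerous")
```

Under these assumptions, `before` is an error string and `after` is `None`, which is the whole effect of the one-cell policy change.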

Concrete trigger

While writing a PR-review skill that described cache-busting and persistence semantics in prose, the scanner matched those words against its pattern list and blocked skill_manage(action='create'). The skill wasn't doing anything dangerous — it just documented what reviewers should watch for in OTHER code.
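A small sketch shows why a keyword-style scanner produces this kind of false positive. The pattern list and the `scan_text` function are hypothetical stand-ins; the real scanner's patterns and verdict logic are not shown in this PR.

```python
import re

# Hypothetical patterns chosen to mirror the words mentioned above.
PATTERNS = [
    re.compile(r"cache[- ]busting", re.IGNORECASE),
    re.compile(r"persistence", re.IGNORECASE),
]


def scan_text(text: str) -> str:
    """Return "dangerous" if any pattern matches anywhere in the text."""
    hits = [p.pattern for p in PATTERNS if p.search(text)]
    return "dangerous" if hits else "safe"


# Documentation prose that only describes what a reviewer should look for.
skill_doc = (
    "When reviewing a PR, check whether cache-busting headers changed "
    "and whether persistence semantics are preserved."
)
verdict = scan_text(skill_doc)
```

Because matching is purely textual, the scanner cannot distinguish a skill that performs a risky action from one that merely mentions it, so `verdict` comes back as `"dangerous"` for harmless review guidance.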

Changes

tools/skills_guard.py: agent-created dangerous verdict now maps to allow (with explanatory comment)
tests/tools/test_skills_guard.py: renamed test_dangerous_agent_created_asks → test_dangerous_agent_created_allowed; renamed test_force_overrides_dangerous_for_agent_created → test_force_noop_for_agent_created_dangerous and updated its assertion (force is now a no-op for agent-created, since the allow branch returns first)
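Why force becomes moot can be seen in a short sketch of the decision order. The `should_allow` function below is a hypothetical simplification, not the project's API; in particular, how force interacts with a `block` decision is not covered by this PR and is modeled here as a refusal purely for illustration.

```python
def should_allow(decision: str, force: bool = False) -> bool:
    """Hypothetical decision helper; the real function's signature may differ."""
    if decision == "allow":
        # The allow branch returns first, so force is never consulted.
        return True
    if decision == "ask":
        # Before this PR, agent-created + dangerous landed here and
        # force was the only way through.
        return force
    # "block": behaviour under force is an assumption in this sketch.
    return False


# After the change, agent-created + dangerous resolves to "allow".
allowed_without_force = should_allow("allow", force=False)
allowed_with_force = should_allow("allow", force=True)
```

Both calls return `True`, which is the behaviour the renamed force test would be expected to assert.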

What still gets blocked

community source + caution/dangerous verdicts → block
trusted source + dangerous verdict → block
External hub installs are completely unaffected
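The combinations listed above can be written down as a lookup, which makes the scope of the change easy to check. The `decision_after_pr` helper and the string labels are illustrative assumptions; combinations this PR does not mention are deliberately left unspecified rather than guessed.

```python
from typing import Optional

# Cells stated in this PR as still blocked after the change.
STILL_BLOCKED = {
    ("community", "caution"),
    ("community", "dangerous"),
    ("trusted", "dangerous"),
}


def decision_after_pr(trust_level: str, verdict: str) -> Optional[str]:
    """Return the post-change decision, or None if this PR does not state it."""
    if (trust_level, verdict) in STILL_BLOCKED:
        return "block"
    if (trust_level, verdict) == ("agent-created", "dangerous"):
        return "allow"
    return None  # not specified in this PR


summary = {
    pair: decision_after_pr(*pair)
    for pair in sorted(STILL_BLOCKED | {("agent-created", "dangerous")})
}
```

Only the agent-created cell changes; every external-source cell named in the list keeps its `block` decision.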

Validation

tests/tools/test_skills_guard.py — 55/55 passing
Manually verified skill_manage(action='create') now succeeds on a skill with persistence / cache-busting / risky keywords in prose

…firmation

The security scanner is meant to protect against hostile external skills pulled from GitHub via hermes skills install; trusted/community policies block or ask on dangerous verdicts accordingly. But agent-created skills (from skill_manage) run in the same process as the agent that wrote them. The agent can already execute the same code paths via terminal() with no gate, so the ask-on-dangerous policy adds friction without meaningful security.

Concrete trigger: an agent writing a PR-review skill that describes cache-busting or persistence semantics in prose gets blocked because those words appear in the patterns list. The skill isn't actually doing anything dangerous; it's just documenting what reviewers should watch for in other PRs.

Change: agent-created dangerous verdict maps to 'allow' instead of 'ask'. External hub installs (trusted/community) keep their stricter policies intact.

Tests updated: renamed test_dangerous_agent_created_asks → test_dangerous_agent_created_allowed; renamed force-override test and updated assertion since force is now a no-op for agent-created (the allow branch returns first).

teknium1 merged commit e3c0084 into main Apr 23, 2026
10 of 11 checks passed

teknium1 deleted the hermes/hermes-17b001fb branch April 23, 2026 12:18

alt-glitch added labels Apr 23, 2026: type/bug (Something isn't working), P3 (Low: cosmetic, nice to have), comp/tools (Tool registry, model_tools, toolsets), tool/skills (Skills system: list, view, manage)

teknium1 mentioned this pull request Apr 23, 2026

feat(skills-guard): make agent-created scanner opt-in via config.skills.guard_agent_created #14557

Merged

teknium1 merged 1 commit into main from hermes/hermes-17b001fb
