feat: add pentest ops Hermes plugin layer by LiterallyBlah · Pull Request #21845 · NousResearch/hermes-agent

LiterallyBlah · 2026-05-08T12:06:59Z

Summary

Adds a bundled opt-in pentest-ops plugin that keeps Hermes' core loop unchanged while exposing a pentest operating layer.
Registers pentest_ops_status and, when recon_graph_agent.hermes_plugin.tools is importable, forwards Recon Graph backend tools into the recon_graph toolset.
Adds nine namespaced plugin skills for evidence-first pentest orchestration, data-flow modelling, web/API/auth/infrastructure testing, and finding validation.
Documents setup and usage in the guides and built-in plugin docs.
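
The conditional forwarding described above could be sketched roughly as follows. This is not the PR's code: `ToolRegistry`, `load_backend_tools`, and the `TOOLS` attribute are hypothetical names used for illustration, and the real Hermes plugin context API may differ. Only `pentest_ops_status`, the `recon_graph_agent.hermes_plugin.tools` module path, and the "forward when importable" behaviour come from the description.

```python
import importlib
from typing import Callable, Dict


class ToolRegistry:
    """Hypothetical stand-in for the plugin context; not the real Hermes API."""

    def __init__(self) -> None:
        self.tools: Dict[str, Callable[..., dict]] = {}

    def register_tool(self, name: str, handler: Callable[..., dict]) -> None:
        self.tools[name] = handler


def load_backend_tools(module_name: str) -> Dict[str, Callable[..., dict]]:
    """Return backend tools, or an empty dict if the module is missing or malformed."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # Any import-time failure is treated as "backend unavailable".
        return {}
    tools = getattr(module, "TOOLS", None)  # assumed attribute name
    if not isinstance(tools, dict):
        return {}
    return {n: fn for n, fn in tools.items() if isinstance(n, str) and callable(fn)}


def register(
    registry: ToolRegistry,
    module_name: str = "recon_graph_agent.hermes_plugin.tools",
) -> None:
    backend = load_backend_tools(module_name)

    def pentest_ops_status() -> dict:
        # Diagnostic tool: always registered, reports what was forwarded.
        return {"backend_available": bool(backend), "forwarded_tools": sorted(backend)}

    registry.register_tool("pentest_ops_status", pentest_ops_status)
    for name, fn in backend.items():
        registry.register_tool(name, fn)
```

With the backend absent, only the status tool ends up in the registry, so plugin startup does not depend on the external package being installed.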

Safety / design notes

No local target execution is introduced.
Missing or malformed Recon Graph backend metadata falls back to a diagnostic status tool rather than breaking plugin startup.
Findings guidance enforces evidence refs, positive/control proof, and approval_ref for promotion/demotion.
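
As a rough illustration of the findings rule, a promotion check might look like the sketch below. The `Finding` shape, its field names other than `approval_ref`, and the `promotion_blockers` helper are assumptions made for this example; the PR expresses the rule as skill guidance, and nothing here is taken from its code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Finding:
    """Hypothetical finding record; field names are illustrative."""

    title: str
    evidence_refs: List[str] = field(default_factory=list)
    # Assumed meaning: evidence that the issue triggers.
    positive_proof: Optional[str] = None
    # Assumed meaning: a baseline case where the issue does not trigger.
    control_proof: Optional[str] = None
    approval_ref: Optional[str] = None


def promotion_blockers(finding: Finding) -> List[str]:
    """List the reasons a finding cannot be promoted; empty means it may proceed."""
    blockers: List[str] = []
    if not finding.evidence_refs:
        blockers.append("missing evidence_refs")
    if not finding.positive_proof:
        blockers.append("missing positive proof")
    if not finding.control_proof:
        blockers.append("missing control proof")
    if not finding.approval_ref:
        blockers.append("missing approval_ref")
    return blockers
```

Returning the full list of blockers, rather than failing on the first one, lets the operator see everything a finding still needs in a single pass.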

Test plan

uv run --extra dev python -m compileall plugins/pentest-ops tests/plugins/test_pentest_ops_plugin.py -q
uv run --extra dev pytest tests/plugins/test_pentest_ops_plugin.py tests/hermes_cli/test_plugins.py::TestPluginContext::test_register_tool_adds_to_registry tests/test_plugin_skills.py tests/test_toolsets.py tests/test_packaging_metadata.py -q
uv run --extra dev ruff check plugins/pentest-ops tests/plugins/test_pentest_ops_plugin.py
git diff --check -- plugins/pentest-ops tests/plugins/test_pentest_ops_plugin.py website/docs/guides/pentest-ops-layer.md website/docs/user-guide/features/built-in-plugins.md website/sidebars.ts
Smoke: plugin enabled with local recon-graph-agent on PYTHONPATH registers 19 recon_graph tools and 9 plugin skills.

Known unrelated validation notes

A broader tests/plugins ... run surfaced existing/unrelated issues:

tests/plugins/test_kanban_dashboard_plugin.py requires missing fastapi.
tests/plugins/test_achievements_plugin.py::test_evaluate_all_stale_cache_serves_stale_and_refreshes_in_background failed once in stale-cache timing/state.

teknium1 · 2026-05-25T21:39:38Z

Hey @LiterallyBlah — closing this in favor of #32265 (skill-shaped instead of plugin-shaped), and thank you for the work that pushed us to actually scope this out.

Reasoning for the different shape: the existing optional-skills/security/ directory already has the pattern (oss-forensics, sherlock, 1password), and the entire pentest methodology in #400 is methodology + prompts + scripts — all of which a skill expresses more cleanly than a plugin. Plugin tools would add toolset bloat that every conversation pays for; the skill is zero-footprint until the user types a pentest trigger. The recon_graph tool forwarding in your PR also presumes an external recon-graph-agent Python package that isn't a published dependency we ship — that's a real coupling problem for users.

Your phasing intuition was right and survived into #32265 — separate recon, vuln analysis, exploitation, reporting phases with proof-required reporting and approval-ref enforcement on finding promotion. The "evidence refs, positive/control proof" rule from your plugin skills is in references/exploitation-techniques.md and the report template.

If you want to credit-restore: the 9 plugin skills in this PR mention several patterns we could adapt back into the optional-skills version (e.g. an auth-vuln 9-point checklist split into its own reference file). Happy to take a follow-up PR from you against optional-skills/security/web-pentest/references/ if you're interested.

Closes #21845.

Adds optional-skills/security/web-pentest/, an authorized web app penetration testing skill adapted from Shannon's methodology (concepts only; AGPL-clean fresh implementation).

Phased: recon (read-only) → vuln analysis (delegate_task per OWASP class) → proof-based exploitation → report.

Guardrails baked in:
- Authorization gate before first active scan (templates/authorization.md)
- Scope allowlist (scope.txt) consulted by recon-scan.sh and documented as the rule for every active request
- Aux-client leakage warning (compression + title gen replay history; payloads/creds must not enter chat verbatim)
- Bypass-exhaustion discipline before false-positive classification
- L3/L4 (proof-required) for reportable findings; L1/L2 listed as candidates only

Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is cheaper and matches the existing optional-skills/security/ pattern).
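
The scope allowlist guardrail can be pictured with a small sketch. The merged skill uses a shell script (recon-scan.sh) whose contents are not shown here, so this Python version is only an illustration: the one-host-per-line format with `#` comments, the exact-match rule, and the `parse_scope` and `in_scope` helpers are all assumptions.

```python
from typing import Iterable, Set
from urllib.parse import urlsplit


def parse_scope(lines: Iterable[str]) -> Set[str]:
    """Parse allowlist lines, assuming one hostname per line and '#' comments."""
    hosts: Set[str] = set()
    for raw in lines:
        entry = raw.split("#", 1)[0].strip().lower()
        if entry:
            hosts.add(entry)
    return hosts


def in_scope(url: str, allowed_hosts: Set[str]) -> bool:
    """Allow a request only if its hostname exactly matches an allowlist entry."""
    host = (urlsplit(url).hostname or "").lower()
    return host in allowed_hosts


# Example hosts are placeholders, not real targets.
scope = parse_scope(["# engagement scope", "app.example.test", "API.example.test  # staging"])
print(in_scope("https://api.example.test:8443/v1/users", scope))
print(in_scope("https://other.example.test/", scope))
```

Exact hostname matching is the conservative choice for a sketch like this: a subdomain that is not listed is treated as out of scope until someone adds it explicitly.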


feat: add pentest ops Hermes plugin layer

2b30768

alt-glitch added the type/feature (New feature or request), comp/plugins (Plugin system and bundled plugins), and P3 (Low: cosmetic, nice to have) labels on May 11, 2026

teknium1 mentioned this pull request May 25, 2026

feat(skills): add web-pentest optional skill (closes #400) #32265

Merged

teknium1 closed this May 25, 2026

LiterallyBlah deleted the feat/pentest-ops-hermes-layer branch May 25, 2026 22:15

LiterallyBlah wants to merge 1 commit into NousResearch:main from LiterallyBlah:feat/pentest-ops-hermes-layer


3 participants
