feat(skills): add web-pentest optional skill (closes #400) by teknium1 · Pull Request #32265 · NousResearch/hermes-agent

teknium1 · 2026-05-25T21:39:22Z

Summary

New optional-skills/security/web-pentest/ skill for authorized web app penetration testing.

Adapts Shannon's methodology (No Exploit, No Report; slot-type and render-context taxonomy; bypass-exhaustion-before-FP) as a fresh implementation. AGPL-clean — concepts only, no code borrowed.

Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is the right footprint here).

What's in the skill

- SKILL.md: 5 phases (engagement setup → recon → vuln analysis → proof-based exploitation → report) with hard guardrails
- references/scope-enforcement.md: every active request gates on scope.txt; host-extraction rules per tool surface
- references/vuln-taxonomy.md: slot types (SQL-val, CMD-argument, PATH-segment, ...) + XSS render contexts; OWASP Top 10 map
- references/exploitation-techniques.md: per-class witness payloads; intentionally non-destructive defaults
- references/bypass-techniques.md: filter/WAF bypass set per class, consulted before any FP classification
- templates/authorization.md: written engagement authorization (target, basis, scope, constraints)
- templates/pentest-report.md: final report shape (L3/L4 only, L1/L2 listed as candidates)
- templates/exploitation-queue.json: per-class finding queue schema
- scripts/recon-scan.sh: scope-bounded recon wrapper (nmap + whatweb + headers + robots/sitemap)
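
For readers who want a feel for the scope gate described in references/scope-enforcement.md, here is a minimal Python sketch. It is not the code shipped in this PR (recon-scan.sh is a shell script), and the scope.txt format shown (one host per line, `#` comments, optional `*.` wildcard prefix) is an assumption made for illustration. The function names `load_scope`, `extract_host`, and `in_scope` are hypothetical.

```python
from urllib.parse import urlsplit


def load_scope(text: str) -> set[str]:
    """Parse scope file contents. ASSUMED format: one host per line, '#' comments."""
    hosts = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip().lower()
        if entry:
            hosts.add(entry)
    return hosts


def extract_host(target: str) -> str:
    """Return the lowercase hostname from a URL or a bare host[:port]."""
    # urlsplit only fills in .hostname when a netloc is present,
    # so bare hosts get a leading '//' to be parsed as a netloc.
    if "://" not in target:
        target = "//" + target
    host = urlsplit(target).hostname
    if not host:
        raise ValueError(f"cannot extract host from {target!r}")
    return host.rstrip(".")


def in_scope(target: str, scope: set[str]) -> bool:
    """True only if the target's host matches an allowlist entry."""
    host = extract_host(target)
    for entry in scope:
        if entry.startswith("*."):
            # '*.example.com' matches subdomains, not the apex itself.
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False


scope = load_scope("app.example.com\n*.staging.example.com  # wildcard entry\n")
print(in_scope("https://app.example.com/login", scope))
print(in_scope("https://app.example.com@evil.test/", scope))
```

Matching on the parsed hostname rather than on a substring of the URL is the point of the sketch: a URL such as `https://app.example.com@evil.test/` contains an in-scope name but actually targets `evil.test`, so it is refused.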

Hermes-specific guardrails

- Authorization gate: explicit operator ack before any active scanning
- Scope allowlist: scope.txt is the bouncer; the skill teaches the agent to refuse off-scope hosts
- Aux-client leakage: payloads/captured creds redacted in chat history because compression + title-gen replay history through aux client (often the main model)
- Cloud metadata off by default: 169.254.169.254 / metadata.google.internal etc. require explicit opt-in
- Destructive payloads need approval: built on top of approval.py rather than relying on it alone
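
The aux-client leakage guardrail boils down to keeping sensitive strings out of chat history verbatim. A rough sketch of that idea follows. The skill itself teaches this as a rule for the agent rather than shipping a helper, so the `redact` function and the `[REDACTED:label]` placeholder format are hypothetical and only illustrate the behavior.

```python
def redact(text: str, secrets: dict[str, str]) -> str:
    """Replace each sensitive value in text with a labelled placeholder.

    `secrets` maps a short label to the sensitive value (hypothetical shape).
    """
    # Longest values first, so a secret that contains another one
    # is replaced as a whole rather than partially.
    ordered = sorted(secrets.items(), key=lambda kv: len(kv[1]), reverse=True)
    for label, value in ordered:
        if value:  # skip empty strings, which would match everywhere
            text = text.replace(value, f"[REDACTED:{label}]")
    return text


note = "Login succeeded with password hunter2 on /admin"
print(redact(note, {"cred-1": "hunter2"}))
```

Under this approach the raw value would stay in the engagement directory on disk, and only the labelled placeholder would be replayed through compression or title generation.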

Validation

Live-pentested the dashboard with this skill running locally. Skill produced a clean engagement directory + findings; recon-scan.sh enforced scope.txt correctly. Findings to be filed as a separate issue (the dashboard's defenses held up — main gaps are unauthed plugin-asset reads + unrestricted PUT /api/env key namespace, both posture issues not active exploits).

Skill loads via existing OptionalSkillSource (no plumbing changes).

Adds optional-skills/security/web-pentest/, an authorized web app penetration testing skill adapted from Shannon's methodology (concepts only; AGPL-clean fresh implementation).

Phased: recon (read-only) → vuln analysis (delegate_task per OWASP class) → proof-based exploitation → report.

Guardrails baked in:
- Authorization gate before first active scan (templates/authorization.md)
- Scope allowlist (scope.txt) consulted by recon-scan.sh and documented as the rule for every active request
- Aux-client leakage warning (compression + title gen replay history; payloads/creds must not enter chat verbatim)
- Bypass-exhaustion discipline before false-positive classification
- L3/L4 (proof-required) for reportable findings; L1/L2 listed as candidates only

Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is cheaper and matches the existing optional-skills/security/ pattern).
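
The reporting rule in the commit message (L3/L4 reportable, L1/L2 candidates only) can be sketched as a simple partition. This PR does not define what each level means or show the queue schema, so the `Finding` class and its `level` field below are hypothetical stand-ins, not the fields of templates/exploitation-queue.json.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    title: str
    level: int  # ASSUMED: proof level 1..4, higher means stronger proof


def split_for_report(findings: list[Finding]) -> tuple[list[Finding], list[Finding]]:
    """Partition findings into (reportable, candidates) by proof level."""
    reportable = [f for f in findings if f.level >= 3]
    candidates = [f for f in findings if f.level < 3]
    return reportable, candidates


queue = [
    Finding("Reflected XSS in search", level=4),
    Finding("Possible SQLi in sort param", level=2),
    Finding("IDOR on /api/orders", level=3),
]
reportable, candidates = split_for_report(queue)
print([f.title for f in reportable])
print([f.title for f in candidates])
```

Keeping candidates in a separate list, rather than dropping them, matches the stated report shape: unproven items still appear, but not as findings.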

github-actions · 2026-05-25T21:40:05Z

🔎 Lint report: `hermes/hermes-24c9f20a` vs `origin/main`

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9347 on HEAD, 9347 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4946 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

teknium1 mentioned this pull request May 25, 2026

feat: add pentest ops Hermes plugin layer #21845

Closed

teknium1 mentioned this pull request May 25, 2026

Dashboard hardening: auth gate on /dashboard-plugins/ + writable env-key allowlist #32267

Closed

teknium1 merged commit 263e008 into main May 25, 2026
19 checks passed

teknium1 deleted the hermes/hermes-24c9f20a branch May 25, 2026 21:51

alt-glitch added the type/feature (New feature or request), P3 (Low: cosmetic, nice to have), and tool/skills (Skills system: list, view, manage) labels May 25, 2026

r266-tech mentioned this pull request May 25, 2026

docs: register web-pentest skill in optional-skills catalog + sidebar #32278

Open

BrewTestBot mentioned this pull request May 28, 2026

hermes-agent 2026.5.28 Homebrew/homebrew-core#285115

Merged

