feat(skills): add web-pentest optional skill (closes #400) #32265

Merged
teknium1 merged 1 commit into main from hermes/hermes-24c9f20a
May 25, 2026

@teknium1

Contributor

Summary

New optional-skills/security/web-pentest/ skill for authorized web app penetration testing.

Adapts Shannon's methodology (No Exploit, No Report; slot-type and render-context taxonomy; bypass-exhaustion-before-FP) as a fresh implementation. AGPL-clean — concepts only, no code borrowed.

Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is the right footprint here).

What's in the skill

  • SKILL.md — 5 phases (engagement setup → recon → vuln analysis → proof-based exploitation → report) with hard guardrails
  • references/scope-enforcement.md — every active request gates on scope.txt; host-extraction rules per tool surface
  • references/vuln-taxonomy.md — slot types (SQL-val, CMD-argument, PATH-segment, ...) + XSS render contexts; OWASP Top 10 map
  • references/exploitation-techniques.md — per-class witness payloads; intentionally non-destructive defaults
  • references/bypass-techniques.md — filter/WAF bypass set per class, consulted before any FP classification
  • templates/authorization.md — written engagement authorization (target, basis, scope, constraints)
  • templates/pentest-report.md — final report shape (L3/L4 only, L1/L2 listed as candidates)
  • templates/exploitation-queue.json — per-class finding queue schema
  • scripts/recon-scan.sh — scope-bounded recon wrapper (nmap + whatweb + headers + robots/sitemap)
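
The scope gate described for references/scope-enforcement.md could look roughly like the sketch below. It is not the skill's actual code: the scope.txt format (one host per line, `#` comments, a leading `*.` wildcard) and the helper names `load_scope`, `extract_host`, and `in_scope` are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def load_scope(lines):
    """Parse scope.txt-style lines into a set of lowercase host entries.

    Assumed format: one host per line, '#' starts a comment,
    and a leading '*.' matches any subdomain.
    """
    entries = set()
    for raw in lines:
        line = raw.split("#", 1)[0].strip().lower()
        if line:
            entries.add(line)
    return entries


def extract_host(target):
    """Return the hostname of a URL or bare host[:port], or '' if none."""
    if "://" not in target:
        target = "//" + target  # lets urlsplit treat a bare host as a netloc
    try:
        host = urlsplit(target).hostname
    except ValueError:
        return ""
    return (host or "").rstrip(".").lower()


def in_scope(target, scope):
    """True only if the target's host matches an allowlist entry."""
    host = extract_host(target)
    if not host:
        return False
    for entry in scope:
        if entry.startswith("*."):
            # '*.example.test' matches 'a.example.test', not 'example.test'
            if host.endswith(entry[1:]):
                return True
        elif host == entry:
            return True
    return False
```

A caller would run `in_scope(url, scope)` before every active request and refuse when it returns False. Matching on the parsed hostname rather than on the raw string keeps an in-scope name inside a query string or path from passing the check.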

Hermes-specific guardrails

  • Authorization gate: explicit operator ack before any active scanning
  • Scope allowlist: scope.txt is the bouncer; the skill teaches the agent to refuse off-scope hosts
  • Aux-client leakage: payloads and captured creds are redacted in chat history, because compression and title-gen replay that history through the aux client (often the main model)
  • Cloud metadata off by default: 169.254.169.254, metadata.google.internal, etc. require explicit opt-in
  • Destructive payloads need approval: built on top of approval.py rather than relying on it alone
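
As an illustration of the aux-client leakage guardrail, the sketch below replaces known sensitive strings with short hash-tagged placeholders before text is written to chat history. The `redact` helper and the placeholder format are hypothetical; the skill describes the rule, not this implementation.

```python
import hashlib


def redact(text, secrets):
    """Replace each known secret in text with a stable placeholder.

    The placeholder carries the first 8 hex chars of the secret's SHA-256,
    so an operator can correlate it with evidence kept on disk.
    """
    # Longest first, so a secret that contains another is replaced whole.
    for secret in sorted(set(secrets), key=len, reverse=True):
        if not secret:
            continue
        digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()[:8]
        text = text.replace(secret, f"[REDACTED:{digest}]")
    return text
```

For example, `redact("login ok with hunter2", ["hunter2"])` returns the message with the password swapped for a placeholder. This only covers strings the caller already knows are sensitive; it does not detect secrets on its own.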

Validation

Live-pentested the dashboard with this skill running locally. Skill produced a clean engagement directory + findings; recon-scan.sh enforced scope.txt correctly. Findings to be filed as a separate issue (the dashboard's defenses held up — main gaps are unauthed plugin-asset reads + unrestricted PUT /api/env key namespace, both posture issues not active exploits).

Skill loads via existing OptionalSkillSource (no plumbing changes).

Adds optional-skills/security/web-pentest/ — an authorized web app
penetration testing skill adapted from Shannon's methodology (concepts
only; AGPL-clean fresh implementation).

Phased: recon (read-only) → vuln analysis (delegate_task per OWASP
class) → proof-based exploitation → report.

Guardrails baked in:
- Authorization gate before first active scan (templates/authorization.md)
- Scope allowlist (scope.txt) consulted by recon-scan.sh and
  documented as the rule for every active request
- Aux-client leakage warning (compression + title gen replay history;
  payloads/creds must not enter chat verbatim)
- Bypass-exhaustion discipline before false-positive classification
- L3/L4 (proof-required) for reportable findings; L1/L2 listed as
  candidates only

Closes #400. Supersedes #21845 (plugin-shaped proposal; skill-shaped is
cheaper and matches the existing optional-skills/security/ pattern).
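
The L3/L4 reporting rule listed in the guardrails can be pictured as a simple partition over the finding queue. This is a sketch only: the `Finding` fields and the `partition` helper are hypothetical and do not reflect the real templates/exploitation-queue.json schema.

```python
from dataclasses import dataclass

# Per the PR text, only proof-backed levels are reportable.
REPORTABLE_LEVELS = {"L3", "L4"}


@dataclass
class Finding:
    finding_id: str   # hypothetical field names, for illustration only
    vuln_class: str
    level: str        # "L1" through "L4"


def partition(findings):
    """Split findings into (reportable, candidates), preserving order."""
    reportable = [f for f in findings if f.level in REPORTABLE_LEVELS]
    candidates = [f for f in findings if f.level not in REPORTABLE_LEVELS]
    return reportable, candidates
```

The report template would then render the first list as findings and the second as a candidates appendix.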
@github-actions

Contributor

🔎 Lint report: hermes/hermes-24c9f20a vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9347 on HEAD, 9347 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4946 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@teknium1 teknium1 merged commit 263e008 into main May 25, 2026
19 checks passed
@teknium1 teknium1 deleted the hermes/hermes-24c9f20a branch May 25, 2026 21:51
@alt-glitch alt-glitch added labels type/feature (New feature or request), P3 (Low: cosmetic, nice to have), and tool/skills (Skills system: list, view, manage) May 25, 2026
daletkc pushed a commit to daletkc/hermes-agent that referenced this pull request May 25, 2026
bridge25 pushed a commit to bridge25/hermes-agent that referenced this pull request May 27, 2026
mathias3 pushed a commit to mathias3/hermes-agent that referenced this pull request May 28, 2026
Bryce-huang pushed a commit to wbkunlun/hermes-agent that referenced this pull request May 29, 2026
mosaiq-systems pushed a commit to mosaiq-systems/hermes-agent that referenced this pull request May 29, 2026
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
Development

Successfully merging this pull request may close these issues.

Feature: Web Application Penetration Testing Skill — Reconnaissance, Exploitation, and Proof-Based Reporting (inspired by Shannon)