Feature: Code Security Audit Skill — SAST Scanning, Vulnerability Validation, and Automated Patching (inspired by RAPTOR) #382

@teknium1

Overview

RAPTOR (Recursive Autonomous Penetration Testing and Observation Robot) is an MIT-licensed cybersecurity agent framework by Gadi Evron, Daniel Cuthbert, Thomas Dullien (Halvar Flake), and Michael Bargury. It transforms Claude Code into an autonomous offensive/defensive security research agent, integrating Semgrep, CodeQL, AFL++ fuzzing, and LLM-powered analysis into a multi-phase vulnerability discovery and remediation pipeline.

Hermes Agent currently has no offensive cybersecurity capabilities. Its security features are entirely defensive (secret redaction, dangerous command approval, prompt injection detection, container hardening). The closest security-adjacent skill is domain-intel (passive DNS/WHOIS/SSL reconnaissance). There are no skills for code security scanning, vulnerability detection, or automated remediation, which leaves a significant capability gap.

This issue proposes a Code Security Audit skill that adapts RAPTOR's core pipeline — static analysis → exploitability validation → LLM-powered analysis → patch generation — into a Hermes Agent skill. This would give Hermes the ability to scan codebases for security vulnerabilities, validate findings to reduce false positives, explain vulnerabilities in plain language, and generate secure patches.


Research Findings

How RAPTOR's Code Security Pipeline Works

RAPTOR's agentic workflow (raptor_agentic.py, 713 lines) orchestrates a multi-phase pipeline:

Phase 1: Code Scanning (Semgrep + CodeQL in parallel)

  • Runs Semgrep with 17 custom rule files organized by category: crypto (7 rules: weak-hash, weak-cipher, weak-kdf, reused-nonce, insecure-iv, weak-block-modes, weak-asym-keysize), injection (2 rules: command-taint with taint tracking, sql-concat), plus auth, deserialization, filesystem, flows, logging, secrets, sinks categories
  • Uses Semgrep's taint mode for dataflow-aware detection (tracks data from sources like request.GET, flask.request.args, input(), os.environ to sinks like subprocess.run(shell=True), os.system, with sanitizers like shlex.quote)
  • Baseline Semgrep packs: security-audit, owasp-top-ten, secrets
  • Optionally runs CodeQL for deeper taint/dataflow analysis with database creation and custom queries
  • All output in SARIF 2.1.0 format
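
As a rough illustration of the Phase 1 scan, the sketch below pairs a taint-mode rule with a Semgrep invocation that writes SARIF. The rule is written for this example and is not copied from RAPTOR; the rule id, the file names, and the `p/` registry prefixes for the baseline packs are assumptions. The helper only builds the argument list, so it can be inspected before anything is executed.

```python
import shlex

# Hypothetical taint-mode rule for illustration (not one of RAPTOR's rule files).
# Sources, sanitizers and sinks mirror the examples named in the list above.
COMMAND_TAINT_RULE = """\
rules:
  - id: example-command-taint
    mode: taint
    languages: [python]
    severity: ERROR
    message: User-controlled data reaches a shell command
    pattern-sources:
      - pattern: flask.request.args
      - pattern: input(...)
    pattern-sanitizers:
      - pattern: shlex.quote(...)
    pattern-sinks:
      - pattern: os.system(...)
      - pattern: subprocess.run(..., shell=True, ...)
"""

# Assumed registry names for the baseline packs mentioned above.
BASELINE_PACKS = ["p/security-audit", "p/owasp-top-ten", "p/secrets"]


def build_semgrep_command(target, rule_paths=(), output="semgrep.sarif"):
    """Return an argv list that scans `target` and writes SARIF to `output`."""
    cmd = ["semgrep", "scan"]
    for config in [*BASELINE_PACKS, *rule_paths]:
        cmd += ["--config", config]
    cmd += ["--sarif", "--output", output, target]
    return cmd


# Print the command instead of running it, so the sketch has no side effects.
print(shlex.join(build_semgrep_command("./src", ["command-taint.yaml"])))
```

Saving `COMMAND_TAINT_RULE` to `command-taint.yaml` and passing the returned list to `subprocess.run` would perform the actual scan.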

Phase 2: Exploitability Validation (6-stage pipeline, 1007 lines in orchestrator.py)

  • Stage 0 (INVENTORY): Build function checklist from source code — language-aware function extraction supporting Python (AST), JS/TS, C/C++, Java, Go, Rust, Ruby, PHP
  • Stage A (ONESHOT): Quick exploitability check with mandatory disproved_because fields (can't dismiss without explanation)
  • Stage B (PROCESS): Systematic attack tree analysis with 5 working documents (attack-tree.json, hypotheses.json, disproven.json, attack-paths.json, attack-surface.json) and PROXIMITY tracking (0-10 scale). Terminates after 5 failed attempts unless proximity improved.
  • Stage C (SANITY): Verbatim code verification and reachability checks
  • Stage D (RULING): Filter test code, dead code, hedging language
  • Stage E (FEASIBILITY): Binary constraint analysis for memory corruption vulnerabilities
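
For the Python case, the Stage 0 inventory can be approximated with the standard `ast` module. This is a minimal sketch under that assumption, not RAPTOR's implementation; `inventory_functions` and the `"unchecked"` status value are hypothetical names.

```python
import ast


def inventory_functions(source, filename="<memory>"):
    """Return (qualified_name, line) for every function or method in `source`."""
    found = []

    def visit(node, prefix):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                name = f"{prefix}{child.name}"
                if not isinstance(child, ast.ClassDef):
                    found.append((name, child.lineno))
                # Descend so methods and nested functions get qualified names.
                visit(child, f"{name}.")
            else:
                visit(child, prefix)

    visit(ast.parse(source, filename=filename), "")
    return sorted(found, key=lambda item: item[1])


SAMPLE = '''
import os

def run(cmd):
    os.system(cmd)

class Handler:
    def get(self, request):
        return run(request.args["cmd"])
'''

# The checklist starts with every function unreviewed; later stages update it.
checklist = {name: "unchecked" for name, _line in inventory_functions(SAMPLE)}
print(checklist)
```

Keeping the checklist keyed by qualified name makes it easy to show which functions have not yet been examined, which is what full coverage requires.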

Phase 3: LLM-Powered Analysis

  • Uses the validated findings as input
  • LLM reads actual source code, understands context, and provides detailed vulnerability explanations
  • Generates compilable PoC exploit code for verified vulnerabilities
  • Multi-turn refinement of exploits

Phase 4: Patch Generation

  • Generates secure fix proposals alongside exploit PoCs
  • Patches are designed to be minimal and targeted

Key Prompting Patterns (from RAPTOR's .claude/skills/)

RAPTOR's exploitability validation skill uses 6 "MUST-GATES" — behavioral constraints that override default LLM tendencies:

  1. GATE-1 [ASSUME-EXPLOIT]: Investigate AS IF exploitable until proven otherwise (prevents premature dismissal)
  2. GATE-2 [STRICT-SEQUENCE]: Follow instructions exactly; present additional ideas separately
  3. GATE-3 [CHECKLIST]: Track compliance, update after every action
  4. GATE-4 [NO-HEDGING]: If chain-of-thought includes "if/maybe/uncertain", IMMEDIATELY verify
  5. GATE-5 [FULL-COVERAGE]: Check ALL code, no sampling
  6. GATE-6 [PROOF]: Show vulnerable code snippet for every finding
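
GATE-4 is a prompting constraint, but a skill could also check saved analysis notes mechanically. The sketch below flags hedging terms in a piece of reasoning text. The term list beyond "if", "maybe" and "uncertain" and both function names are assumptions made for illustration.

```python
import re

# Only "if", "maybe" and "uncertain" come from the gate text; the rest are assumed.
HEDGE_TERMS = ("if", "maybe", "uncertain", "possibly", "might", "probably")
_HEDGE_RE = re.compile(r"\b(" + "|".join(HEDGE_TERMS) + r")\b", re.IGNORECASE)


def find_hedges(reasoning):
    """Return the hedging terms found in `reasoning`, lowercased, in order."""
    return [match.group(1).lower() for match in _HEDGE_RE.finditer(reasoning)]


def needs_verification(reasoning):
    """True when the reasoning contains at least one hedging term."""
    return bool(find_hedges(reasoning))


note = "This might be exploitable if the input is unsanitized."
print(find_hedges(note), needs_verification(note))
```

A match does not mean the finding is wrong; it marks a claim that still needs to be checked against the code.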

The /validate command (284 lines) explicitly tells the agent: "You ARE the LLM for this pipeline. Don't just run tools and expect results — you must perform the analysis work."

Vulnerability Coverage

RAPTOR's rules cover: SQL injection, XSS, CSRF, auth bypass, command injection, path traversal, SSRF, XXE, SSTI, IDOR, buffer overflows, format string bugs, use-after-free, weak crypto, hardcoded secrets, insecure deserialization.

Languages supported: Python, JavaScript/TypeScript, Java, Go, C/C++, Ruby, PHP, Rust.


Current State in Hermes Agent

What we have:

  • domain-intel skill — passive DNS/WHOIS/SSL reconnaissance (the only security-adjacent skill)
  • tools/approval.py — dangerous command detection (defensive, for agent safety)
  • agent/redact.py — secret redaction in logs/output
  • prompt_builder.py — prompt injection detection in AGENTS.md files
  • No code scanning, no vulnerability detection, no security auditing skills

What we don't have:

  • No SAST/DAST scanning capabilities
  • No Semgrep or CodeQL integration
  • No vulnerability validation or false positive reduction
  • No automated patch generation for security issues
  • No security-focused code review workflow
  • No "security" skill category at all

Relevant existing issues:


Implementation Plan

Skill vs. Tool Classification

This should be a skill because:

  • The entire capability wraps external CLI tools (Semgrep, optionally CodeQL) callable via terminal
  • SARIF output is JSON, parseable by the agent or via execute_code
  • Vulnerability analysis uses the agent's native LLM (no custom Python LLM integration needed)
  • Patch generation is LLM-driven, handled naturally by the agent
  • No binary data, streaming, or real-time events
  • No API key management needed in the agent harness

Bundled vs. Skills Hub: Recommend Skills Hub initially. While code security is broadly useful, it requires installing Semgrep (and optionally CodeQL), which adds setup friction. Could be promoted to bundled later if Semgrep becomes commonly available. The skill should gracefully handle missing tools (suggest installation, work with whatever's available).
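
One way to handle missing tools gracefully is to probe `PATH` first and plan around what is found. The following is a sketch only: the install hints and the function names are assumptions, and the skill could equally do this check from the terminal.

```python
import shutil

# Assumed install hints; adjust to whatever the skill ends up recommending.
INSTALL_HINTS = {
    "semgrep": "python3 -m pip install semgrep",
    "codeql": "download the CodeQL CLI bundle from GitHub and add it to PATH",
}


def detect_tools(which=shutil.which):
    """Map each scanner name to its resolved path, or None when it is missing."""
    return {name: which(name) for name in INSTALL_HINTS}


def plan_scan(tools):
    """Decide what can run with the tools found, and what to suggest installing."""
    available = [name for name, path in tools.items() if path]
    missing = [name for name in tools if name not in available]
    return {
        "run": available,
        "suggest": {name: INSTALL_HINTS[name] for name in missing},
        "can_scan": "semgrep" in available,
    }


print(plan_scan(detect_tools()))
```

Passing `which` as a parameter keeps the detection testable without either tool installed.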

Category: Create new security category for this and related security skills.

What We'd Need

  1. SKILL.md — Workflow instructions with trigger conditions, tool detection, analysis prompts adapted from RAPTOR's MUST-GATES
  2. references/semgrep-rules.md — Documentation of available rule categories and what they detect
  3. scripts/custom-rules/ — Adapted subset of RAPTOR's 17 custom Semgrep rules (YAML, MIT-licensed)
  4. references/vulnerability-guide.md — Agent reference for understanding and explaining each vulnerability type
  5. templates/security-report.md — Structured vulnerability report template

Phased Rollout

Phase 1: Semgrep Scanning + LLM Analysis

  • Skill that detects/installs Semgrep, scans a target repo/directory
  • Runs built-in Semgrep packs (security-audit, owasp-top-ten, secrets)
  • Parses SARIF output, presents findings with severity, file locations, code snippets
  • Agent analyzes each finding for true/false positive assessment
  • Generates plain-language vulnerability explanations
  • Suggests secure patches for confirmed vulnerabilities
  • Structured report output (Markdown)
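
The SARIF parsing step might look like the sketch below, which flattens `runs[].results[]` into one record per finding and orders them by level. The sample document and the rule id in it are made up for illustration, and only the first location of each result is used.

```python
import json

# Made-up SARIF 2.1.0 document with a single finding, for demonstration only.
SAMPLE_SARIF = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "Semgrep"}},
        "results": [{
            "ruleId": "example.sql-concat",
            "level": "error",
            "message": {"text": "SQL built by string concatenation"},
            "locations": [{"physicalLocation": {
                "artifactLocation": {"uri": "app/db.py"},
                "region": {"startLine": 42},
            }}],
        }],
    }],
})

LEVEL_ORDER = {"error": 0, "warning": 1, "note": 2, "none": 3}


def summarize_sarif(sarif_text):
    """Flatten SARIF results into dicts sorted from most to least severe level."""
    findings = []
    for run in json.loads(sarif_text).get("runs", []):
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            findings.append({
                "rule": result.get("ruleId", "unknown"),
                "level": result.get("level", "warning"),
                "file": physical.get("artifactLocation", {}).get("uri", "?"),
                "line": physical.get("region", {}).get("startLine"),
                "message": result.get("message", {}).get("text", ""),
            })
    return sorted(findings, key=lambda f: LEVEL_ORDER.get(f["level"], 4))


for finding in summarize_sarif(SAMPLE_SARIF):
    print(f'{finding["level"]:8} {finding["file"]}:{finding["line"]} {finding["rule"]}')
```

Each record carries enough context (rule, file, line, message) for the agent to open the source and assess the finding.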

Phase 2: Deep Analysis + Custom Rules

  • Add RAPTOR-inspired custom Semgrep rules (crypto, injection, auth, deserialization)
  • Implement exploitability validation prompts (adapted MUST-GATES)
  • Add hypothesis-testing approach for ambiguous findings
  • Support targeted scanning (specific vulnerability types, specific files)
  • CodeQL integration for taint/dataflow analysis (optional, when CodeQL CLI is available)
  • SARIF file management (save, compare between runs, track remediation)
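
Comparing SARIF files between runs could start from something like this sketch. It keys each finding on rule id, file and start line, which is an assumption made for simplicity: line numbers shift as code changes, so a real implementation would likely need a more stable fingerprint. The helper names are hypothetical.

```python
def fingerprints(sarif):
    """Collect a (ruleId, uri, startLine) key for every result location."""
    keys = set()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            for location in result.get("locations", []):
                physical = location.get("physicalLocation", {})
                keys.add((
                    result.get("ruleId"),
                    physical.get("artifactLocation", {}).get("uri"),
                    physical.get("region", {}).get("startLine"),
                ))
    return keys


def compare_runs(baseline, current):
    """Split findings into fixed, newly introduced, and still present."""
    old, new = fingerprints(baseline), fingerprints(current)
    return {"fixed": old - new, "introduced": new - old, "remaining": old & new}


def make_sarif(*results):
    """Build a tiny SARIF-shaped dict from (rule, uri, line) tuples for the demo."""
    return {"version": "2.1.0", "runs": [{"results": [
        {"ruleId": rule, "locations": [{"physicalLocation": {
            "artifactLocation": {"uri": uri}, "region": {"startLine": line}}}]}
        for rule, uri, line in results
    ]}]}


before = make_sarif(("sql-concat", "app/db.py", 42), ("weak-hash", "app/auth.py", 10))
after = make_sarif(("weak-hash", "app/auth.py", 10), ("command-taint", "app/run.py", 7))
print(compare_runs(before, after))
```

The "fixed" set is what remediation tracking would report, and "introduced" is what a regression check would flag.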

Phase 3: Automated Remediation Pipeline

  • Full scan → validate → analyze → patch → verify loop
  • Integration with delegate_task for parallel scanning of large codebases
  • Git-aware patching (create branches, stage fixes, generate commit messages)
  • Integration with github-pr-workflow skill for automated security fix PRs
  • Differential scanning (only scan changed files, compare with baseline)
  • CI/CD integration guidance (pre-commit hooks, GitHub Actions templates)
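
Differential scanning could be sketched as below: ask git for the files changed relative to a base ref, then keep only those worth scanning. The default base `origin/main`, the extension list, and the function names are assumptions for illustration.

```python
import subprocess
from pathlib import PurePosixPath

# Assumed extension list, roughly matching the languages named in this issue.
SCANNABLE = {".py", ".js", ".ts", ".java", ".go", ".c", ".cpp", ".rb", ".php", ".rs"}


def changed_files(base="origin/main", repo="."):
    """List files added, copied, modified or renamed between `base` and HEAD."""
    completed = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=ACMR", f"{base}...HEAD"],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return [line for line in completed.stdout.splitlines() if line]


def scan_targets(paths, extensions=SCANNABLE):
    """Keep only the paths whose file extension is in `extensions`."""
    return [path for path in paths if PurePosixPath(path).suffix in extensions]


# Demo with a fixed list so the sketch runs outside a git repository.
print(scan_targets(["app/db.py", "README.md", "web/ui.ts", "Makefile"]))
```

In a repository, `scan_targets(changed_files())` would give the file list to pass to the scanner; deleted files are excluded by the `--diff-filter` value.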

Pros & Cons

Pros

  • Fills the biggest capability gap — Hermes goes from no security scanning to scanning, validation, and patch suggestions in one skill
  • Broadly applicable — Every developer benefits from security scanning
  • Expert-designed patterns — RAPTOR's prompting patterns were written by experienced security researchers (Halvar Flake et al.)
  • MIT-licensed source material — RAPTOR is MIT, safe to adapt rules and patterns
  • Leverages existing tools — Semgrep is industry standard, well-maintained, free for CLI use
  • Dual value — Both finds vulnerabilities AND generates fixes (the "detect + repair" loop)
  • Language-agnostic — Semgrep supports 30+ languages out of the box
  • SARIF standard — Interoperable with other security tools and CI/CD systems

Cons / Risks

  • Semgrep installation required — 200MB+ download, Python-based. Skill should handle graceful degradation.
  • CodeQL restrictions — GitHub's CodeQL has license restrictions on commercial use. Should be clearly marked as optional.
  • False positive management — Even with validation, static analysis produces noise. The skill needs clear guidance on confidence levels.
  • Security responsibility — Users may over-trust automated findings. Skill should include disclaimers about manual verification.
  • Scope discipline — Code scanning is a deep field; must resist feature creep toward full AppSec platform.

Open Questions

  1. Should the skill include RAPTOR's full set of 17 custom Semgrep rules, or start with a curated subset?
  2. Should CodeQL support be in Phase 1 (more powerful but harder to install) or deferred to Phase 2?
  3. How should the skill handle very large codebases? Scan everything or ask the user to scope?
  4. Should vulnerability findings be stored persistently (e.g., in a workspace file) for tracking remediation over time?
  5. Should the skill integrate with the github-code-review skill for security-focused PR reviews?

References
