Skip to content

Latest commit

 

History

History

README.md

Harness: autonomous vulnerability discovery

This package is the reference pipeline: an autonomous, multi-agent harness for finding, verifying, reporting, and patching memory-safety bugs in C/C++ codebases. It runs Claude Code agents inside gVisor-isolated containers, builds ASAN-instrumented targets, and grades every finding with an executable oracle (the PoC crashes, or it doesn't).

This README is the copy-paste path to a demo. For the architecture, every CLI flag, and rate-limit math, see docs/pipeline.md.

⚠️ run, recon, report, and patch execute target code. The harness refuses to spawn agents outside its gVisor sandbox. Run scripts/setup_sandbox.sh once, then invoke everything through bin/vp-sandboxed. Never mount credentials into the agent environment. See docs/security.md.

Prerequisites

  • Linux host (x86_64 or aarch64), required by gVisor. On macOS/Windows, run inside a Linux VM.
  • Docker.
  • Python 3.11+.
  • An Anthropic API key or Claude Code OAuth token.

Demo: find real CVEs in dr_libs

The drlibs target scans mackron/dr_libs at a commit with two known CVEs (a heap OOB write in dr_wav.h and an integer-overflow DoS in dr_flac.h). The pipeline finds them from source: no CVE IDs, no hints, no network to the agent. This is the realistic shape of a real engagement: the target's source lives only inside the container, not in this repo. targets/drlibs/Dockerfile fetches dr_wav.h and dr_flac.h from GitHub at build time, pinned to the vulnerable commit, and compiles them with ASAN. Your own target works the same way: a Dockerfile that pulls your code at a pinned commit and builds it instrumented.

Setup (once)

cd <repo-root>
python3 -m venv .venv
.venv/bin/pip install -e .
export ANTHROPIC_API_KEY=sk-ant-...        # or CLAUDE_CODE_OAUTH_TOKEN
export VULN_PIPELINE_MODEL=<model-id>      # Claude Opus recommended; override per-call with --model

# Installs gVisor, builds the target + agent images, verifies isolation; needs sudo.
# This is where the dr_libs source is fetched: the Dockerfile ADDs dr_wav.h and
# dr_flac.h from GitHub at the pinned commit and compiles them with ASAN.
# (Build it directly to see what's inside: docker build -t vuln-pipeline-drlibs:latest targets/drlibs/)
./scripts/setup_sandbox.sh

Run (end to end)

One command runs recon → find → grade → judge → report:

bin/vp-sandboxed run drlibs --auto-focus --runs 3 --parallel --stream
# --auto-focus : run recon first and feed its focus_areas partition to the find agents
# --runs 3 --parallel : 3 concurrent find agents, each in its own container
# --stream : judge + report stream as each grade lands (first report in minutes)
#
# → results/drlibs/<timestamp>/run_NNN/{result.json, poc.bin, find_transcript.jsonl}
#   results/drlibs/<timestamp>/reports/bug_NN/report.json

Then patch the confirmed crashes. This is a separate step on purpose, so you can read the reports and decide what's worth fixing before spending tokens. patch takes a results batch directory, not a target name: each run writes a new results/drlibs/<timestamp>/, so if you've scanned more than once you need to say which batch you're patching (the intended loop is scan → patch → re-scan the patched tree). To patch the batch you just ran, resolve the newest timestamp with shell expansion:

bin/vp-sandboxed patch results/drlibs/$(ls -t results/drlibs | head -1)/
# → resolves to the most recent batch
#   results/drlibs/<timestamp>/reports/bug_NN/{patch.diff, patch_result.json}

Or name the batch explicitly; the run command prints it in its summary (run 0: crash_found → results/drlibs/20260519T.../run_000/result.json):

bin/vp-sandboxed patch results/drlibs/<timestamp>/

The first confirmed crash (the dr_wav heap OOB write) typically lands in ~6 minutes. For the dr_flac integer overflow, add --accept-dos; it's DoS-class and the default quality bar triages it as not-memory-corruption. Full expected-results table and run notes in targets/drlibs/README.md.

Network note. The docker build step in setup_sandbox.sh needs outbound HTTPS to fetch the target source. After that, the find/grade/patch agents run with egress locked to api.anthropic.com; they never see the network. This is the setup → attack isolation split described in docs/security.md.

Run (step by step)

If you'd rather inspect each phase before committing tokens to the next:

# Recon only: read the source, print a focus_areas: YAML block.
# Review it, optionally edit it, paste it into targets/drlibs/config.yaml.
bin/vp-sandboxed recon drlibs

# Find + grade only, using the focus_areas you pasted (no recon, no reports)
bin/vp-sandboxed run drlibs --runs 3 --parallel

# Report after the fact, once all grades land
vuln-pipeline report results/drlibs/<timestamp>/

# Patch
bin/vp-sandboxed patch results/drlibs/<timestamp>/

Watching a run

Each find-agent is a headless claude -p session inside its own container. Tail its transcript as it works:

tail -f results/drlibs/<timestamp>/run_000/find_transcript.jsonl | python3 -c \
  'import sys, json
for line in sys.stdin:
    m = json.loads(line)
    if m.get("type") == "assistant":
        for b in m.get("message", {}).get("content", []):
            if b.get("type") == "tool_use":
                print(f"→ {b['name']}: {str(b.get('input',{}))[:120]}")'

After the run

vuln-pipeline dedup  results/drlibs/<timestamp>/   # group crashes by root-cause signature
vuln-pipeline report results/drlibs/<timestamp>/   # exploitability analysis per unique bug
vuln-pipeline run    drlibs --resume results/drlibs/<timestamp>/   # retry failed/killed runs

Other targets

ls targets/

canary is the synthetic smoke test: planted bugs, ~6 min, full source in the repo (which is why the static skills /threat-model, /vuln-scan, /triage demo on it), and a pre-baked fixture at targets/canary/fixtures/results_sample for trying patch/report without burning find tokens. alsa and htslib are additional real-world CVE demo targets; like drlibs, their source is fetched at Docker build time. Each has its own targets/<name>/README.md.

Port to your stack

The C/C++/ASAN specifics live in prompts/, asan.py, and patch_grade.py:_t1_passes(). The orchestration (cli.py, find.py, grade.py, report.py) is mostly domain-neutral. See docs/customizing.md, or run /customize in Claude Code from the repo root.