Harness: autonomous vulnerability discovery

This package is the reference pipeline: an autonomous, multi-agent harness for finding, verifying, reporting, and patching memory-safety bugs in C/C++ codebases. It runs Claude Code agents inside gVisor-isolated containers, builds ASAN-instrumented targets, and grades every finding with an executable oracle (the PoC crashes, or it doesn't).

This README is the copy-paste path to a demo. For the architecture, every CLI flag, and rate-limit math, see docs/pipeline.md.

⚠️ run, recon, report, and patch execute target code. The harness refuses to spawn agents outside its gVisor sandbox. Run scripts/setup_sandbox.sh once, then invoke everything through bin/vp-sandboxed. Never mount credentials into the agent environment. See docs/security.md.

Prerequisites

Linux host (x86_64 or aarch64), required by gVisor. On macOS/Windows, run inside a Linux VM.
Docker.
Python 3.11+.
An Anthropic API key or Claude Code OAuth token.

Demo: find real CVEs in dr_libs

The drlibs target scans mackron/dr_libs at a commit with two known CVEs (a heap OOB write in dr_wav.h and an integer-overflow DoS in dr_flac.h). The pipeline finds them from source alone: no CVE IDs, no hints, and no network access for the agent beyond the model API. This mirrors a real engagement: the target's source lives only inside the container, not in this repo. targets/drlibs/Dockerfile fetches dr_wav.h and dr_flac.h from GitHub at build time, pinned to the vulnerable commit, and compiles them with ASAN. Your own target works the same way: a Dockerfile that pulls your code at a pinned commit and builds it instrumented.

Setup (once)

cd <repo-root>
python3 -m venv .venv
.venv/bin/pip install -e .
export ANTHROPIC_API_KEY=sk-ant-...        # or CLAUDE_CODE_OAUTH_TOKEN
export VULN_PIPELINE_MODEL=<model-id>      # Claude Opus recommended; override per-call with --model

# Installs gVisor, builds the target + agent images, verifies isolation; needs sudo.
# This is where the dr_libs source is fetched: the Dockerfile ADDs dr_wav.h and
# dr_flac.h from GitHub at the pinned commit and compiles them with ASAN.
# (Build it directly to see what's inside: docker build -t vuln-pipeline-drlibs:latest targets/drlibs/)
./scripts/setup_sandbox.sh

Run (end to end)

One command runs recon → find → grade → judge → report:

bin/vp-sandboxed run drlibs --auto-focus --runs 3 --parallel --stream
# --auto-focus : run recon first and feed its focus_areas partition to the find agents
# --runs 3 --parallel : 3 concurrent find agents, each in its own container
# --stream : judge + report stream as each grade lands (first report in minutes)
#
# → results/drlibs/<timestamp>/run_NNN/{result.json, poc.bin, find_transcript.jsonl}
#   results/drlibs/<timestamp>/reports/bug_NN/report.json
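
To check what a batch has produced so far, the layout above can be walked with a few lines of Python. This is a minimal sketch, not part of the harness: summarize_batch is a hypothetical helper, and it assumes only the run_NNN directory names and the three artifact filenames shown in the comment above.

```python
from pathlib import Path

# Artifact names taken from the results layout shown above.
EXPECTED = ("result.json", "poc.bin", "find_transcript.jsonl")


def summarize_batch(batch_dir):
    """Return {run_name: [artifact names present]} for each run_* directory."""
    summary = {}
    for run_dir in sorted(Path(batch_dir).glob("run_*")):
        if run_dir.is_dir():
            summary[run_dir.name] = [
                name for name in EXPECTED if (run_dir / name).is_file()
            ]
    return summary
```

Calling summarize_batch("results/drlibs/<timestamp>") on a batch in progress shows which runs have already written a result.json or a poc.bin; a run with an empty list has not produced any artifacts yet.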

Then patch the confirmed crashes. This is a separate step on purpose, so you can read the reports and decide what's worth fixing before spending tokens. patch takes a results batch directory, not a target name: each run writes a new results/drlibs/<timestamp>/, so if you've scanned more than once you need to say which batch you're patching (the intended loop is scan → patch → re-scan the patched tree). To patch the batch you just ran, resolve the newest timestamp with shell expansion:

bin/vp-sandboxed patch results/drlibs/$(ls -t results/drlibs | head -1)/
# → resolves to the most recent batch
#   results/drlibs/<timestamp>/reports/bug_NN/{patch.diff, patch_result.json}
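
If you drive the pipeline from a script rather than a shell, the same "newest batch" lookup can be done in Python. A minimal sketch: newest_batch is a hypothetical helper, not a harness function, and it assumes (like ls -t) that the most recently modified directory is the batch you want. Unlike ls -t, it ignores plain files.

```python
from pathlib import Path


def newest_batch(target_results):
    """Return the most recently modified batch directory under target_results."""
    batches = [p for p in Path(target_results).iterdir() if p.is_dir()]
    if not batches:
        raise FileNotFoundError(f"no batches under {target_results}")
    # Same ordering criterion as `ls -t`: modification time, newest first.
    return max(batches, key=lambda p: p.stat().st_mtime)
```

newest_batch("results/drlibs") returns a Path you can pass to the patch command. If you have touched an older batch since the last scan, name the batch explicitly instead.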

Or name the batch explicitly; the run command prints it in its summary (run 0: crash_found → results/drlibs/20260519T.../run_000/result.json):

bin/vp-sandboxed patch results/drlibs/<timestamp>/

The first confirmed crash (the dr_wav heap OOB write) typically lands in ~6 minutes. For the dr_flac integer overflow, add --accept-dos; it's DoS-class and the default quality bar triages it as not-memory-corruption. The full expected-results table and run notes are in targets/drlibs/README.md.

Network note. The docker build step in setup_sandbox.sh needs outbound HTTPS to fetch the target source. After that, the find/grade/patch agents run with egress locked to api.anthropic.com; they cannot reach anything else on the network. This is the setup → attack isolation split described in docs/security.md.

Run (step by step)

If you'd rather inspect each phase before committing tokens to the next:

# Recon only: read the source, print a focus_areas: YAML block.
# Review it, optionally edit it, paste it into targets/drlibs/config.yaml.
bin/vp-sandboxed recon drlibs

# Find + grade only, using the focus_areas you pasted (no recon, no reports)
bin/vp-sandboxed run drlibs --runs 3 --parallel

# Report after the fact, once all grades land
vuln-pipeline report results/drlibs/<timestamp>/

# Patch
bin/vp-sandboxed patch results/drlibs/<timestamp>/

Watching a run

Each find-agent is a headless claude -p session inside its own container. Tail its transcript as it works:

tail -f results/drlibs/<timestamp>/run_000/find_transcript.jsonl | python3 -c \
  'import sys, json
for line in sys.stdin:
    if not line.strip():
        continue
    m = json.loads(line)
    if m.get("type") == "assistant":
        for b in m.get("message", {}).get("content", []):
            if b.get("type") == "tool_use":
                # Keep this script free of single quotes: it sits inside a single-quoted shell string.
                name, args = b.get("name"), str(b.get("input", {}))[:120]
                print(f"→ {name}: {args}")'
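
Once a run has finished, the same transcript can be summarized rather than tailed. The sketch below counts tool calls per tool name. It assumes only the fields the one-liner above already relies on (type, message.content, and name on tool_use blocks); tool_call_counts is a hypothetical helper, not part of the harness.

```python
import json
from collections import Counter


def tool_call_counts(lines):
    """Count tool_use blocks per tool name across JSONL transcript lines."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        msg = json.loads(line)
        if msg.get("type") != "assistant":
            continue
        for block in msg.get("message", {}).get("content", []):
            # Guard against content that is not a list of dict blocks.
            if isinstance(block, dict) and block.get("type") == "tool_use":
                counts[block.get("name", "?")] += 1
    return counts
```

Open find_transcript.jsonl, pass the file object to tool_call_counts, and call .most_common() on the result to see where the agent spent its turns.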

After the run

vuln-pipeline dedup  results/drlibs/<timestamp>/   # group crashes by root-cause signature
vuln-pipeline report results/drlibs/<timestamp>/   # exploitability analysis per unique bug
vuln-pipeline run    drlibs --resume results/drlibs/<timestamp>/   # retry failed/killed runs
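
To make "root-cause signature" concrete, here is one common way to group ASAN crashes: key each crash on the bug type plus the top few function names of the first stack trace, ignoring addresses. This is an illustrative sketch only. This README does not describe what dedup.py actually computes, and crash_signature and group_by_signature are hypothetical names. The frame regex assumes function names without spaces, which holds for plain C symbols.

```python
import re
from collections import defaultdict

# Standard ASAN report shapes: an ERROR header, then "#N 0xADDR in func ..." frames.
ERROR_RE = re.compile(r"ERROR: AddressSanitizer: (\S+)")
FRAME_RE = re.compile(r"^\s*#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)")


def crash_signature(asan_log, depth=3):
    """Return (bug_type, top `depth` function names) from the first stack trace."""
    bug_type = "unknown"
    frames = []
    for line in asan_log.splitlines():
        if bug_type == "unknown":
            m = ERROR_RE.search(line)
            if m:
                bug_type = m.group(1)
                continue
        m = FRAME_RE.match(line)
        if m:
            frames.append(m.group(1))
        elif frames:
            break  # first trace ended; later traces describe alloc/free sites
    return bug_type, tuple(frames[:depth])


def group_by_signature(logs):
    """Map signature -> list of run names. `logs` is {run_name: asan_log_text}."""
    groups = defaultdict(list)
    for run, text in logs.items():
        groups[crash_signature(text)].append(run)
    return dict(groups)
```

Two PoCs that crash at different addresses but share a bug type and top frames land in the same group. A shallower depth merges more aggressively and a deeper one splits more.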

Other targets

ls targets/

canary is the synthetic smoke test: planted bugs, a ~6 min run, and full source in the repo (which is why the static skills /threat-model, /vuln-scan, and /triage are demonstrated on it). It also ships a pre-baked fixture at targets/canary/fixtures/results_sample for trying patch/report without burning find tokens. alsa and htslib are additional real-world CVE demo targets; like drlibs, their source is fetched at Docker build time. Each target has its own targets/<name>/README.md.

Port to your stack

The C/C++/ASAN specifics live in prompts/, asan.py, and patch_grade.py:_t1_passes(). The orchestration (cli.py, find.py, grade.py, report.py) is mostly domain-neutral. See docs/customizing.md, or run /customize in Claude Code from the repo root.
