RAPTOR includes an elaborate OSS Security Forensics system — a multi-agent investigation framework for researching open-source supply chain attacks. It uses GitHub Archive (BigQuery), Wayback Machine, GitHub API, local git analysis, and IOC extraction to reconstruct attack timelines, recover deleted evidence, and produce forensic investigation reports. This is RAPTOR's most sophisticated module, spanning 9 specialized agent definitions, 5 skill files (1800+ lines combined), and a complete Pydantic v2 evidence schema system.
Supply chain attacks are one of the most critical threats in modern software development (SolarWinds, XZ Utils, event-stream, ua-parser-js). Hermes Agent currently has no capability for investigating suspicious activity in open-source repositories, recovering deleted content, or conducting structured forensic analysis. The existing OSINT skill (#355) covers public records investigation (campaign finance, government contracts) — a completely different domain.
This issue proposes an OSS Security Forensics skill that adapts RAPTOR's investigation framework into a Hermes Agent skill, enabling users to investigate suspicious GitHub activity, recover deleted commits and PRs, analyze supply chain compromise indicators, and produce structured forensic reports with evidence chains.
Research Findings
How RAPTOR's OSS Forensics Works
Architecture: Multi-Agent Investigation Framework
RAPTOR defines 9 specialized agents that work together in a 7-phase orchestration:
GH Archive Investigator — Queries tamper-proof GitHub event history via BigQuery. Covers all 12 GitHub event types (PushEvent, PullRequestEvent, IssuesEvent, etc.). Detects force pushes via zero-commit PushEvents, automation abuse via WorkflowRunEvents, and deleted content that remains in the immutable archive.
GitHub API Investigator — Queries current GitHub API for commits, issues, PRs, files, branches, tags, releases, forks. Cross-references with archive data to detect discrepancies (content present in archive but missing from current API = deletion detected).
Wayback Machine Investigator — Uses CDX API (web.archive.org/cdx/search/cdx) to search archived snapshots of GitHub pages. Recovers deleted READMEs, issues, PRs, wiki pages, release notes, and fork network pages.
Local Git Investigator — Analyzes the local git repository including dangling commits (force-pushed but not garbage-collected), reflog, blame, and diff analysis.
IOC Extractor — Extracts Indicators of Compromise from vendor security reports: commit SHAs, file paths, API keys, secrets, IP addresses, domains, package names, actor usernames.
Phase 3: Hypothesis Formation — An oss-hypothesis-former-agent synthesizes evidence from all 5 investigators into testable hypotheses. Each hypothesis must cite evidence by ID. Follow-up evidence requests trigger additional collection rounds (max N iterations).
Phase 4: Evidence Verification — An oss-evidence-verifier-agent validates each piece of evidence against original sources using a ConsistencyVerifier.
Phase 5: Hypothesis Validation — An oss-hypothesis-checker-agent mechanically validates hypotheses:
Rejects hypotheses with uncited or unverified evidence
Checks for logical consistency
Produces rebuttals that feed back into hypothesis refinement (max iterations enforced)
Phase 6: Report Generation — An oss-report-generator-agent produces the final forensic report.
Phase 7: Completion
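The archive-versus-API cross-reference and the zero-commit force-push signal described above can be sketched as follows. This is a minimal illustration, not RAPTOR's implementation: the function names are hypothetical, and the events are assumed to be already-parsed GH Archive records whose PushEvent payload carries `size` and a `commits` list.

```python
def pushed_shas(events):
    """Collect commit SHAs listed in archived PushEvent payloads."""
    shas = []
    for event in events:
        if event.get("type") == "PushEvent":
            commits = event.get("payload", {}).get("commits", [])
            shas += [c["sha"] for c in commits]
    return shas


def zero_commit_pushes(events):
    """PushEvents that carry no commits: a possible force-push signal (assumption)."""
    return [
        e for e in events
        if e.get("type") == "PushEvent" and e.get("payload", {}).get("size") == 0
    ]


def missing_from_api(archive_shas, api_shas):
    """SHAs seen in the archive but absent from the current API view."""
    return sorted(set(archive_shas) - set(api_shas))


# Made-up sample data for illustration only.
events = [
    {"type": "PushEvent", "payload": {"size": 1, "commits": [{"sha": "aaa111"}]}},
    {"type": "PushEvent", "payload": {"size": 1, "commits": [{"sha": "bbb222"}]}},
    {"type": "PushEvent", "payload": {"size": 0, "commits": []}},
    {"type": "IssuesEvent", "payload": {}},
]
api_shas = ["aaa111"]  # what the live repository still exposes

print(missing_from_api(pushed_shas(events), api_shas))
print(len(zero_commit_pushes(events)))
```

A SHA reported by `missing_from_api` is only a lead: it still needs verification against the original sources before it is cited in a hypothesis.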
GitHub Archive Skill (958 lines)
The most detailed skill file. Key capabilities:
BigQuery cost optimization: Dry runs before every query, column selection (only query needed fields), date range narrowing, _TABLE_SUFFIX filtering
Safe query template: safe_gharchive_query() wrapper with automatic cost controls
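A cost-guarded query along these lines might look like the sketch below. It is not RAPTOR's `safe_gharchive_query()`; `guarded_query`, `build_query`, and the budget default are hypothetical names chosen for illustration. It assumes the public `githubarchive.day.YYYYMMDD` tables, the `google-cloud-bigquery` client library, and the $6.25/TiB on-demand price quoted in this issue.

```python
TIB = 2**40
PRICE_PER_TIB_USD = 6.25  # figure quoted in this issue; check current pricing


def estimate_cost_usd(bytes_processed: int) -> float:
    """Convert a dry-run byte count into an approximate on-demand cost."""
    return bytes_processed / TIB * PRICE_PER_TIB_USD


def build_query(start: str, end: str) -> str:
    """Select only needed columns and limit scanned tables via _TABLE_SUFFIX.

    start and end are YYYYMMDD strings; the repo is passed as a query parameter.
    """
    for day in (start, end):
        if not (len(day) == 8 and day.isdigit() and day.startswith("20")):
            raise ValueError(f"expected YYYYMMDD in the 2000s, got {day!r}")
    return (
        "SELECT type, actor.login AS actor, created_at, payload "
        "FROM `githubarchive.day.20*` "
        f"WHERE _TABLE_SUFFIX BETWEEN '{start[2:]}' AND '{end[2:]}' "
        "AND repo.name = @repo"
    )


def guarded_query(client, repo: str, start: str, end: str, max_usd: float = 1.0):
    """Dry-run first, refuse if the estimate exceeds the budget, then run."""
    from google.cloud import bigquery  # requires google-cloud-bigquery

    sql = build_query(start, end)
    params = [bigquery.ScalarQueryParameter("repo", "STRING", repo)]
    dry_cfg = bigquery.QueryJobConfig(
        dry_run=True, use_query_cache=False, query_parameters=params
    )
    dry_job = client.query(sql, job_config=dry_cfg)
    cost = estimate_cost_usd(dry_job.total_bytes_processed)
    if cost > max_usd:
        raise RuntimeError(f"estimated ${cost:.2f} exceeds budget ${max_usd:.2f}")
    run_cfg = bigquery.QueryJobConfig(
        query_parameters=params,
        # Server-side cap as a second line of defence.
        maximum_bytes_billed=int(max_usd / PRICE_PER_TIB_USD * TIB),
    )
    return list(client.query(sql, job_config=run_cfg).result())
```

The dry run costs nothing and reports the bytes the query would scan, so the budget check happens before any billable work.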
Git fetch: git fetch origin <sha> (works even after force push)
Real-world example: Istio supply chain — recovered a PAT token from a force-pushed commit (worth $25k bounty).
Anti-Hallucination Patterns
"STAY IN YOUR LANE": Every agent has explicit role boundaries. E.g., "You are a SPECIALIST INVESTIGATOR for GH Archive BigQuery collection ONLY. You do NOT query GitHub API, recover deleted content, or perform local git forensics."
Evidence-first: Every claim must cite evidence by ID. No assertion without citation.
All data sources are accessible via existing tools: web_extract (Wayback Machine, GitHub pages), terminal (git commands, curl for GitHub API, BigQuery CLI), execute_code (Python data processing)
Evidence management is file-based (JSON evidence stores)
The investigation methodology is instructions + reasoning — exactly what skills are for
Multi-agent orchestration maps naturally to delegate_task sub-agents
No custom Python integration needed in the agent harness
No binary data, streaming, or real-time events
Bundled vs. Skills Hub: Recommend Skills Hub. OSS forensics is specialized (security teams, incident responders, open-source maintainers). Full capability requires BigQuery credentials for GitHub Archive access.
Category: security (same category as Code Security Audit and Binary Security Analysis skills)
What We'd Need
SKILL.md — Investigation workflow with trigger conditions, phase descriptions, and agent orchestration instructions
references/github-archive-guide.md — How to query GH Archive via BigQuery (adapted from RAPTOR's 958-line skill, with cost optimization guidance)
references/evidence-types.md — Event types, observation types, IOC types, and evidence source taxonomy
references/recovery-techniques.md — How to recover deleted commits, PRs, issues from GitHub and Wayback Machine
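For the Wayback Machine side of recovery, a reference guide could include something like the following sketch. The helper names are hypothetical. It assumes the CDX endpoint's JSON output, in which the first row is a header naming fields such as `timestamp` and `original`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_url(target: str, start: str = "", end: str = "", limit: int = 50) -> str:
    """Build a CDX query for every capture whose URL starts with `target`."""
    params = {"url": target, "matchType": "prefix", "output": "json", "limit": limit}
    if start:
        params["from"] = start  # e.g. "2024" or "20240115"
    if end:
        params["to"] = end
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx(rows):
    """Turn CDX JSON rows (header row first) into a list of dicts."""
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, row)) for row in data]


def snapshot_url(row: dict) -> str:
    """URL of the archived copy for one parsed CDX row."""
    return f"https://web.archive.org/web/{row['timestamp']}/{row['original']}"


def fetch_captures(target: str, **kwargs):
    """Network call; not exercised here."""
    with urlopen(cdx_url(target, **kwargs), timeout=30) as resp:
        return parse_cdx(json.loads(resp.read().decode("utf-8")))
```

Prefix matching on a path such as `github.com/<owner>/<repo>/issues` lists captures of individual issue pages, which is where deleted content tends to survive.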
BigQuery costs — GitHub Archive queries cost money ($6.25/TiB). Skill must include cost controls and dry runs.
Google Cloud dependency — Full GH Archive access requires GCP credentials
Investigation quality — Forensic work requires rigorous reasoning; LLMs may draw false conclusions. Anti-hallucination patterns are essential.
Scope creep — Forensic investigation is deep; must stay focused on GitHub/OSS domain
Ethical considerations — Investigation tools could be misused for harassment or stalking. Skill should include ethical use guidelines.
API rate limits — GitHub API has rate limits (5000/hour authenticated). Investigation of large repos needs throttling.
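One simple way to respect the rate limit is to space requests evenly, as in the sketch below. The `Throttle` class is hypothetical and assumes the 5000 requests/hour figure above; a real client should also honor the rate-limit headers GitHub returns.

```python
import time


class Throttle:
    """Space calls so that at most `max_per_hour` happen in any hour."""

    def __init__(self, max_per_hour=5000, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 3600 / max_per_hour  # 0.72 s at 5000/hour
        self._clock = clock
        self._sleep = sleep
        self._next_ok = None  # earliest time the next call may proceed

    def wait(self):
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_ok is not None and now < self._next_ok:
            self._sleep(self._next_ok - now)
            now = self._next_ok
        self._next_ok = now + self.min_interval
```

Calling `throttle.wait()` before each API request keeps a long-running investigation of a large repository under the hourly budget. The injectable `clock` and `sleep` make the behaviour easy to test without real delays.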
Open Questions
Should BigQuery integration be required for Phase 1, or can we defer it and still deliver useful forensic capability with just git + GitHub API + Wayback?
How should the evidence store be structured? Simple JSON file per investigation, or something more robust?
Should the hypothesis formation/validation use delegate_task for separate "hypothesis former" and "hypothesis checker" sub-agents, or handle it in a single conversation?
Should the skill include proactive monitoring (scheduled scans of watched repos for suspicious activity)?
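To make the evidence-store question concrete, here is one possible shape for the simple option: a single JSON file per investigation, with sequential evidence IDs that hypotheses can cite. Everything in it is an assumption for discussion (class names, field names, the `EV-0001` ID format), and it uses standard-library dataclasses rather than RAPTOR's Pydantic v2 schema.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Evidence:
    id: str
    source: str        # e.g. "gharchive", "github_api", "wayback", "local_git"
    observation: str
    verified: bool = False


class EvidenceStore:
    """Load and save all evidence for one investigation from a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        self.items = {}
        if self.path.exists():
            for raw in json.loads(self.path.read_text()):
                self.items[raw["id"]] = Evidence(**raw)

    def add(self, source: str, observation: str) -> Evidence:
        """Append a new record, assign the next ID, and persist immediately."""
        ev = Evidence(f"EV-{len(self.items) + 1:04d}", source, observation)
        self.items[ev.id] = ev
        self.save()
        return ev

    def save(self):
        records = [asdict(e) for e in self.items.values()]
        self.path.write_text(json.dumps(records, indent=2))
```

A flat file like this is easy for sub-agents to read and diff, but it has no locking, so parallel collectors would need to write to separate files or merge through a single writer.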
Overview
Research Findings
How RAPTOR's OSS Forensics Works
Architecture: Multi-Agent Investigation Framework
RAPTOR defines 9 specialized agents that work together in a 7-phase orchestration:
Phase 0: Initialize — Create investigation working directory, initialize evidence.json
Phase 1: Parse Prompt — Extract repos, actors, dates, URLs, and IOCs from investigation request
Phase 2: Parallel Evidence Collection — Spawn 5 specialist investigators simultaneously:
GH Archive Investigator — Queries tamper-proof GitHub event history via BigQuery. Covers all 12 GitHub event types (PushEvent, PullRequestEvent, IssuesEvent, etc.). Detects force pushes via zero-commit PushEvents, automation abuse via WorkflowRunEvents, and deleted content that remains in the immutable archive.
GitHub API Investigator — Queries current GitHub API for commits, issues, PRs, files, branches, tags, releases, forks. Cross-references with archive data to detect discrepancies (content present in archive but missing from current API = deletion detected).
Wayback Machine Investigator — Uses CDX API (web.archive.org/cdx/search/cdx) to search archived snapshots of GitHub pages. Recovers deleted READMEs, issues, PRs, wiki pages, release notes, and fork network pages.
Local Git Investigator — Analyzes the local git repository including dangling commits (force-pushed but not garbage-collected), reflog, blame, and diff analysis.
IOC Extractor — Extracts Indicators of Compromise from vendor security reports: commit SHAs, file paths, API keys, secrets, IP addresses, domains, package names, actor usernames.
Phase 3: Hypothesis Formation — An oss-hypothesis-former-agent synthesizes evidence from all 5 investigators into testable hypotheses. Each hypothesis must cite evidence by ID. Follow-up evidence requests trigger additional collection rounds (max N iterations).
Phase 4: Evidence Verification — An oss-evidence-verifier-agent validates each piece of evidence against original sources using a ConsistencyVerifier.
Phase 5: Hypothesis Validation — An oss-hypothesis-checker-agent mechanically validates hypotheses:
Rejects hypotheses with uncited or unverified evidence
Checks for logical consistency
Produces rebuttals that feed back into hypothesis refinement (max iterations enforced)
Phase 6: Report Generation — An oss-report-generator-agent produces the final forensic report.
Phase 7: Completion
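The mechanical part of hypothesis validation, rejecting anything that cites missing or unverified evidence, can be sketched as below. The `Hypothesis` type and `check_hypothesis` function are hypothetical, not the oss-hypothesis-checker-agent's actual logic, and the logical-consistency check is left out.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    claim: str
    evidence_ids: list[str]


def check_hypothesis(h: Hypothesis, known_ids: set, verified_ids: set) -> list[str]:
    """Return a list of rebuttals; an empty list means the citations hold up."""
    problems = []
    if not h.evidence_ids:
        problems.append("no evidence cited")
    for eid in h.evidence_ids:
        if eid not in known_ids:
            problems.append(f"{eid}: unknown evidence ID")
        elif eid not in verified_ids:
            problems.append(f"{eid}: evidence not verified")
    return problems


# Made-up example for illustration.
known = {"EV-0001", "EV-0002"}
verified = {"EV-0001"}
h = Hypothesis("Actor force-pushed to hide a leaked token", ["EV-0001", "EV-0002", "EV-0009"])
for rebuttal in check_hypothesis(h, known, verified):
    print(rebuttal)
```

Because the check is plain set membership, it cannot be talked around: a hypothesis either cites verified evidence IDs or it is sent back with rebuttals for the next refinement round.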
GitHub Archive Skill (958 lines)
The most detailed skill file. Key capabilities:
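BigQuery cost optimization: Dry runs before every query, column selection (only query needed fields), date range narrowing, _TABLE_SUFFIX filtering
Safe query template: safe_gharchive_query() wrapper with automatic cost controls
Evidence Schema System (Pydantic v2)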
Complete typed evidence framework:
Deleted Commit Recovery Skill (303 lines)
Key insight: Force-pushed commits are NEVER deleted from GitHub servers.
3 access methods:
github.com/<owner>/<repo>/commit/<sha>.patch (append .patch or .diff)
api.github.com/repos/<owner>/<repo>/git/commits/<sha>
git fetch origin <sha> (works even after force push)
Real-world example: Istio supply chain — recovered a PAT token from a force-pushed commit (worth $25k bounty).
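The three access methods above can be generated for a known SHA with a small helper like this one. The function name and the returned keys are hypothetical; it assumes a full 40-character SHA and adds the `https://` scheme that the list leaves implicit.

```python
import re


def commit_access_points(owner: str, repo: str, sha: str) -> dict:
    """Return the web, API, and git forms for reaching one commit by SHA."""
    if not re.fullmatch(r"[0-9a-f]{40}", sha):
        raise ValueError("expected a full 40-character lowercase hex SHA")
    base = f"https://github.com/{owner}/{repo}/commit/{sha}"
    return {
        "patch_url": f"{base}.patch",
        "diff_url": f"{base}.diff",
        "api_url": f"https://api.github.com/repos/{owner}/{repo}/git/commits/{sha}",
        # Argument list suitable for subprocess.run inside a local clone.
        "git_fetch": ["git", "fetch", "origin", sha],
    }
```

The SHA itself typically comes from another source, such as an archived PushEvent, since a force-pushed commit is no longer reachable from any branch.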
Anti-Hallucination Patterns
Current State in Hermes Agent
What we have:
domain-intel skill — passive DNS/WHOIS/SSL reconnaissance
web_extract tool — can fetch archived web pages, GitHub pages
terminal tool — can run git commands, curl, API calls
execute_code tool — can run Python scripts for data analysis
delegate_task tool — can spawn parallel sub-agents (maps to RAPTOR's parallel investigators)
github-* skills — repository management, issues, PRs, code review
session_search — past session recall (useful for multi-session investigations)
What we don't have:
Relevant existing issues:
Implementation Plan
Skill vs. Tool Classification
This should be a skill because:
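All data sources are accessible via existing tools: web_extract (Wayback Machine, GitHub pages), terminal (git commands, curl for GitHub API, BigQuery CLI), execute_code (Python data processing)
Evidence management is file-based (JSON evidence stores)
The investigation methodology is instructions + reasoning — exactly what skills are for
Multi-agent orchestration maps naturally to delegate_task sub-agents
No custom Python integration needed in the agent harness
No binary data, streaming, or real-time events
Bundled vs. Skills Hub: Recommend Skills Hub. OSS forensics is specialized (security teams, incident responders, open-source maintainers). Full capability requires BigQuery credentials for GitHub Archive access.
Category: security (same category as Code Security Audit and Binary Security Analysis skills)
What We'd Need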
Phased Rollout
Phase 1: Basic Investigation Framework
Phase 2: Advanced Evidence Collection
delegate_task (spawn multiple evidence collectors)
Phase 3: Full Forensic Workflow
Pros & Cons
Pros
web_extract, terminal, delegate_task, execute_code cover all data access needs
github-* skills for repo access, domain-intel for infrastructure OSINT
Cons / Risks
Open Questions
Should the hypothesis formation/validation use delegate_task for separate "hypothesis former" and "hypothesis checker" sub-agents, or handle it in a single conversation?