Argus — The Hundred-Eyed AI Guardian
An Autonomous Security Auditor Powered by Google Gemini
Team Phalanx ⚔️
Inspiration
In Greek mythology, Argus Panoptes was a giant with a hundred eyes—the ultimate watchman who never fully slept. As software supply chain attacks and code vulnerabilities continue to surge, we asked ourselves: What if AI could watch code with the same unwavering vigilance?
The inspiration struck during late-night debugging sessions where security flaws—SQL injections, command injections, hardcoded secrets—lurked hidden in seemingly innocent code. Traditional static analysis tools generate noise; they flag issues but leave developers drowning in CVE reports. We envisioned an AI-powered guardian that doesn't just detect threats—it neutralizes them.
Argus was born from a simple philosophy: Security should be autonomous, not advisory.
What it does
Argus is a fully autonomous AI security auditor powered by Google Gemini 2.5 Flash. It operates as a multi-agent system that:
🔍 Scans — Batch-analyzes Python codebases for critical vulnerabilities (SQL Injection, Command Injection, Path Traversal, SSRF, Hardcoded Secrets)
🛡️ Patches — Automatically generates secure code fixes using industry best practices (parameterized queries, subprocess with shell=False, environment variables for secrets)
🧪 Verifies — Creates and executes pytest-compatible reproduction tests to prove patches work before committing
🔄 Self-Evolves — When patches fail repeatedly, a Meta-Programming agent rewrites its own skill files to handle the failure pattern, so the AI effectively teaches itself to become a better security engineer
👁️ Shadow Daemon Mode — A real-time filesystem watchdog that continuously monitors code for threats as developers write
All wrapped in a premium cyberpunk dashboard with live threat intelligence feeds, deep reasoning visualization, cost tracking, and neural voice feedback using Edge TTS.
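To make the fix patterns from the Patches step concrete, here is a minimal sketch of each one in plain Python. It is illustrative only and not Argus's generated output: the `users` table, the function names, and the `EXAMPLE_API_KEY` variable are assumptions made up for this example.

```python
import os
import sqlite3
import subprocess
import sys


def find_user(conn: sqlite3.Connection, username: str):
    """Parameterized query: the driver binds `username` as data, never as SQL."""
    # Hypothetical `users` table, used only for illustration.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (username,))
    return cur.fetchall()


def echo_argument(text: str) -> str:
    """Argument list with shell=False: `text` is never interpreted by a shell."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", text],
        shell=False,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def load_api_key(var_name: str = "EXAMPLE_API_KEY") -> str:
    """Read a secret from the environment instead of hardcoding it."""
    # EXAMPLE_API_KEY is a placeholder name, not a variable Argus defines.
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    return key
```

With the parameterized query, a classic payload such as `' OR '1'='1` is matched as a literal name and returns no rows.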
How we built it
Argus is structured as a hierarchical multi-agent framework:
┌─────────────────────────────────────────────────┐
│ ManagerAgent (Orchestrator) │
│ ┌─────────────┬─────────────┬───────────────┐ │
│ │ ScannerAgent│ PatcherAgent│ ImproverAgent │ │
│ │ (Auditor) │ (Surgeon) │ (Meta-Brain) │ │
│ └─────────────┴─────────────┴───────────────┘ │
│ VerifierAgent │
│ (Quality Gate) │
└─────────────────────────────────────────────────┘
Tech Stack
| Component | Technology |
|---|---|
| LLM Backend | Gemini 2.5 Flash / 2.0 Flash with automatic model rotation |
| Core Language | Python 3.10+ |
| Dashboard UI | Rich (Live tables, layouts, panels) |
| Voice Engine | Edge TTS + Pygame |
| Filesystem Watcher | Watchdog |
| Retry Logic | Tenacity |
| Test Framework | pytest |
Skill System
Each agent reads Markdown "skill files" (audit_code.md, repair_code.md, generate_exploit.md) that define its behavior. This separation of concerns means the agents' capabilities can be upgraded without touching code—the ImproverAgent can even rewrite skills autonomously during operation.
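A rough sketch of how an agent might consume a skill file: read the Markdown from disk and prepend it to the code under review. The helper names, the directory layout, and the prompt separator are assumptions for illustration, not the actual Argus loader.

```python
from pathlib import Path


def load_skill(skills_dir: Path, name: str) -> str:
    """Read a Markdown skill file such as audit_code.md from `skills_dir`."""
    # Assumes one file per skill, named <skill>.md (hypothetical layout).
    return (skills_dir / f"{name}.md").read_text(encoding="utf-8")


def build_prompt(skill_text: str, source_code: str) -> str:
    """Combine the skill instructions with the code the agent should inspect."""
    # The separator line is an arbitrary choice for this sketch.
    return f"{skill_text.strip()}\n\n--- CODE TO REVIEW ---\n{source_code}"
```

Because the instructions live in a file, editing `audit_code.md` changes the agent's behavior on the next run without a code change.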
Challenges we ran into
1. Rate Limit Hell
Gemini's API rate limits hit us hard during intensive multi-file scans. We implemented an automatic model rotation strategy that cascades through available models when errors occur:
$$ \text{gemini-2.5-flash} \xrightarrow{\text{fail}} \text{gemini-2.0-flash-exp} \xrightarrow{\text{fail}} \text{gemini-flash-latest} $$
With this cascade, a scan can continue when a single model is rate-limited or temporarily unavailable.
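The rotation can be sketched as a loop over the model names in the chain above. This is a simplified illustration under assumptions: `call_model` is a hypothetical callable standing in for the real Gemini client call, and any exception is treated as a reason to move on to the next model.

```python
from typing import Callable, Sequence

# Order taken from the cascade described above.
MODEL_CHAIN = ("gemini-2.5-flash", "gemini-2.0-flash-exp", "gemini-flash-latest")


def generate_with_rotation(
    prompt: str,
    call_model: Callable[[str, str], str],
    models: Sequence[str] = MODEL_CHAIN,
) -> tuple[str, str]:
    """Try each model in order; return (model_name, response) from the first success."""
    last_error = None
    for model in models:
        try:
            # `call_model(model, prompt)` is a placeholder for the real API call.
            return model, call_model(model, prompt)
        except Exception as exc:  # e.g. a rate-limit error from the backend
            last_error = exc
    raise RuntimeError("all models failed") from last_error
```

A production version would likely distinguish rate-limit errors from permanent ones and add a delay between attempts, which is where a retry library such as Tenacity fits.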
2. The Verification Paradox
Early versions would generate patches that looked correct but failed on edge cases. We solved this by requiring the AI to generate executable pytest tests for every patch. If the test fails, the patch is rejected and regenerated with error feedback—forming a closed-loop self-correction cycle:
$$ \text{Patch} \xrightarrow{\text{test}} \begin{cases} \text{Pass} \rightarrow \text{Apply} \\ \text{Fail} \rightarrow \text{Regenerate}(+\,\text{ErrorFeedback}) \end{cases} $$
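A minimal sketch of that loop, with the model call and the pytest run abstracted behind two hypothetical callables (`generate_patch` and `run_test`). The attempt limit of 3 is an assumption for the example; what happens after the last failed attempt, such as handing off to the ImproverAgent, is left to the caller.

```python
from typing import Callable, Optional


def patch_until_verified(
    generate_patch: Callable[[Optional[str]], str],
    run_test: Callable[[str], tuple[bool, str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first patch whose test passes, or None if every attempt fails."""
    feedback: Optional[str] = None  # no error feedback on the first attempt
    for _ in range(max_attempts):
        patch = generate_patch(feedback)
        passed, output = run_test(patch)
        if passed:
            return patch
        # Feed the failing test output into the next generation attempt.
        feedback = output
    return None
```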
3. Meta-Programming Stability
Letting an AI rewrite its own instructions is powerful but dangerous. We constrained the ImproverAgent to only add rules to existing skills, preserving format and structure. This prevents catastrophic drift while enabling incremental self-improvement.
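One way to enforce an add-only constraint is to append a single bullet and to reject any proposed skill text that alters the existing lines. The sketch below assumes rules are stored as `- ` bullets at the end of the skill file; both function names are hypothetical and this is not the ImproverAgent's actual code.

```python
def append_rule(skill_text: str, new_rule: str) -> str:
    """Append one rule as a Markdown bullet, leaving existing content untouched."""
    rule = new_rule.strip()
    if not rule or "\n" in rule:
        raise ValueError("rule must be a single non-empty line")
    base = skill_text.rstrip("\n")
    if f"- {rule}" in base.splitlines():
        return skill_text  # rule already present, nothing to add
    return f"{base}\n- {rule}\n"


def is_append_only(old_text: str, new_text: str) -> bool:
    """True if `new_text` keeps every original line, in order, as its prefix."""
    old_lines = old_text.rstrip("\n").splitlines()
    return new_text.splitlines()[: len(old_lines)] == old_lines
```

Running `is_append_only` on a model-proposed rewrite before saving it gives a cheap guard against the drift described above.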
4. Audio Threading Nightmares
The VibeEngine needed to speak asynchronously without blocking the main UI. We built a dedicated threaded worker with a queue.Queue that processes TTS requests in isolation, keeping the dashboard buttery smooth.
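The worker pattern looks roughly like this. It is a sketch, not the VibeEngine itself: `SpeechWorker` is a hypothetical name, and the `speak` callable stands in for the Edge TTS and Pygame playback so the example stays self-contained.

```python
import queue
import threading
from typing import Callable


class SpeechWorker:
    """Runs speech requests on a background thread so the UI never blocks."""

    def __init__(self, speak: Callable[[str], None]):
        self._speak = speak  # placeholder for the real TTS playback function
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def say(self, text: str) -> None:
        """Enqueue text and return immediately."""
        self._queue.put(text)

    def close(self) -> None:
        """Ask the worker to stop after the queued items, then wait for it."""
        self._queue.put(None)  # sentinel value
        self._thread.join()

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:
                break
            try:
                self._speak(item)
            except Exception:
                pass  # a failed utterance should not kill the worker
```

Requests are spoken in the order they were queued, and the dashboard thread only pays the cost of a `put`.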
Accomplishments that we're proud of
🧬 Self-Evolving AI — When patches keep failing, Argus rewrites its own Markdown skill files. This is meta-programming in action: an AI that debugs its own instructions.
💎 Zero False Acceptance — Every patch must pass its automated verification test. No fix is applied unless its reproduction test passes.
🎮 Stunning UX — The cyberpunk terminal dashboard with live threat feeds, thinking logs, and cost estimation makes security auditing feel futuristic.
🌐 Real-Time Guardian — Shadow Daemon mode transforms Argus from a tool into an omnipresent watchdog that protects code as it's written.
💰 CFO-Friendly — Built-in token tracking and cost estimation so you always know the economics of your security scans:
$$ \text{Cost} = \frac{T_{\text{input}}}{10^6} \times \$0.075 + \frac{T_{\text{output}}}{10^6} \times \$0.30 $$
Where $T_{\text{input}}$ and $T_{\text{output}}$ represent input and output token counts respectively.
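As a worked example of the formula, the sketch below hardcodes the two per-million-token rates shown above. The function name is hypothetical, and the rates are copied from the formula rather than checked against current pricing.

```python
# Rates in USD per one million tokens, taken from the formula above.
INPUT_USD_PER_MILLION = 0.075
OUTPUT_USD_PER_MILLION = 0.30


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for a given number of input and output tokens."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    )
```

For a scan that uses 200,000 input tokens and 50,000 output tokens, this gives $0.2 \times 0.075 + 0.05 \times 0.30 = 0.015 + 0.015 = 0.03$ dollars.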
What we learned
Structured JSON outputs from LLMs are game-changers. Using response_mime_type="application/json" eliminated 90% of parsing headaches.
Multi-agent architectures beat monolithic prompts. Breaking the problem into $\text{Scanner} \rightarrow \text{Patcher} \rightarrow \text{Verifier} \rightarrow \text{Improver}$ made each agent hyper-focused and reliable.
Skills as Markdown files are powerful. Treating prompts as editable documents enabled rapid iteration and even autonomous improvement.
Voice feedback adds soul. The VibeEngine's "System Online" greeting turns a CLI tool into an experience. Users feel protected.
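To illustrate the first lesson: once the model is asked for JSON, the response text can go straight to `json.loads` with a light shape check, instead of being scraped out of free-form prose. The findings schema (`file`, `line`, `type`) and the function name are assumptions invented for this sketch, not Argus's actual format, and the config dict only mirrors the setting named above without calling the API.

```python
import json

# Mirrors the setting mentioned above; how it is passed depends on the client library.
GENERATION_CONFIG = {"response_mime_type": "application/json"}

# Hypothetical schema for one scanner finding.
REQUIRED_KEYS = {"file", "line", "type"}


def parse_findings(response_text: str) -> list[dict]:
    """Parse a JSON array of findings and check each entry has the expected keys."""
    data = json.loads(response_text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of findings")
    for item in data:
        if not isinstance(item, dict) or not REQUIRED_KEYS <= item.keys():
            raise ValueError(f"malformed finding: {item!r}")
    return data
```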
What's next for Argus
| Phase | Feature | Description |
|---|---|---|
| 🌍 | Language Expansion | Support for JavaScript, TypeScript, Go, and Rust |
| 📊 | Attack Graph Visualization | Mermaid.js-powered exploit chain diagrams |
| 🔗 | CI/CD Integration | GitHub Actions / GitLab CI pipeline plugins |
| ☁️ | Cloud Deployment | FastAPI backend with real-time WebSocket dashboard |
| 🤝 | Human-in-the-Loop | Approval workflows for high-risk patches |
| 📜 | Compliance Mapping | OWASP Top 10 / CWE / CVE cross-referencing |
*"With a hundred eyes, nothing escapes Argus."* **👁️ Argus — The Hundred-Eyed AI Guardian 👁️** **⚔️ Team Phalanx ⚔️** *Powered by Google Gemini*