Agent Identity Passport

Inspiration

AI agents are increasingly managing production infrastructure — restarting services, rolling back deployments, scaling pods. But when something goes wrong at 3am, nobody knows which agent did what, whether it was authorized, or how to audit the damage.

We asked: What if every AI agent had to show a passport before touching production?

What We Built

A verifiable identity and trust framework for AI agents, inspired by real-world standards such as the IETF Agent Authorization Protocol (AAP) and by the zero-trust security principles used at Okta and Auth0.

Every agent gets a cryptographically signed JWT passport with:

A trust level (HIGH / MEDIUM / LOW)
A policy defining exactly what it can and cannot do
An expiry time
A reputation score that degrades with failures

How It Works

Chaos Event Fires
↓
Agent Requests Passport (name + type + trust level)
↓
Policy Engine checks rules → ALLOW or DENY
↓
Agent presents Passport to perform action
↓
Every action logged to immutable audit trail
↓
Incident marked RESOLVED with resolution time
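
The policy check in the middle of this flow can be sketched roughly as follows. The rule schema (min_trust, allow, deny), the agent type names, the event-to-action mapping, and the BLOCKED status are hypothetical and chosen only to illustrate a deny-by-default decision; the project's real JSON rules may look different.

```python
import json

# Hypothetical rule set: one entry per agent type, with the minimum trust
# level required to act and explicit allow/deny lists.
POLICIES = json.loads("""
{
  "remediation": {
    "min_trust": "MEDIUM",
    "allow": ["restart_service", "rollback_deploy"],
    "deny": ["delete_database"]
  },
  "observer": {"min_trust": "LOW", "allow": ["read_metrics"], "deny": []}
}
""")

TRUST_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}

# Hypothetical mapping from chaos events to the remediation an agent attempts.
REMEDIATION = {"service_down": "restart_service", "bad_deploy": "rollback_deploy"}


def evaluate(agent_type, trust_level, action):
    """Return (decision, reason). Anything not explicitly allowed is denied."""
    policy = POLICIES.get(agent_type)
    if policy is None:
        return "DENY", "unknown agent type"
    if TRUST_RANK.get(trust_level, -1) < TRUST_RANK[policy["min_trust"]]:
        return "DENY", "trust level too low"
    if action in policy["deny"]:
        return "DENY", "action explicitly denied"
    if action not in policy["allow"]:
        return "DENY", "action not in allow list"
    return "ALLOW", "permitted by policy"


def handle_event(event, agent_type, trust_level, audit_log):
    """Run one event through the policy check and record the decision."""
    action = REMEDIATION.get(event)  # None when no remediation is known
    decision, reason = evaluate(agent_type, trust_level, action)
    audit_log.append(
        {"event": event, "action": action, "decision": decision, "reason": reason}
    )
    return "RESOLVED" if decision == "ALLOW" else "BLOCKED"
```

For example, handle_event("bad_deploy", "remediation", "MEDIUM", log) passes the check and returns "RESOLVED", while the same event handled by an "observer" agent is denied because rollback_deploy is not in its allow list. The decision is logged in both cases, so denied attempts stay visible in the audit trail.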

Key Features

🔥 Chaos Simulator — Fire service_down, bad_deploy, memory_leak, cache_miss events
⚡ Auto-Healing Pipeline — Chaos fires → agent dispatched → resolved in milliseconds, fully automated
🔐 Policy Engine — JSON rules defining what each agent type can/cannot do
🌐 Service Monitor — Real URL health monitoring with auto-recovery
📊 Incident Timeline — Visual DETECTED → DISPATCHED → RESOLVED flow
🏆 Reputation Scoring — Track agent reliability, auto-block bad agents
🌍 Multi-tenant — Isolated environments per organization
📋 Audit Trail — Immutable log of every agent action
📈 Prometheus Metrics — Full observability at /metrics
🔌 WebSocket — Real-time push updates to dashboard
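
Two of the features above, Reputation Scoring and the Audit Trail, can be illustrated with a small sketch. It assumes a hash-chained log, which makes tampering detectable rather than impossible, and a simple fixed penalty per failure. The class names, the penalty of 20, and the blocking threshold of 50 are hypothetical values, not the project's actual parameters.

```python
import hashlib
import json


class AuditTrail:
    """Append-only log in which each entry's hash covers the previous entry's hash."""

    GENESIS = "0" * 64  # placeholder hash for the first entry

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(record):
        # Hash every field except the hash itself, with a stable key order.
        body = {k: v for k, v in record.items() if k != "hash"}
        return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()

    def append(self, agent, action, outcome):
        prev_hash = self.entries[-1]["hash"] if self.entries else self.GENESIS
        record = {"agent": agent, "action": action, "outcome": outcome, "prev_hash": prev_hash}
        record["hash"] = self._digest(record)
        self.entries.append(record)
        return record

    def verify(self):
        """Return True only if no entry was altered, removed, or reordered."""
        prev_hash = self.GENESIS
        for record in self.entries:
            if record["prev_hash"] != prev_hash or record["hash"] != self._digest(record):
                return False
            prev_hash = record["hash"]
        return True


class Reputation:
    """Per-agent score that drops on failures and blocks agents below a threshold."""

    def __init__(self, start=100, penalty=20, block_below=50):
        self.start = start
        self.penalty = penalty
        self.block_below = block_below
        self.scores = {}

    def record(self, agent, success):
        score = self.scores.get(agent, self.start)
        if not success:
            score = max(0, score - self.penalty)
        self.scores[agent] = score
        return score

    def is_blocked(self, agent):
        return self.scores.get(agent, self.start) < self.block_below
```

With these defaults an agent is blocked after its third failure (100, 80, 60, 40). Editing any stored audit entry afterwards causes verify() to return False, because the recomputed hash no longer matches the stored one.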

Load Test Results

100 concurrent users
4,553 total requests
0% failure rate
76 req/sec average
8 ms average latency

Challenges We Faced

Migrating from SQLite to PostgreSQL mid-hackathon while keeping all features working
Getting WebSocket (Flask-SocketIO + Eventlet) to work correctly with the template stack
Deploying to Railway with the correct DATABASE_URL format and environment variable references
Designing a trust model that is both secure and flexible enough for real production use cases

What We Learned

How real SRE teams implement zero-trust for automated systems
The importance of audit trails in production, not just for debugging but also for compliance
How chaos engineering reveals hidden failure modes
Railway deployment with PostgreSQL and WebSocket support