💡 Inspiration

Urban infrastructure maintenance is fundamentally broken because it is reactive. Cities are essentially blind: a citizen might report a pothole, but legacy systems have no objective way to measure its severity remotely. Worse, these systems lack "memory." If a road is repaired and breaks again two weeks later, it is logged as a brand-new issue rather than flagged as a possible structural failure or a sign of poor workmanship.

I wanted to build a system that shifts urban governance from reactive repairs to proactive auditing. While a standard vision model can detect a pothole, it can't reason about budget constraints, compute a safety risk deterministically, or remember environmental context. That's why I engineered CivicSight: an autonomous, multi-agent auditor that accounts for the history and severity of the infrastructure it looks at.

⚙️ How I Built It

I designed CivicSight as a Sequential Multi-Agent Pipeline built on Google's Gemini 2.0 Flash model. The architecture is split into four distinct layers to keep results accurate and to limit the effect of hallucinations:

  1. 👁️ Agent-V (The Perception Layer): When an image is uploaded via the Streamlit interface, this agent analyzes it using Gemini 2.0 Flash's vision input to extract structured metadata (e.g., damage type, severity on a 1-10 scale, and environmental hazards like heavy traffic).
  2. 🧮 The Risk Engine (The Logic Layer): Instead of relying on an LLM to guess a safety score, I built a deterministic Python tool. It calculates a precise Risk Index (0-100) based on strict rules (e.g., if near_school == True, add 20 points).
  3. 🧠 Agent-M (The Context Layer): This agent queries a persistent JSON database to retrieve historical data for the specific location. It flags recurring issues, providing the system with "long-term memory."
  4. 👷 Agent-P (The Reasoning Layer): Finally, the Planner agent synthesizes the visual data, the deterministic risk score, and the historical context to draft a concrete, 3-point technical remediation plan and budget estimate.
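To make the Logic Layer concrete, here is a minimal sketch of what a deterministic risk tool along these lines could look like. Only the `near_school` rule (+20 points) and the 0-100 range come from the description above; the `Observation` field names, the severity weight, and the traffic weight are assumptions for illustration, not the actual CivicSight rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Structured metadata assumed to come from Agent-V (field names are hypothetical)."""
    damage_type: str
    severity: int              # 1-10 scale, as described above
    near_school: bool = False
    heavy_traffic: bool = False


def risk_index(obs: Observation) -> int:
    """Return a deterministic Risk Index clamped to the range 0-100."""
    if not 1 <= obs.severity <= 10:
        raise ValueError(f"severity must be 1-10, got {obs.severity}")
    score = obs.severity * 7   # assumed weight: severity contributes up to 70
    if obs.near_school:
        score += 20            # rule quoted in the write-up
    if obs.heavy_traffic:
        score += 10            # assumed weight
    return max(0, min(100, score))


print(risk_index(Observation("pothole", severity=5, near_school=True)))
```

Because the function is pure, the same observation always produces the same score, which is what makes the result auditable.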

🚧 Challenges I Faced

The biggest challenge was AI Safety and Determinism. In critical civic infrastructure, you cannot have an AI "hallucinating" a safety score. I overcame this by strictly decoupling the Vision layer from the Risk layer. The AI is only allowed to observe and extract data; the actual risk calculation is handled by hard-coded Python logic.
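One way to enforce that decoupling is to validate the model's reply before anything reaches the risk logic, keeping only observed fields and discarding any score the model volunteers. The sketch below assumes Agent-V replies with a JSON object; the key names and the `ALLOWED_DAMAGE_TYPES` set are hypothetical.

```python
import json

# Hypothetical whitelist; the real damage categories are not listed in this write-up.
ALLOWED_DAMAGE_TYPES = {"pothole", "crack", "flooding", "other"}


def parse_vision_output(raw: str) -> dict:
    """Validate the model's JSON reply and return only the observed fields."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    damage_type = data.get("damage_type")
    if damage_type not in ALLOWED_DAMAGE_TYPES:
        raise ValueError(f"unknown damage_type: {damage_type!r}")

    severity = data.get("severity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(severity, bool) or not isinstance(severity, int):
        raise ValueError("severity must be an integer")
    if not 1 <= severity <= 10:
        raise ValueError(f"severity out of range: {severity}")

    # Anything else the model returned (e.g. its own "risk_score") is dropped here.
    return {
        "damage_type": damage_type,
        "severity": severity,
        "near_school": data.get("near_school") is True,
        "heavy_traffic": data.get("heavy_traffic") is True,
    }
```

With this in place, a reply such as `{"damage_type": "pothole", "severity": 7, "risk_score": 99}` passes validation, but the `risk_score` key never reaches the risk calculation.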

Another hurdle was managing token limits and context windows as the incident database grew. I implemented a Context Compaction strategy where Agent-M summarizes historical logs into a dense format before passing them to the reasoning agent, keeping the system fast and cost-effective.
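As a rough illustration of the compaction idea, the sketch below collapses a location's incident log into a single line before it is handed to the reasoning agent. It is a rule-based stand-in rather than the agent-driven summarization described above, and the record keys (`date`, `damage_type`) are assumptions about the JSON schema.

```python
from collections import Counter
from datetime import date


def compact_history(incidents: list[dict], today: date, recent_days: int = 30) -> str:
    """Summarize a location's incident log as one dense line of text."""
    if not incidents:
        return "No prior incidents at this location."

    dates = sorted(date.fromisoformat(i["date"]) for i in incidents)
    counts = Counter(i["damage_type"] for i in incidents)
    # Count incidents that fall inside the recent window (ignoring future dates).
    recent = sum(1 for d in dates if 0 <= (today - d).days <= recent_days)
    types = ", ".join(f"{name} x{n}" for name, n in counts.most_common())

    return (
        f"{len(incidents)} prior incidents ({types}); "
        f"first {dates[0].isoformat()}, last {dates[-1].isoformat()}; "
        f"{recent} in the last {recent_days} days."
    )


history = [
    {"date": "2025-01-05", "damage_type": "pothole"},
    {"date": "2025-01-19", "damage_type": "pothole"},
    {"date": "2024-06-01", "damage_type": "crack"},
]
print(compact_history(history, today=date(2025, 1, 25)))
# 3 prior incidents (pothole x2, crack x1); first 2024-06-01, last 2025-01-19; 2 in the last 30 days.
```

The summary length stays roughly constant no matter how many records a location accumulates, which is the property that keeps prompt size under control.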

🎓 What I Learned

Building CivicSight deepened my understanding of Agentic Orchestration. I learned how to move beyond simple chat interfaces and build agents that use tools, query databases, and pass structured JSON data sequentially. I also learned the importance of Observability: by building a "Live Agent Trace" into the UI, I made each step of the pipeline visible, which is crucial for gaining the trust of city officials.
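A sequential pipeline with a trace can be sketched in a few lines. This is a simplified illustration, not the CivicSight code: the `run_pipeline` helper, the stage names, and the stub stages are all hypothetical, and real stages would call the model or the JSON store.

```python
import json
from typing import Callable

Stage = Callable[[dict], dict]


def run_pipeline(stages: list[tuple[str, Stage]], payload: dict) -> tuple[dict, list[dict]]:
    """Run stages in order, merging each output into the shared state and logging it."""
    state = dict(payload)
    trace = []
    for name, stage in stages:
        output = stage(state)
        # The JSON round-trip snapshots the output and fails fast if it is not serializable.
        trace.append({"agent": name, "output": json.loads(json.dumps(output))})
        state = {**state, **output}
    return state, trace


# Stub stages standing in for the real agents (values are made up for the demo).
stages = [
    ("Agent-V", lambda s: {"damage_type": "pothole", "severity": 6}),
    ("Risk Engine", lambda s: {"risk_index": s["severity"] * 7}),
]
final_state, trace = run_pipeline(stages, {"image_id": "demo"})
for step in trace:
    print(step["agent"], "->", step["output"])
```

A UI could render the `trace` list step by step, so a reviewer sees exactly which stage produced which field.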

🚀 What's Next for CivicSight

The next step is integrating this pipeline with existing municipal service-request systems (for example, through the Open311 API standard) and adding a geospatial mapping dashboard that uses satellite imagery to predict infrastructure decay before it happens.
