CivicSight: AI-Powered Infrastructure Auditor

The CivicSight launch screen, prompting the user to enter an Incident Context location to activate the multi-agent pipeline.
Entering the location address (e.g., MG Road, New Delhi) sets the parameters for the Memory Agent before uploading site imagery.
Uploading a street hazard photo and initiating the autonomous audit by triggering the "Run Audit Pipeline" execution.
The Vision Data output panel detailing the detected damage type (Pothole), a 6/10 visual severity score, and environmental hazards.
History Log querying persistent memory to identify recurring infrastructure issues.
Agent-P's drafted technical Repair Plan alongside the "Live Agent Trace," ensuring full observability of the AI's thought process.

💡 Inspiration

Urban infrastructure maintenance is fundamentally broken because it operates on a "reactive" basis. Cities are essentially blind; a citizen might report a pothole, but legacy systems have no objective way to measure its severity remotely. Worse, these systems lack "memory." If a road is repaired and breaks again two weeks later, it's treated as a brand-new issue rather than a structural failure or a sign of poor workmanship.

I wanted to build a system that shifts urban governance from reactive repairs to proactive auditing. While standard AI scripts can detect a pothole, they can't reason about budget constraints, calculate safety risk deterministically, or remember environmental context. That’s why I engineered CivicSight: an autonomous, multi-agent auditor that takes into account the history and severity of the infrastructure it looks at.

⚙️ How I Built It

I designed CivicSight as a sequential multi-agent pipeline built on Google's Gemini 2.0 Flash model. The architecture is split into four distinct layers so that each step can be checked on its own and the impact of hallucinations stays limited:

👁️ Agent-V (The Perception Layer): When an image is uploaded via the Streamlit interface, this agent analyzes it using Gemini 2.0 Flash's vision capabilities to extract structured metadata (e.g., damage type, a severity score from 1 to 10, and environmental hazards like heavy traffic).
🧮 The Risk Engine (The Logic Layer): Instead of relying on an LLM to guess a safety score, I built a deterministic Python tool. It calculates a precise Risk Index (0-100) based on strict rules (e.g., if near_school == True, add 20 points).
🧠 Agent-M (The Context Layer): This agent queries a persistent JSON database to retrieve historical data for the specific location. It flags recurring issues, providing the system with "long-term memory."
👷 Agent-P (The Reasoning Layer): Finally, the Planner agent synthesizes the visual data, the deterministically computed risk score, and the historical context to draft a concrete, 3-point technical remediation plan and budget estimate.
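
The Risk Engine described above could look roughly like the following sketch. Only the near_school rule (+20 points) comes from this write-up; the VisionReport field names, the base weight per severity point, and the heavy-traffic weight are assumptions chosen for illustration, not the project's actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisionReport:
    """Structured metadata from Agent-V (field names are assumed)."""
    damage_type: str
    severity: int                 # visual severity on the 1-10 scale
    near_school: bool = False
    heavy_traffic: bool = False


def risk_index(report: VisionReport) -> int:
    """Return a deterministic Risk Index in the range 0-100."""
    if not 1 <= report.severity <= 10:
        raise ValueError(f"severity must be 1-10, got {report.severity}")

    score = report.severity * 6       # assumed base weight: at most 60 points
    if report.near_school:
        score += 20                   # rule quoted in the write-up
    if report.heavy_traffic:
        score += 15                   # assumed weight
    return max(0, min(100, score))    # clamp to the documented 0-100 range


report = VisionReport("Pothole", severity=6, near_school=True)
print(risk_index(report))  # 6 * 6 + 20 = 56
```

Because the function is pure Python with no model call, the same report always yields the same score, and each rule can be unit-tested in isolation.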

🚧 Challenges I Faced

The biggest challenge was AI Safety and Determinism. In critical civic infrastructure, you cannot have an AI "hallucinating" a safety score. I overcame this by strictly decoupling the Vision layer from the Risk layer. The AI is only allowed to observe and extract data; the actual risk calculation is handled by hard-coded Python logic.
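
One way to enforce that boundary is to validate the vision agent's reply before any risk logic sees it. The sketch below is an assumption about how that could be done: the JSON keys (damage_type, severity, hazards) and the allowed damage vocabulary are hypothetical, and parse_vision_output is an illustrative helper, not a function from the project.

```python
import json

# Assumed vocabulary; the real system may recognise other damage types.
ALLOWED_DAMAGE_TYPES = {"pothole", "crack", "flooding", "debris"}


def parse_vision_output(raw: str) -> dict:
    """Validate the model's JSON reply before it reaches the risk logic."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    damage_type = str(data.get("damage_type", "")).strip().lower()
    if damage_type not in ALLOWED_DAMAGE_TYPES:
        raise ValueError(f"unknown damage_type: {damage_type!r}")

    severity = data.get("severity")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(severity, bool) or not isinstance(severity, int):
        raise ValueError("severity must be an integer")
    if not 1 <= severity <= 10:
        raise ValueError(f"severity out of range: {severity}")

    hazards = data.get("hazards", [])
    if not isinstance(hazards, list) or not all(isinstance(h, str) for h in hazards):
        raise ValueError("hazards must be a list of strings")

    # Only whitelisted keys are passed on; a model-supplied risk score is dropped.
    return {"damage_type": damage_type, "severity": severity, "hazards": hazards}


reply = '{"damage_type": "Pothole", "severity": 6, "hazards": ["heavy traffic"], "risk": 99}'
print(parse_vision_output(reply))
```

Dropping unknown keys means that even if the model volunteers its own "risk" value, that number never reaches the deterministic calculation.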

Another hurdle was managing token limits and context windows as the incident database grew. I implemented a Context Compaction strategy where Agent-M summarizes historical logs into a dense format before passing them to the reasoning agent, keeping the system fast and cost-effective.
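
As a rough illustration of context compaction, the sketch below collapses a location's incident log into a single line and flags recurrence. It is a deterministic stand-in: the write-up does not say whether Agent-M summarizes with the model or with plain code, and the record keys (location, date, damage_type), the sample entries, and the two-incident recurrence threshold are all assumptions.

```python
from collections import Counter


def compact_history(incidents: list[dict], location: str, max_recent: int = 3) -> str:
    """Collapse one location's incident log into a short line for the planner."""
    matches = sorted(
        (i for i in incidents if i["location"] == location),
        key=lambda i: i["date"],  # ISO-8601 dates sort correctly as strings
    )
    if not matches:
        return f"{location}: no prior incidents"

    counts = Counter(i["damage_type"] for i in matches)
    types = ", ".join(f"{name} x{n}" for name, n in counts.most_common())
    recent = ", ".join(i["date"] for i in matches[-max_recent:])
    # Assumed threshold: two or more reports at one location count as recurring.
    status = "recurring" if len(matches) >= 2 else "single prior report"
    return (f"{location}: {len(matches)} prior incident(s) ({types}); "
            f"latest: {recent}; status: {status}")


# Illustrative sample data, not real incident records.
log = [
    {"location": "MG Road, New Delhi", "date": "2026-01-10", "damage_type": "Pothole"},
    {"location": "MG Road, New Delhi", "date": "2026-02-02", "damage_type": "Pothole"},
    {"location": "Ring Road", "date": "2026-02-15", "damage_type": "Crack"},
]
print(compact_history(log, "MG Road, New Delhi"))
```

The planner prompt then grows with the number of distinct damage types and the max_recent cap rather than with the full length of the log.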

🎓 What I Learned

Building CivicSight deepened my understanding of agentic orchestration. I learned how to move beyond simple chat interfaces and build agents that use tools, query databases, and pass structured JSON data sequentially. I also learned the importance of observability: by building a "Live Agent Trace" into the UI, I made each step of the AI's reasoning visible, which is crucial for gaining the trust of city officials.
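
A minimal sketch of how sequential hand-off and tracing can fit together is shown below. The run_pipeline helper and the two stand-in agents are hypothetical; the real agents would call Gemini or read the JSON store, and the placeholder risk formula here is not the project's.

```python
import json
import time
from typing import Callable

Step = Callable[[dict], dict]


def run_pipeline(steps: list[tuple[str, Step]], payload: dict) -> tuple[dict, list[dict]]:
    """Run agents in order, recording what each one received and returned."""
    trace = []
    for name, step in steps:
        started = time.perf_counter()
        result = step(payload)
        trace.append({
            "agent": name,
            "input_keys": sorted(payload),
            "output": result,
            "seconds": round(time.perf_counter() - started, 4),
        })
        payload = {**payload, **result}  # later agents see earlier results
    return payload, trace


# Stand-in agents for illustration only.
def agent_v(p: dict) -> dict:
    return {"damage_type": "Pothole", "severity": 6}


def risk_engine(p: dict) -> dict:
    return {"risk_index": p["severity"] * 10}  # placeholder formula


final, trace = run_pipeline(
    [("Agent-V", agent_v), ("Risk Engine", risk_engine)],
    {"location": "MG Road, New Delhi"},
)
for entry in trace:
    print(json.dumps(entry))
```

In a Streamlit app, each trace entry could be rendered with st.json inside an st.expander, which would give a per-agent log similar to the trace panel described above.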

🚀 What's Next for CivicSight

The next step is integrating this pipeline with existing municipal service-request systems (for example, through the Open311 standard) and adding a geospatial mapping dashboard that uses satellite imagery to predict infrastructure decay before it happens.

Built With

google-gemini
huggingface
json
pandas
python
streamlit

Submitted to

Code Gemini

Created by

For the CivicSight project, I served as the primary developer and architect, responsible for the end-to-end design, implementation, and deployment of the autonomous infrastructure auditing system. My contributions focused on creating a transparent, reliable, and intelligent pipeline for smart city governance.

Key Technical Contributions
Multi-Agent Orchestration: I architected and implemented a sequential multi-agent pipeline using Google Gemini 2.0 Flash to handle complex tasks across distinct layers of perception, logic, context, and reasoning.

Vision-Logic Decoupling: To ensure safety-critical accuracy, I developed a custom deterministic Risk Engine in Python, separating visual observation (Agent-V) from the actual risk calculation to prevent AI hallucinations.

Long-Term Memory System: I engineered a persistent memory layer using a JSON database that allows the system to identify recurring infrastructure issues at specific locations, providing long-term contextual awareness.

Observability Framework: I built a Live Agent Trace feature into the Streamlit UI, which displays real-time logs of the agents' thought processes to ensure the system is explainable and trustworthy for city officials.

Context Engineering: I implemented context compaction strategies to efficiently summarize historical incident logs, ensuring the system remains fast and cost-effective while processing growing amounts of data.

System Features Implemented
Quantified Risk Assessment: Automated the transformation of raw images into a structured Risk Index (0-100) based on severity and environmental metadata like proximity to schools or heavy traffic.

Autonomous Planning: Developed the logic for Agent-P to synthesize all data points and instantly generate a 3-point technical repair plan and budget estimate.

Full-Stack Deployment: Built the entire user interface using Streamlit and successfully deployed the application to Hugging Face Spaces for real-time interaction.

Shreyas Patankar

Updates

Shreyas Patankar started this project — Mar 19, 2026 01:51 AM EDT
