Inspiration
At the Hack for RVA kickoff, the Deputy Director of IT Strategy said something that stuck with us:
"It took me three days to get through all the contract materials to make sure the exact same purchase I made from the year before was still valid. It wasn't."
Richmond manages 1,365 contracts worth $6.76 billion across 37 departments. Staff review each one by manually searching multiple databases, reading hundreds of pages of PDFs, and checking federal compliance lists one by one. We wanted to ask: what if all of that information came to you instead of you going to find it?
What it does
RVA Contract Lens federates 8 independent data sources into a single AI-powered decision interface. A procurement officer selects a contract and gets a complete intelligence brief in ~8 seconds — compliance status, vendor history, price trends, concentration risk, contract terms from OCR'd PDFs, and public reputation — all cited and transparent.
The AI recommends. Humans decide. Every recommendation is saved to an audit trail so institutional knowledge isn't lost when staff turn over.
A separate public transparency view lets any Richmond resident explore where $6.76B in contracts goes — by department, vendor, or service — without filing a FOIA request.
How we built it
- Frontend: Next.js 14 + shadcn/ui + Tailwind + Recharts + TanStack Table
- Backend: FastAPI with 12 routers and 46 API endpoints
- Data: DuckDB (embedded analytics) + ChromaDB (vector search over OCR'd contract PDFs)
- AI: Groq (llama-3.3-70b) with structured JSON output and model fallback
- Compliance: Live SAM.gov API + FCC Covered List + Consolidated Screening List — checked in parallel
- OCR: the unstructured library, extracting 176K+ characters from scanned contract PDFs
- Deployment: Docker Compose + Cloudflare Tunnel
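The parallel compliance checks in the stack above can be sketched with asyncio. The three checker coroutines here are stand-ins for the real HTTP calls to SAM.gov, the FCC Covered List, and the Consolidated Screening List:

```python
import asyncio

# Hypothetical per-source checks; the real versions would make HTTP
# requests to SAM.gov, the FCC Covered List, and the Consolidated
# Screening List respectively.
async def check_sam(vendor: str) -> dict:
    await asyncio.sleep(0.01)  # stand-in for network latency
    return {"source": "sam.gov", "excluded": False}

async def check_fcc(vendor: str) -> dict:
    await asyncio.sleep(0.01)
    return {"source": "fcc_covered_list", "listed": False}

async def check_csl(vendor: str) -> dict:
    await asyncio.sleep(0.01)
    return {"source": "csl", "hit": False}

async def compliance_brief(vendor: str) -> list[dict]:
    # All three sources are queried concurrently, so total latency is
    # close to the slowest single source rather than the sum of all three.
    return await asyncio.gather(
        check_sam(vendor), check_fcc(vendor), check_csl(vendor)
    )

results = asyncio.run(compliance_brief("Acme Corp"))
```

Because `asyncio.gather` preserves argument order, the brief can attribute each result to its source deterministically.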
The architecture is domain-agnostic: the federated intelligence pattern works for any civic domain where staff make recurring decisions across fragmented systems — grants, permitting, fleet maintenance.
Challenges we ran into
- PDF extraction quality — scanned government contracts vary wildly. We used the unstructured library with multiple parsing strategies and added fallback logic for partially extracted documents.
- API rate limits — Groq's free tier caps at 30 requests/minute. We implemented model cascading (70b → 8b) and pacing logic to stay within limits during live demos.
- Compliance data gaps — The Trade.gov Consolidated Screening List API was retired. SAM.gov's exclusions endpoint requires an Entity Management role we couldn't get in 36 hours. We implemented offline keyword matching as a working fallback.
- Keeping AI transparent — It's easy to build a black box. We built three transparency layers: what data the AI saw, how each source influenced the recommendation (signed impact factors), and an exportable decision memo with full citations.
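The model-cascading fallback described above can be sketched roughly like this. `call_model` and `RateLimited` are placeholders standing in for a Groq SDK wrapper, not the SDK's actual interface, and the model IDs are assumptions:

```python
import time

RATE_LIMIT_RPM = 30            # free-tier cap mentioned in the write-up
MIN_INTERVAL = 60.0 / RATE_LIMIT_RPM
_last_call = 0.0

def _pace() -> None:
    """Sleep just long enough to stay under the requests/minute cap."""
    global _last_call
    wait = MIN_INTERVAL - (time.monotonic() - _last_call)
    if wait > 0:
        time.sleep(wait)
    _last_call = time.monotonic()

class RateLimited(Exception):
    """Raised by the (hypothetical) model wrapper on an HTTP 429."""

def complete(prompt: str, call_model) -> tuple[str, str]:
    """Try the large model first, cascade to the small one on a 429.

    call_model(model, prompt) is a placeholder for a Groq SDK call.
    """
    for model in ("llama-3.3-70b-versatile", "llama-3.1-8b-instant"):
        _pace()
        try:
            return model, call_model(model, prompt)
        except RateLimited:
            continue  # fall through to the next, cheaper model
    raise RuntimeError("all models rate-limited")
```

Pacing before every attempt means even the fallback request counts against the per-minute budget, which keeps a live demo from burning through the cap.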
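The offline keyword-matching fallback for the compliance gaps might look roughly like this; the cached entries and the matching rule are illustrative, not the real screening lists:

```python
# Locally cached snapshot of screening-list entries. These two names
# are made up for illustration.
CACHED_SCREENING_ENTRIES = [
    "globex trading co",
    "initech holdings",
]

def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so matching is forgiving."""
    return " ".join(name.lower().split())

def offline_screen(vendor: str) -> bool:
    """Return True if the vendor matches any cached screening entry."""
    v = normalize(vendor)
    return any(entry in v or v in entry for entry in CACHED_SCREENING_ENTRIES)
```

Substring matching in both directions catches partial names ("Globex Trading" vs "Globex Trading Co"), at the cost of some false positives, which is an acceptable trade-off when a human reviews every flag anyway.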
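One way to picture the signed-impact-factor layer: each data source contributes a signed score, and the sum drives the recommendation. The factor names, weights, and threshold below are invented for illustration, not the app's actual schema:

```python
def recommend(factors: dict[str, float]) -> dict:
    """Combine signed per-source scores into a transparent recommendation."""
    total = sum(factors.values())
    return {
        "recommendation": "approve" if total >= 0 else "review",
        # Sort by magnitude so the most influential sources surface first.
        "impact_factors": dict(
            sorted(factors.items(), key=lambda kv: -abs(kv[1]))
        ),
        "net_score": round(total, 2),
    }

brief = recommend({
    "compliance_clear": +0.4,
    "price_trend_rising": -0.2,
    "vendor_history_good": +0.3,
    "concentration_risk": -0.6,
})
```

Exposing both the signed contributions and the net score lets a reviewer see not just what the AI recommends but which source pushed it there.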
What we learned
Government procurement is a harder problem than it looks. The data exists but it's fragmented across federal, state, and local systems with no common schema. The real innovation isn't the AI — it's the federation layer that brings 8 sources together so a human can make a better decision faster.
What's next
Multi-city deployment. The platform takes a Socrata dataset ID as input — any city with open contract data can use the same architecture. Richmond is the proof of concept; the pattern scales nationally.
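A minimal sketch of that Socrata-based bootstrap, assuming a standard SODA JSON endpoint (the domain and dataset ID used in the test are placeholders, not Richmond's real ones):

```python
import json
import urllib.request

def soda_url(domain: str, dataset_id: str, limit: int = 100) -> str:
    """Build a SODA API URL for the first `limit` rows of a dataset."""
    return f"https://{domain}/resource/{dataset_id}.json?$limit={limit}"

def fetch_contracts(domain: str, dataset_id: str, limit: int = 100) -> list[dict]:
    """Pull contract rows from any city's open-data portal."""
    with urllib.request.urlopen(soda_url(domain, dataset_id, limit)) as resp:
        return json.load(resp)
```

Since every Socrata portal exposes the same `/resource/{id}.json` shape, swapping cities really is just swapping two strings.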
Built With
- api
- chromadb
- cloudflare-tunnel
- data
- docker
- duckdb
- duckduckgo
- fastapi
- groq
- llama-3.3
- next.js
- open
- python
- react
- recharts
- sam.gov-api
- shadcn/ui
- socrata
- tailwind-css
- tanstack-query
- tanstack-table
- typescript