UrbanBuzz AI - Intelligent Navigation for Everyone

Inspiration

Have you ever ordered food from DoorDash and watched your driver struggle to find your exact location? Picture this: your delivery partner is just a mile away, but in a city like New York where all the buildings look the same, how can they feel confident they're standing at the right spot?

This problem isn't unique to delivery drivers. Think about:

  • People with disabilities who need precise, accessible routes with detailed information about every turn, curb, and obstacle
  • Tourists and newcomers navigating unfamiliar cities without knowing street names
  • Anyone who needs to see exactly where they're going, not just follow a blue line on a map

Traditional navigation apps tell you "Turn left on Main Street" - but what if you don't know which building is which? What if you're in a wheelchair and need to know about every curb cut and ramp? What if you need to see the actual location before you arrive?

We built UrbanBuzz AI to solve these real-world navigation challenges by combining AI-powered vision analysis, real-time accessibility scoring, and landmark-based directions that anyone can follow.


What it does

UrbanBuzz AI is a voice-controlled navigation platform that addresses these challenges with six capabilities:

🗣️ Voice-First Navigation

Natural conversation with AI that gives landmark-based directions ("Turn left at the blue building" instead of street names). Full voice control - no keyboard needed.

👁️ Visual Route Exploration

Step-by-step Street View images along your route with 4-directional views at every stop. AI analyzes each image for accessibility, safety, and route conditions.
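As a rough illustration of the 4-directional views, the sketch below builds four Street View Static API image URLs for one stop, one per compass heading. The function name, the image size, and the choice of 0/90/180/270 degree headings are assumptions made for this example, not details taken from the project.

```python
from urllib.parse import urlencode

# Public endpoint of the Google Street View Static API.
STREET_VIEW_URL = "https://maps.googleapis.com/maps/api/streetview"

# Assumed headings for the four views (degrees clockwise from north).
HEADINGS = (("north", 0), ("east", 90), ("south", 180), ("west", 270))


def four_direction_urls(lat, lng, api_key, size="640x400"):
    """Return one Street View image URL per compass direction for a stop."""
    urls = {}
    for name, heading in HEADINGS:
        params = {
            "size": size,                 # image size in pixels, WIDTHxHEIGHT
            "location": f"{lat},{lng}",   # stop coordinates
            "heading": heading,           # camera direction
            "key": api_key,               # your own API key
        }
        urls[name] = f"{STREET_VIEW_URL}?{urlencode(params)}"
    return urls


if __name__ == "__main__":
    for direction, url in four_direction_urls(40.758, -73.9855, "YOUR_API_KEY").items():
        print(direction, url)
```

Each stop along the route would get its own set of four URLs, which the gallery can then load step by step.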

♿ Accessibility-First Design

Deep accessibility analysis for wheelchair users and people with disabilities. Detailed curve analysis, real-time accessibility scoring, and step-by-step guidance with obstacle warnings.

🛡️ Safety Intelligence

Location-based safety statistics, real-time incident detection, and automatic re-routing for safer navigation.

🚀 Advanced Route Optimization

ML-powered route optimization with Graph Neural Networks, real-time traffic/weather integration, and 10.1x performance improvement through intelligent caching.

🎯 Smart Place Finding

Find closest transit stations, nearby places with ratings, and context-aware recommendations.


How we built it

Frontend Stack

Technology            | Purpose
Next.js + TypeScript  | Modern React framework for the web app
OpenAI Realtime API   | Real-time voice interaction with function calling
Google Maps Platform  | Directions, Street View, Places, Geocoding
Framer Motion         | Smooth animations and transitions

Key Components:

  • Voice Agent with WebRTC integration
  • Street View Gallery with step-by-step visualization
  • Safety Analysis overlay with comprehensive statistics
  • Navigation Maps with route overlays
  • Voice-controlled overlay system

Backend Stack

Technology         | Purpose
FastAPI + Python   | High-performance backend API
OR-Tools           | Advanced route optimization algorithms
PyTorch            | Graph Neural Networks for ML models
Redis              | Intelligent caching layer (10.1x speedup)
OpenAI Vision API  | Image analysis for accessibility

Key Services:

  • Route Optimization Engine (OR-Tools VRP solver)
  • ML Models (Graph Neural Networks for service time prediction)
  • Accessibility Engine (Computer vision analysis)
  • Real-time Data Integration (Google Maps, weather, incidents)

AI/ML Architecture

┌─────────────────────────────────────────┐
│         UrbanBuzz AI Stack              │
├─────────────────────────────────────────┤
│  🧠 Graph Neural Networks               │
│     └─ Service time prediction          │
│  👁️ Computer Vision                     │
│     └─ Accessibility & safety analysis  │
│  🔄 Warm-start Clustering               │
│     └─ Intelligent route initialization │
│  ⚠️ Risk Assessment Models              │
│     └─ Route safety prediction          │
└─────────────────────────────────────────┘

Challenges we ran into

1. Real-Time Voice Processing

Challenge: Integrating OpenAI Realtime API with WebRTC for seamless voice communication

Solution: Implemented custom WebRTC configuration with optimized audio codecs and SDP manipulation for high-quality audio

2. Accessibility Analysis at Scale

Challenge: Analyzing thousands of Street View images for accessibility in real time

Solution: Implemented on-demand analysis (only when images are clicked) and intelligent caching to reduce API calls
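The on-demand idea can be sketched as follows: nothing is analyzed until an image is clicked, and a repeated click reuses the stored result. The class name, the in-memory dictionary, and the fake_vision_call stand-in are hypothetical. The real project calls a vision API and may cache differently.

```python
from typing import Callable, Dict


class OnDemandAnalyzer:
    """Analyze an image only when requested, and remember the result."""

    def __init__(self, analyze: Callable[[str], dict]):
        self._analyze = analyze            # the expensive call (e.g. a vision API)
        self._cache: Dict[str, dict] = {}  # image_id -> analysis result
        self.api_calls = 0                 # how many expensive calls were made

    def on_image_clicked(self, image_id: str) -> dict:
        # Only call the expensive analysis the first time an image is opened.
        if image_id not in self._cache:
            self.api_calls += 1
            self._cache[image_id] = self._analyze(image_id)
        return self._cache[image_id]


def fake_vision_call(image_id: str) -> dict:
    """Hypothetical stand-in for the real image analysis request."""
    return {"image_id": image_id, "notes": "analysis placeholder"}


if __name__ == "__main__":
    analyzer = OnDemandAnalyzer(fake_vision_call)
    analyzer.on_image_clicked("stop-1-north")
    analyzer.on_image_clicked("stop-1-north")  # served from the cache
    print(analyzer.api_calls)
```

Images that are never clicked cost nothing, which is what keeps the API call count low.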

3. Landmark-Based Navigation

Challenge: Teaching AI to avoid street names and use visual landmarks instead

Solution: Extensive prompt engineering with examples and explicit instructions to prioritize visual cues over street names
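A minimal sketch of this kind of prompt engineering is shown below. It assembles a system prompt from explicit instructions plus before/after examples, and adds a simple heuristic check for street-name wording in a reply. The prompt text, the function names, and the regex check are assumptions for illustration. The only example pair used is the one quoted earlier in this writeup.

```python
import re

# (phrase to avoid, preferred phrasing) pairs; this pair comes from the writeup.
FEW_SHOT = [
    ("Turn left on Main Street", "Turn left at the blue building"),
]

# Heuristic only: common street suffixes. It can misfire on ordinary words
# such as "road", so treat a match as a hint, not proof.
STREET_NAME = re.compile(
    r"\b(?:Street|St|Avenue|Ave|Road|Rd|Boulevard|Blvd)\b", re.IGNORECASE
)


def build_system_prompt(examples):
    """Compose instructions and examples into one system prompt string."""
    lines = [
        "You are a navigation assistant.",
        "Describe every turn using visible landmarks, not street names.",
        "Examples:",
    ]
    for avoid, prefer in examples:
        lines.append(f'- Instead of "{avoid}", say "{prefer}".')
    return "\n".join(lines)


def mentions_street_name(text):
    """Return True if the text appears to reference a street by name."""
    return bool(STREET_NAME.search(text))


if __name__ == "__main__":
    print(build_system_prompt(FEW_SHOT))
    print(mentions_street_name("Turn left on Main Street"))
```

A check like mentions_street_name could be used while iterating on the prompt to spot replies that slip back into street names.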

4. Route Optimization Performance

Challenge: Optimizing routes for 100+ locations with sub-second response times

Solution: Implemented intelligent caching (10.1x speedup), warm-start clustering, and ML-powered service time prediction
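One way to picture warm-start clustering: group stops by their nearest depot, then order each group with a nearest-neighbor pass, so the optimizer starts from a reasonable set of routes instead of from scratch. The sketch below is a simplified stand-in under those assumptions (straight-line distance, hypothetical function names). It is not the project's actual clustering code.

```python
import math


def distance(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def warm_start_routes(depots, stops):
    """Build one initial route per depot.

    Step 1: assign every stop to its nearest depot (the clustering).
    Step 2: order each cluster greedily by nearest neighbor.
    """
    clusters = {i: [] for i in range(len(depots))}
    for stop in stops:
        nearest = min(range(len(depots)), key=lambda i: distance(depots[i], stop))
        clusters[nearest].append(stop)

    routes = []
    for i, depot in enumerate(depots):
        remaining = list(clusters[i])
        route, current = [], depot
        while remaining:
            nxt = min(remaining, key=lambda s: distance(current, s))
            remaining.remove(nxt)
            route.append(nxt)
            current = nxt
        routes.append(route)
    return routes


if __name__ == "__main__":
    depots = [(0, 0), (10, 10)]
    stops = [(1, 1), (9, 9), (2, 2), (8, 8)]
    print(warm_start_routes(depots, stops))
```

In OR-Tools, initial routes of this kind can be handed to the routing solver through ReadAssignmentFromRoutes and SolveFromAssignmentWithParameters. Whether the project wires its clusters in that way is not stated here.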

5. Multi-Modal Data Integration

Challenge: Combining real-time traffic, weather, incidents, and accessibility data

Solution: Built a unified data pipeline with Redis caching and async processing
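The shape of such a pipeline can be sketched with asyncio: fetch all four data sources concurrently, merge them into one record, and cache the merged record for a short time. Here a small in-memory TTL cache stands in for Redis, and the four fetch_* coroutines are hypothetical placeholders for the real data sources.

```python
import asyncio
import time


class TTLCache:
    """Tiny in-memory cache with expiry; a stand-in for Redis in this sketch."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (stored_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and time.monotonic() - entry[0] < self.ttl:
            return entry[1]
        return None

    def set(self, key, value):
        self._store[key] = (time.monotonic(), value)


# Hypothetical data sources; real ones would call external services.
async def fetch_traffic(route_id):
    await asyncio.sleep(0)
    return {"route_id": route_id, "kind": "traffic"}


async def fetch_weather(route_id):
    await asyncio.sleep(0)
    return {"route_id": route_id, "kind": "weather"}


async def fetch_incidents(route_id):
    await asyncio.sleep(0)
    return {"route_id": route_id, "kind": "incidents"}


async def fetch_accessibility(route_id):
    await asyncio.sleep(0)
    return {"route_id": route_id, "kind": "accessibility"}


async def route_context(route_id, cache):
    """Return merged data for a route, using the cache when it is fresh."""
    cached = cache.get(route_id)
    if cached is not None:
        return cached
    # Run all four fetches concurrently instead of one after another.
    traffic, weather, incidents, accessibility = await asyncio.gather(
        fetch_traffic(route_id),
        fetch_weather(route_id),
        fetch_incidents(route_id),
        fetch_accessibility(route_id),
    )
    merged = {
        "traffic": traffic,
        "weather": weather,
        "incidents": incidents,
        "accessibility": accessibility,
    }
    cache.set(route_id, merged)
    return merged


if __name__ == "__main__":
    print(asyncio.run(route_context("route-1", TTLCache(ttl_seconds=60))))
```

With concurrent fetches, the total wait is roughly that of the slowest source, and repeat requests for the same route skip the fetches entirely until the entry expires.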

6. Voice-Controlled UI

Challenge: Making all overlays and features controllable by voice for presentations

Solution: Implemented comprehensive function calling system with close_overlay and end_conversation functions
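A function calling setup like this usually has two parts: tool definitions the model can choose from, and a dispatcher that runs the matching handler when the model emits a call. The sketch below shows that pattern in Python for illustration. Only the names close_overlay and end_conversation come from the project. The schema shape, the UiState class, and the handlers are assumptions, so check the API reference for the exact tool format.

```python
import json

# Assumed tool definitions in a JSON-Schema style; neither function takes arguments.
TOOLS = [
    {
        "type": "function",
        "name": "close_overlay",
        "description": "Close whichever overlay is currently open.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
    {
        "type": "function",
        "name": "end_conversation",
        "description": "End the current voice session.",
        "parameters": {"type": "object", "properties": {}, "required": []},
    },
]


class UiState:
    """Hypothetical UI state that the voice commands act on."""

    def __init__(self):
        self.open_overlay = "street_view_gallery"
        self.session_active = True


def close_overlay(state, args):
    state.open_overlay = None
    return {"closed": True}


def end_conversation(state, args):
    state.session_active = False
    return {"ended": True}


HANDLERS = {
    "close_overlay": close_overlay,
    "end_conversation": end_conversation,
}


def dispatch(state, name, arguments_json):
    """Run the handler for a function call emitted by the model."""
    handler = HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown function: {name}"}
    args = json.loads(arguments_json or "{}")
    return handler(state, args)


if __name__ == "__main__":
    state = UiState()
    print(dispatch(state, "close_overlay", "{}"))
    print(state.open_overlay)
```

Saying "Close that" would lead the model to call close_overlay, and the dispatcher applies it to the UI state. Unknown function names return an error instead of raising.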


Accomplishments that we're proud of

🏆 Technical Excellence

  • 100% delivery success rate with multi-depot routing
  • 10.1x performance improvement with intelligent caching
  • Sub-100ms response times for optimized routes
  • 24.73 images/second image processing rate

♿ Accessibility Impact

  • Comprehensive accessibility analysis for wheelchair users, visual impairments, and mobility challenges
  • Deep curve analysis with 10-point detailed assessment for every turn
  • Real-time accessibility scoring using computer vision
  • Step-by-step guidance with obstacle warnings and alternative routes

🎯 User Experience

  • Full voice control - no keyboard needed for presentations
  • Landmark-based directions - easier to follow than street names
  • Visual route exploration - see exactly where you're going
  • Professional AI - confident, courteous, and reliable

🧠 AI Innovation

  • Graph Neural Networks trained on Atlanta data for accurate predictions
  • Computer vision analysis for accessibility and safety
  • Context-aware AI that uses existing route data efficiently
  • Natural conversation with function calling for seamless interaction

🏗️ Backend Architecture

  • Not just an AI wrapper - real engineering depth with OR-Tools optimization
  • ML models - trained Graph Neural Networks, not just API calls
  • Production-ready - scalable, maintainable, and well-documented
  • Atlanta-optimized - trained on local data for regional expertise

What we learned

Technical Learnings

  1. Voice AI Integration: Successfully integrated OpenAI Realtime API with WebRTC for real-time voice communication
  2. Route Optimization: Learned advanced OR-Tools techniques for multi-depot vehicle routing
  3. ML in Production: Implemented Graph Neural Networks for service time prediction with real-world data
  4. Performance Optimization: Achieved 10.1x speedup through intelligent caching and warm-start clustering

User Experience Insights

  • Landmarks > Street Names: Users find visual cues much easier to follow than street names
  • Accessibility is Critical: Detailed accessibility information is essential, not optional
  • Voice Control is Powerful: Full voice control enables hands-free operation and better presentations
  • Visual Context Matters: Seeing the actual location before arriving builds confidence

Product Development

  • Real Problems Need Real Solutions: The DoorDash delivery problem is real and affects many people
  • Accessibility is Universal Design: Features for people with disabilities benefit everyone
  • AI Should Be Invisible: The best AI feels natural and doesn't remind you it's AI
  • Performance Matters: Sub-second response times are essential for user trust

What's next for UrbanBuzz AI

Short-Term (Next 3 Months)

  • [ ] Mobile App: Native iOS and Android apps for on-the-go navigation
  • [ ] Offline Mode: Download routes and images for offline navigation
  • [ ] Multi-Language Support: Expand to Spanish, French, and other languages
  • [ ] Enhanced Accessibility: Add support for hearing impairments with visual alerts

Medium-Term (6-12 Months)

  • [ ] AR Navigation: Augmented reality overlay showing directions in real time
  • [ ] Predictive Routing: ML models that predict best routes based on time of day
  • [ ] Community Features: User-contributed accessibility data and route reviews
  • [ ] Enterprise Integration: API for delivery companies, ride-sharing, and logistics

Long-Term Vision

  • [ ] Global Expansion: Support for cities worldwide with local data training
  • [ ] Autonomous Vehicle Integration: Navigation system for self-driving vehicles
  • [ ] Smart City Integration: Real-time data from city infrastructure (traffic lights, sensors)
  • [ ] Accessibility Database: Comprehensive global database of accessible routes and locations

Impact Goals

Reduce Delivery Failures: Help delivery partners find exact locations, reducing failed deliveries by 50%

Improve Accessibility: Make navigation accessible for 1 million+ people with disabilities

Save Time: Reduce average navigation time by 30% through better route optimization

Build Confidence: Help people navigate unfamiliar locations with confidence


Try It Out

Experience UrbanBuzz AI and see how intelligent navigation can transform the way you move through cities. Whether you're a delivery driver, a person with a disability, or just someone who wants better directions, UrbanBuzz AI is here to help.

Key Features to Try

  1. Ask for directions using landmarks: "How do I get from Times Square to Central Park?"
  2. Request Street View images: "Show me images along the way"
  3. Get accessibility analysis: "What about accessibility along this route?"
  4. Find nearby places: "Where's the nearest coffee shop?"
  5. Check safety: "Show safety analysis for Atlanta, GA"

Voice Commands

  • "Close that" - Close any overlay
  • "End conversation" - End the session
  • Full voice control - no keyboard needed!

Built with ❤️ for accessible, intelligent navigation
