Skip to content

Sanjith0/signvision-ar

Repository files navigation

SignVision Hybrid AR ๐Ÿš€โšก

Real-time hybrid AR object detection - Fast local YOLO + Accurate Gemini refinement!

๐ŸŽฏ Hybrid Architecture

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  COCO-SSD (Local)    โ†’  Instant AR Overlays โ”‚
โ”‚  โšก 10-30 FPS            โœ… Always visible   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                    โ†“
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  Gemini (Backend)    โ†’  Label Refinement    โ”‚
โ”‚  ๐Ÿง  0.5 FPS              โœจ Better accuracy โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Best of both worlds:

  • โšก Fast: COCO-SSD gives instant visual feedback
  • ๐ŸŽฏ Accurate: Gemini refines labels in background
  • ๐ŸŽจ AR: Labels stick smoothly to objects in 3D space

๐ŸŒŸ Features

  • โœ… Instant Detection - COCO-SSD shows AR overlays immediately
  • โœจ Smart Refinement - Gemini upgrades labels for accuracy
  • ๐ŸŽฏ Advanced AR Tracking - Labels stick to objects (Google Lens style)
  • ๐Ÿ“ฑ Mobile Optimized - Works on iOS and Android
  • ๐Ÿ”Š Voice Feedback - Audio alerts for important signs
  • ๐Ÿ“น Dashcam Mode - Record video with detections
  • ๐ŸŒ PWA - Install as native app
  • ๐Ÿšถ Fall Detection - Emergency pause and recording

๐Ÿš€ Quick Start

1. Clone & Setup

git clone https://github.com/YOUR_USERNAME/SignVision-AR.git
cd SignVision-AR

# Install Python dependencies
pip install -r requirements.txt

# Set up Gemini API key
cp .env.example .env
# Edit .env and add your GEMINI_API_KEY

2. Get Gemini API Key

  1. Go to Google AI Studio
  2. Click "Create API Key"
  3. Copy and paste into .env file

3. Run the App

# Start backend (Gemini refinement)
python server.py

# In a new terminal, serve frontend
python -m http.server 8080

# Open http://localhost:8080

๐ŸŽฎ How It Works

Detection Flow

  1. Camera captures frame (1920x1080)
  2. COCO-SSD detects objects (50-150ms)
    • Shows AR overlays immediately
    • Labels: Traffic lights, stop signs, vehicles, pedestrians
  3. Gemini refines labels (background, every 2 seconds)
    • More accurate classification
    • Detects walk/no walk signals
    • Identifies specific sign types
    • Upgrades COCO labels with โœจ sparkle
  4. AR tracking keeps labels stuck (Google Lens style)
    • IoU matching
    • Motion prediction
    • Camera motion compensation
    • Exponential smoothing

Visual Indicators

  • Regular box (3px): COCO-SSD detection
  • Thick box (4px) + โœจ: Gemini-refined label
  • Dashed box: Predicted position (object not currently detected)
  • Glow effect: Active detection

๐ŸŽฏ What It Detects

COCO-SSD (Instant)

  • ๐Ÿšฆ Traffic lights
  • ๐Ÿ›‘ Stop signs
  • ๐Ÿš— Vehicles (cars, trucks, buses)
  • ๐Ÿšถ Pedestrians

Gemini (Refined)

  • ๐Ÿšฆ Walk/Don't Walk signals
  • ๐Ÿ›‘ All traffic signs (stop, yield, speed limit, etc.)
  • โš ๏ธ Road hazards
  • ๐Ÿšง Construction zones
  • More accurate labels

๐Ÿ“ฑ Deployment

Local Development

python server.py  # Backend on :8000
python -m http.server 8080  # Frontend on :8080

Production (Split Deployment)

Option 1: Backend on Render + Frontend on Vercel

  1. Deploy Backend (Render/Railway/Heroku):
# Push to GitHub
git push origin main

# On Render.com:
# - New Web Service
# - Connect repo
# - Build: pip install -r requirements.txt
# - Start: python server.py
# - Add environment variable: GEMINI_API_KEY
  1. Deploy Frontend (Vercel/Netlify):
# Update script.js config.apiEndpoint to your backend URL
# Then deploy to Vercel
vercel

Option 2: Single Server

  • Deploy entire app to one server
  • Backend serves API + static files
  • Simpler but less scalable

โš™๏ธ Configuration

Edit script.js:

config: {
    apiEndpoint: 'https://your-backend.onrender.com/analyze',
    processingInterval: 100,  // COCO-SSD speed (10 FPS)
    geminiInterval: 2000,     // Gemini frequency (0.5 FPS)
    minConfidence: 0.3        // Detection threshold
}

Adjust for your needs:

  • Faster COCO: Lower processingInterval (more CPU)
  • More Gemini: Lower geminiInterval (more API calls)
  • Less noise: Increase minConfidence

๐ŸŽจ Visual Guide

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  ๐Ÿ“ฑ Camera View                         โ”‚
โ”‚                                         โ”‚
โ”‚    โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“                    โ”‚
โ”‚    โ”ƒ ๐Ÿšฆ Traffic    โ”ƒ โ† COCO-SSD        โ”‚
โ”‚    โ”ƒ    Signal     โ”ƒ                    โ”‚
โ”‚    โ”—โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”›                    โ”‚
โ”‚                                         โ”‚
โ”‚    โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”“                   โ”‚
โ”‚    โ”ƒ โœจ Walk Signal โ”ƒ โ† Gemini refined โ”‚
โ”‚    โ”ƒ    - Green     โ”ƒ   (thicker glow) โ”‚
โ”‚    โ”—โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”›                   โ”‚
โ”‚                                         โ”‚
โ”‚    โ” โ”„ โ”„ โ”„ โ”„ โ”„ โ”„ โ”“                    โ”‚
โ”‚    โ”† ๐Ÿ›‘ Stop Sign  โ”† โ† Predicted      โ”‚
โ”‚    โ”— โ”„ โ”„ โ”„ โ”„ โ”„ โ”„ โ”›   (dashed)        โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

๐Ÿ“Š Performance

Metric Value
COCO-SSD Latency 50-150ms
COCO-SSD FPS 10-30 FPS
Gemini Latency 500-2000ms
Gemini Frequency 0.5 FPS (every 2s)
AR Tracking Smooth 60 FPS
Total Model Size ~13 MB (COCO-SSD only)

๐Ÿ’ฐ Cost Estimate

Gemini API (Free tier):

  • 15 requests per minute
  • 1,500 requests per day
  • ~$0.01 per 100 requests after free tier

Usage:

  • 0.5 requests/second = 30 requests/minute
  • ~1,800 requests/hour
  • Should stay within free tier for testing!

๐Ÿ› Troubleshooting

COCO-SSD works but no Gemini refinement

  • Check backend is running (python server.py)
  • Verify GEMINI_API_KEY in .env
  • Check browser console for API errors
  • Confirm config.apiEndpoint is correct

Slow performance

  • Increase processingInterval (lower FPS)
  • Increase geminiInterval (less refinement)
  • Use better device/browser

Labels not sticking

  • Enable device motion sensors in settings
  • Keep device steady during initial detection
  • Check AR tracking parameters in code

๐ŸŽฏ Architecture Benefits

Aspect Pure COCO-SSD Pure Gemini Hybrid (This!)
Speed โšก Instant ๐Ÿข Slow โšก Instant
Accuracy โœ… Good (70%) ๐ŸŽฏ Excellent (95%) ๐ŸŽฏ Excellent (95%)
Offline โœ… Yes โŒ No โš ๏ธ Partial
Cost ๐Ÿ’š Free ๐Ÿ’ฐ Paid ๐Ÿ’š Mostly Free
UX โšก Instant โฐ Laggy โšก Instant + Refined

๐Ÿ“š Tech Stack

  • Frontend: Vanilla JS (PWA)
  • Fast Detection: TensorFlow.js + COCO-SSD
  • Accurate Refinement: Google Gemini 2.0 Flash
  • Backend: FastAPI (Python)
  • AR Tracking: Custom (IoU, Kalman-like prediction)
  • Camera: WebRTC getUserMedia
  • Audio: Web Speech Synthesis
  • Storage: IndexedDB

๐Ÿ”’ Privacy

  • COCO-SSD: 100% local, no data sent
  • Gemini: Images sent to Google for refinement (optional)
  • No Tracking: No analytics, no user data collection
  • Works Offline: COCO-SSD continues without internet

๐Ÿ› ๏ธ Development

# Install dependencies
pip install -r requirements.txt

# Run with auto-reload
uvicorn server:app --reload --port 8000

# Run frontend
python -m http.server 8080

๐ŸŽ‰ Credits

Built with:


Made for accessibility. Empowering visually impaired users with real-time hybrid AR detection.

๐ŸŒŸ Star this repo if you find it useful!

About

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors