FrameShift 🔍
Because every pixel tells a story
FrameShift is an AI-powered visual difference engine that transforms time-series image analysis into actionable insights. By fusing classical computer vision with deep learning, we automatically detect, classify, and visualize micro-changes across image sequences – turning hours of manual inspection into seconds of intelligent analysis.
Built for MoneyGram Haas F1 Hackathon 🏎️
📋 Table of Contents
- Inspiration
- What It Does
- How We Built It
- Challenges We Ran Into
- Accomplishments We're Proud Of
- What We Learned
- What's Next
- Technical Architecture
- Built With
💡 Inspiration
Our journey began in the high-stakes world of Formula 1, where millimeter-level design changes can mean the difference between podium and pit lane. We observed how technical delegates spend countless hours comparing car photographs to ensure regulatory compliance, while teams struggle to track competitor innovations across race weekends.
This challenge isn't unique to motorsports:
- Semiconductor manufacturing: Defects cost billions annually
- Infrastructure monitoring: Missed cracks can be catastrophic
- Quality control: Manual inspection is slow, error-prone, and doesn't scale
We were inspired by:
- F1's 3D laser scanning protocols for car verification – what if visual analysis could achieve similar precision without expensive hardware?
- Google's Visual Inspection AI proving ML can match or surpass human inspectors
- Research showing time-series visual analysis captures temporal dynamics that single-frame methods miss entirely
The MoneyGram Haas F1 Hackathon crystallized our vision: build a universal visual comparison engine that doesn't just detect changes, but understands them contextually.
🎯 What It Does
FrameShift provides intelligent, automated visual difference detection across image sequences with:
Core Capabilities
🔍 Multi-Scale Change Detection
- Pixel-level differences for precise localization
- Structural similarity analysis for subtle degradation detection
- Optical flow tracking for motion and deformation patterns
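As a minimal illustration of the pixel-level stage alone, the sketch below computes a normalized absolute-difference heatmap with NumPy. It is a simplified stand-in, not the full FrameShift pipeline, which fuses this map with SSIM and optical-flow terms:

```python
import numpy as np

def pixel_diff_map(img1: np.ndarray, img2: np.ndarray) -> np.ndarray:
    """Absolute per-pixel difference, normalized to [0, 1].

    Sketch of the pixel-level stage only; structural (SSIM) and
    flow-based terms are fused with this map in later stages.
    """
    d = np.abs(img2.astype(np.float32) - img1.astype(np.float32))
    if d.ndim == 3:          # collapse color channels to one intensity map
        d = d.mean(axis=2)
    peak = d.max()
    return d / peak if peak > 0 else d

# Toy 8x8 frames differing in one 2x2 patch
a = np.zeros((8, 8), dtype=np.uint8)
b = a.copy()
b[2:4, 2:4] = 200
heat = pixel_diff_map(a, b)
```

The changed patch lights up at intensity 1.0 while untouched pixels stay at 0, which is what the heatmap overlay visualizes at higher resolution.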
🧠 AI-Powered Classification
- Automatic change type identification (addition, removal, deformation, discoloration, displacement)
- Confidence scoring and severity assessment
- Temporal correlation analysis across image sequences
📊 Interactive Visualization
- Real-time heatmap overlays showing change intensity
- Bounding box annotations with classification labels
- Adjustable sensitivity with instant visual feedback
- Timeline view for tracking changes across sequences
⚡ Production-Ready Design
- Async processing architecture for concurrent analysis
- RESTful API for easy integration
- WebSocket support for real-time updates
- Scalable cloud deployment ready
Target Use Cases
| Industry | Application | Expected Impact |
|---|---|---|
| Formula 1 | Technical regulation compliance | Automate hours of manual inspection |
| Manufacturing | Production line defect detection | Real-time quality control |
| Infrastructure | Bridge/building degradation monitoring | Early warning system for maintenance |
| Brand Protection | Logo/packaging compliance verification | Ensure brand consistency |
| Medical Imaging | Tumor growth tracking over time | Support clinical decision-making |
🛠️ How We Built It
Development Approach: Iterative 48-Hour Sprint
Phase 1: Core CV Pipeline
- Implemented SIFT-based feature alignment with RANSAC outlier rejection
- Built multi-scale differencing engine combining pixel, structural, and flow-based methods
- Validated approach on synthetic test sequences with known changes
Phase 2: Neural Integration
- Selected EfficientNet-B3 for defect classification (balance of speed/accuracy)
- Integrated YOLOv8 for object-level change detection
- Fine-tuned on publicly available defect datasets:
- NEU Surface Defect Database (steel surfaces)
- MVTec Anomaly Detection (15 object categories)
- Custom F1 technical image collection
Phase 3: Web Application
- Architected FastAPI backend with Celery for async processing
- Built React frontend with real-time WebSocket updates
- Implemented interactive sensitivity adjustment with instant visual feedback
Phase 4: Optimization & Testing
- Implemented multi-resolution cascade for speed optimization
- Added illumination-invariant preprocessing pipeline
- Stress-tested with various image types and lighting conditions
Technical Architecture
┌─────────────────────────────────────────────────────────────┐
│ IMAGE SEQUENCE INPUT │
└───────────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ LAYER 1: Intelligent Preprocessing │
│ • Adaptive Histogram Equalization (CLAHE) │
│ • SIFT-based Registration & Homography Warping │
│ • Bilateral Filtering (noise reduction) │
└───────────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ LAYER 2: Multi-Scale Difference Detection │
│ ┌─────────────┬──────────────────┬────────────────────┐ │
│ │ Pixel Diff │ SSIM Analysis │ Optical Flow │ │
│ │ |I₂ - I₁| │ Structural │ Motion Vectors │ │
│ └─────────────┴──────────────────┴────────────────────┘ │
│ D_final = α·D_pixel + β·(1-SSIM) + γ·||v|| │
└───────────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ LAYER 3: Neural Classification & Semantic Understanding │
│ ┌──────────────────┐ ┌─────────────────────┐ │
│ │ YOLOv8 (large) │ │ EfficientNet-B3 │ │
│ │ Object Detect │ │ Defect Classify │ │
│ └──────────────────┘ └─────────────────────┘ │
│ Change Taxonomy & Temporal Correlation │
└───────────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ VISUAL OUTPUT GENERATION │
│ • Heatmap Overlays • Bounding Boxes • Difference Reports │
└─────────────────────────────────────────────────────────────┘
Key Algorithms
1. Hybrid Alignment (SIFT + ECC): Combines feature-based and intensity-based methods for robust alignment even with camera movement, rotation, and scale changes.
2. Multi-Scale Fusion
- Pixel Difference: Direct intensity comparison
- SSIM: Structural similarity to catch subtle degradation
- Optical Flow: Motion tracking for deformation detection
- Weighted Combination: Adaptive fusion based on image characteristics
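The fusion rule from the architecture diagram, D_final = α·D_pixel + β·(1−SSIM) + γ·||v||, can be written directly in NumPy. The weights below are illustrative defaults, not the adaptively tuned values the pipeline would choose:

```python
import numpy as np

def fuse_difference_maps(d_pixel, ssim_map, flow, alpha=0.4, beta=0.4, gamma=0.2):
    """Weighted fusion D = alpha*D_pixel + beta*(1-SSIM) + gamma*||v||.

    d_pixel:  normalized pixel difference, shape (H, W)
    ssim_map: per-pixel structural similarity in [0, 1], shape (H, W)
    flow:     optical-flow vectors, shape (H, W, 2)
    Weights here are illustrative, not tuned values.
    """
    flow_mag = np.linalg.norm(flow, axis=2)
    peak = flow_mag.max()
    if peak > 0:
        flow_mag = flow_mag / peak        # normalize so terms are comparable
    return alpha * d_pixel + beta * (1.0 - ssim_map) + gamma * flow_mag

h, w = 4, 4
d = np.zeros((h, w)); d[1, 1] = 1.0       # pixel diff fires at the change
s = np.ones((h, w));  s[1, 1] = 0.2       # low structural similarity there
v = np.zeros((h, w, 2)); v[1, 1] = [3.0, 4.0]   # local motion vector
fused = fuse_difference_maps(d, s, v)
```

Because all three cues agree at (1, 1), the fused score there (0.4·1 + 0.4·0.8 + 0.2·1 = 0.92) dominates the map, while quiet regions stay at zero.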
3. Adaptive Thresholding: Automatically determines the optimal threshold based on image content distribution, eliminating manual tuning across different use cases.
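One standard way to realize content-driven thresholding is Otsu's method, which picks the cut that maximizes between-class variance of the difference-map histogram. This is a sketch of the idea, not necessarily the exact statistic FrameShift uses:

```python
import numpy as np

def otsu_threshold(diff_map: np.ndarray, bins: int = 256) -> float:
    """Otsu's method on a [0, 1] difference map: choose the threshold
    maximizing between-class variance, so no manual tuning is needed."""
    hist, edges = np.histogram(diff_map.ravel(), bins=bins, range=(0.0, 1.0))
    p = hist.astype(np.float64) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)                  # probability mass of the low class
    mu = np.cumsum(p * centers)        # cumulative mean
    mu_t = mu[-1]
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros_like(w0)
    sigma_b[valid] = (mu_t * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return float(centers[np.argmax(sigma_b)])

# Bimodal map: mostly near 0 (background), a small cluster near 0.9 (changes)
rng = np.random.default_rng(0)
dmap = np.clip(rng.normal(0.05, 0.02, 1000), 0, 1)
dmap[:50] = np.clip(rng.normal(0.9, 0.03, 50), 0, 1)
t = otsu_threshold(dmap)
```

On this synthetic bimodal map the threshold lands cleanly between the two clusters, isolating exactly the injected change pixels.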
🚧 Challenges We Ran Into
Challenge 1: The Alignment Problem
Problem: Even tripod-mounted cameras have micro-vibrations causing pixel misalignment, leading to false positive "changes" everywhere.
Solution: Developed hybrid SIFT + ECC alignment pipeline
- Coarse alignment with feature matching (handles rotation/scale)
- Fine alignment with intensity-based optimization (sub-pixel accuracy)
- RANSAC outlier rejection for robustness
Expected Impact: Significant reduction in false positives while maintaining true change detection
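The RANSAC step can be illustrated with a toy robust estimator. To stay dependency-free, this sketch fits only a 2-D translation; the real pipeline fits a full homography (e.g. via OpenCV's `cv2.findHomography` with the RANSAC flag):

```python
import numpy as np

def ransac_translation(src, dst, iters=200, tol=1.0, seed=0):
    """Toy RANSAC: estimate a 2-D translation between matched keypoints
    while rejecting outlier matches. A translation model keeps the
    sketch simple; the production pipeline estimates a homography."""
    rng = np.random.default_rng(seed)
    best_t, best_inliers = np.zeros(2), np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(src))             # minimal sample: one match
        t = dst[i] - src[i]
        inliers = np.linalg.norm(dst - (src + t), axis=1) < tol
        if inliers.sum() > best_inliers.sum():
            best_t, best_inliers = t, inliers
    # refine the estimate on the consensus (inlier) set
    best_t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return best_t, best_inliers

rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(40, 2))
dst = src + np.array([5.0, -3.0])              # true camera shift
dst[:8] += rng.uniform(20, 40, size=(8, 2))    # 8 gross mismatches
t, inliers = ransac_translation(src, dst)
```

Despite 20% of the matches being wrong, the consensus step recovers the true shift and flags every bad match as an outlier, which is exactly why RANSAC suppresses the false-positive "changes" a contaminated fit would cause.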
Challenge 2: Illumination Variations
Problem: Outdoor monitoring faces dramatic lighting changes (clouds, time-of-day, seasons) that naive methods flag as "changes."
Solution: Illumination-Invariant Differencing
- Convert to LAB color space (separates luminance from chrominance)
- Apply Retinex algorithm to separate reflectance from illumination
- Compare only reflectance components
Expected Impact: Robust operation across diverse lighting conditions
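The Retinex idea can be sketched in a few lines: treat the low-frequency component of log-intensity as illumination and subtract it, keeping only reflectance. This toy version works on a grayscale image (the real pipeline operates on the luminance channel after LAB conversion), and the sigma/epsilon values are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def log_reflectance(img: np.ndarray, sigma: float = 15.0) -> np.ndarray:
    """Single-scale Retinex sketch: log-intensity minus its Gaussian
    blur approximates reflectance, discarding smooth illumination."""
    log_i = np.log(img.astype(np.float64) + 1e-6)   # epsilon avoids log(0)
    illumination = gaussian_filter(log_i, sigma)
    return log_i - illumination

rng = np.random.default_rng(0)
scene = rng.uniform(0.2, 1.0, size=(64, 64))
darker = scene * 0.5                      # same scene, cloud passes over
r1 = log_reflectance(scene)
r2 = log_reflectance(darker)
naive_diff = np.abs(darker - scene).mean()
retinex_diff = np.abs(r2 - r1).mean()
```

A global illumination drop that a naive pixel difference flags everywhere vanishes almost entirely in the reflectance domain, since a multiplicative lighting change becomes an additive constant in log space and the blur subtraction cancels it.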
Challenge 3: Speed vs Accuracy Trade-off
Problem: High-resolution processing can be slow, but real-time QC applications need fast feedback.
Solution: Multi-resolution cascade
- Fast pass (downsampled): Quick scan to identify suspicious regions
- Detailed pass (full resolution, ROI only): Deep analysis on flagged areas
Expected Impact: Near real-time performance with minimal accuracy compromise
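The cascade can be sketched as a two-pass NumPy routine: a cheap scan on a downsampled pair flags suspicious tiles, then only those tiles get a full-resolution diff. The strided downsample and thresholds here are illustrative; a production version would average-pool so sub-tile changes cannot slip between samples:

```python
import numpy as np

def cascade_detect(prev, curr, factor=4, coarse_thresh=0.1):
    """Two-pass cascade: coarse scan flags tiles, detailed pass runs
    only on flagged tiles. Factor/threshold values are illustrative."""
    # --- fast pass on a strided downsample ---
    small = np.abs(curr[::factor, ::factor].astype(np.float32)
                   - prev[::factor, ::factor].astype(np.float32)) / 255.0
    flagged = np.argwhere(small > coarse_thresh)
    # --- detailed pass: full-resolution diff, flagged tiles only ---
    detail = np.zeros(prev.shape, dtype=np.float32)
    for (i, j) in flagged:
        r, c = i * factor, j * factor
        tile_prev = prev[r:r + factor, c:c + factor].astype(np.float32)
        tile_curr = curr[r:r + factor, c:c + factor].astype(np.float32)
        detail[r:r + factor, c:c + factor] = np.abs(tile_curr - tile_prev) / 255.0
    return detail, len(flagged)

a = np.zeros((64, 64), dtype=np.uint8)
b = a.copy()
b[8:12, 8:12] = 255                       # one small localized change
detail, n_tiles = cascade_detect(a, b)
```

Only 1 of 256 tiles is examined at full resolution here, which is where the cascade's speedup comes from on mostly-static sequences.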
Challenge 4: Domain Adaptation
Problem: Optimal settings vary wildly across use cases (manufacturing defects vs brand compliance).
Solution: Adaptive thresholding + region weighting
- Automatic threshold selection based on image statistics
- Configurable region importance weights for domain-specific applications
Expected Impact: Minimal configuration needed for new use cases
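Region weighting reduces to multiplying the difference map by a per-pixel importance mask before scoring, so equal-magnitude changes contribute unequally. The mask below is a hypothetical example (a "critical" top-left quadrant), not a real F1 configuration:

```python
import numpy as np

def weighted_score(diff_map: np.ndarray, weights: np.ndarray) -> float:
    """Region-weighted change score: a change in a high-weight region
    (e.g. a front-wing ROI) dominates one in the background."""
    return float((diff_map * weights).sum() / weights.sum())

diff = np.zeros((10, 10))
diff[1, 1] = 1.0       # change inside the critical region
diff[8, 8] = 1.0       # equal-magnitude change in the background
w = np.ones((10, 10))
w[:5, :5] = 10.0       # top-left quadrant deemed critical (illustrative)
score = weighted_score(diff, w)
```

With this mask the critical-region change contributes 10x the background one, which is how domain knowledge (e.g. aerodynamic surfaces vs. livery) is injected without retraining anything.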
🏆 Accomplishments We're Proud Of
✨ Complete Three-Layer Architecture - Designed and implemented end-to-end pipeline from preprocessing to visualization
🏎️ F1-Specific Innovation - Created region-weighted detection system tailored for aerodynamic component monitoring
🎨 Interactive UX - Built real-time sensitivity adjustment that updates visualizations instantly
🔬 Robust Preprocessing - Solved alignment and illumination challenges that plague naive difference methods
📦 Production Architecture - Designed async, scalable system ready for high-volume deployment
🧪 Synthetic Data Pipeline - Created automated test data generation for rapid validation
Formula 1 Use Case Demo Plan
Track changes across race weekends in areas like:
- Front wing endplate geometry modifications
- Rear wing flap angle adjustments
- Floor edge wing element additions
- Sensor/camera mount relocations
- Livery sponsor logo updates
📚 What We Learned
1. Preprocessing is Critical
Initial prototyping revealed that robust preprocessing (alignment, normalization, noise reduction) is more impactful than complex models. Getting the inputs right enables simpler downstream processing.
2. Hybrid Approaches Win
Pure classical CV is fast but brittle. Pure deep learning requires massive datasets. Combining both gives us interpretability, speed, and the ability to handle complex patterns.
3. Domain Knowledge Matters
Understanding F1 technical regulations and aerodynamic design principles helped us focus detection on regions that actually matter, rather than treating all image areas equally.
4. Real-Time Feedback is Essential
Users don't want to wait minutes for results, then adjust settings and wait again. Instant visual feedback from sensitivity changes dramatically improves the user experience.
5. Synthetic Data Accelerates Development
Programmatically generating test cases with known changes let us validate algorithms quickly without waiting for real-world data collection.
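A minimal version of that generator: paste a known patch into a copy of a random "before" frame, add mild sensor noise, and return a ground-truth mask so any detector's output can be scored automatically. Patch size, intensity, and noise range are illustrative choices:

```python
import numpy as np

def make_change_pair(size=64, patch=6, seed=0):
    """Generate a before/after pair with one known synthetic change
    (a pasted bright patch) plus mild noise, and its truth mask."""
    rng = np.random.default_rng(seed)
    before = rng.integers(0, 120, size=(size, size)).astype(np.uint8)
    after = before.copy()
    r, c = rng.integers(0, size - patch, size=2)
    after[r:r + patch, c:c + patch] = 230          # the injected change
    noise = rng.integers(-3, 4, size=after.shape)  # mild sensor noise
    after = np.clip(after.astype(int) + noise, 0, 255).astype(np.uint8)
    truth = np.zeros((size, size), dtype=bool)
    truth[r:r + patch, c:c + patch] = True
    return before, after, truth

before, after, truth = make_change_pair()
# Score a trivial detector against the known ground truth
detected = np.abs(after.astype(int) - before.astype(int)) > 50
iou = (detected & truth).sum() / (detected | truth).sum()
```

Because the injected change is far above the noise floor, even a plain threshold recovers the truth mask perfectly here; harder synthetic cases (lower contrast, alignment jitter) exercise the full pipeline.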
🚀 What's Next for FrameShift
Immediate Priorities (Post-Hackathon)
🎥 Video Stream Processing
- Extend to real-time video analysis
- Temporal smoothing across frame sequences
- Live camera feed integration
🤖 Model Refinement
- Collect real F1 technical images for fine-tuning
- Train custom models for specific change types
- Implement active learning pipeline
📱 Mobile Deployment
- On-device inference for field inspections
- Offline-first architecture
- Lightweight model variants
Medium-Term Goals
🌐 3D Change Detection
- Stereo camera support
- Depth-aware differencing
- Volumetric change quantification
🏗️ Enterprise Features
- Multi-tenant SaaS deployment
- Role-based access control
- Audit trails and compliance reporting
🔌 API Ecosystem
- Pre-built integrations (QC systems, PLM software)
- Webhook notifications
- Batch processing capabilities
Long-Term Vision
🔮 Predictive Analytics
- Time-series forecasting of degradation
- Failure probability estimation
- Maintenance scheduling optimization
🌍 New Domains
- Satellite imagery analysis
- Medical imaging applications
- Security and surveillance
🏗️ Technical Architecture
System Components
Frontend (React + TypeScript)
↓ WebSocket + REST API
Backend (FastAPI)
├── NGINX (Reverse Proxy)
├── Uvicorn (ASGI Server)
└── Celery Workers (Async Processing)
↓
Processing Engine
├── OpenCV (Computer Vision)
├── PyTorch (Neural Networks)
└── NumPy/SciPy (Numerical Computing)
↓
Data Layer
├── PostgreSQL (Metadata)
├── MinIO/S3 (Image Storage)
└── Redis (Task Queue)
Key Design Decisions
1. Async Architecture
- Celery + Redis for distributed task processing
- WebSocket for real-time progress updates
- Non-blocking API design
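The non-blocking pattern can be shown without any infrastructure using plain `asyncio`. In the deployed system the CPU-bound comparison is handed to a Celery worker over Redis; this dependency-free sketch only demonstrates why the API stays responsive while jobs run:

```python
import asyncio

async def analyze_pair(job_id: int) -> dict:
    """Stand-in for the image comparison; in production this work is
    dispatched to a Celery worker rather than run in-process."""
    await asyncio.sleep(0.01)              # simulate work without blocking
    return {"job": job_id, "status": "done"}

async def main():
    # Submit several comparisons concurrently: the event loop stays free
    # to accept new requests while pending jobs make progress.
    jobs = [analyze_pair(i) for i in range(5)]
    return await asyncio.gather(*jobs)

results = asyncio.run(main())
```

All five jobs overlap instead of running back-to-back, which is the same property the WebSocket progress updates rely on.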
2. Microservices Approach
- Preprocessing service
- Detection service
- Classification service
- Visualization service
3. Containerization
- Docker for consistent deployment
- Docker Compose for local development
- Kubernetes-ready design
🛠️ Built With
Core Stack
| Category | Technology | Purpose |
|---|---|---|
| Language | Python 3.11 | Core processing logic |
| CV Framework | OpenCV 4.8+ | Image processing, alignment |
| ML Framework | PyTorch 2.1 | Neural network inference |
| Numerical | NumPy, SciPy | Mathematical operations |
| Frontend | React 18 + TypeScript | Interactive web UI |
| Backend | FastAPI | Async REST API |
| Task Queue | Celery + Redis | Distributed processing |
| Database | PostgreSQL | Metadata storage |
| Storage | MinIO (S3-compatible) | Image storage |
ML Models (Planned)
- EfficientNet-B3: Defect classification (good speed/accuracy balance)
- YOLOv8: Real-time object detection for large changes
- Custom fine-tuning: On F1-specific datasets
Infrastructure
- Docker + Compose: Containerized services
- NGINX: Reverse proxy, load balancing
- Cloud Platform: AWS/GCP/Azure agnostic design
Key Datasets for Training
- MVTec Anomaly Detection: 5,354 high-res images, 15 categories
- NEU Surface Defect: 1,800 images of steel defects
- COCO 2017: Pre-training for object detection
- Custom F1 Collection: Technical documentation images
📊 Expected Performance Profile
Based on preliminary testing and similar systems:
| Metric | Target | Notes |
|---|---|---|
| Latency | <100ms per pair | For real-time QC applications |
| Throughput | 10+ FPS | Concurrent processing |
| Accuracy | Competitive with manual inspection | Human-level on clear cases |
| False Positives | Minimize with adaptive thresholding | Context-dependent |
These are design targets, not validated measurements.
📜 License
This project is licensed under the MIT License.
🤝 Contributing
Built for the MoneyGram Haas F1 Hackathon. Future contributions welcome post-hackathon!
📞 Contact
Team: FrameShift
Hackathon: TrackShift Innovation Challenge
🙏 Acknowledgments
- MoneyGram Haas F1 Team for inspiring this challenge
- F1 Technical Working Group for domain insights
- Open-source computer vision community
- OpenCV, PyTorch, and FastAPI maintainers
FrameShift – Where vision meets precision. Every frame. Every change. Instantly. 🏎️✨