FrameShift 🔍
Because every pixel tells a story
FrameShift is an AI-powered visual difference engine that transforms time-series image analysis into actionable insights. By fusing classical computer vision with deep learning, we automatically detect, classify, and visualize micro-changes across image sequences – turning hours of manual inspection into seconds of intelligent analysis.
Built for MoneyGram Haas F1 Hackathon 🏎️
📋 Table of Contents
- Inspiration
- What It Does
- How We Built It
- Challenges We Ran Into
- Accomplishments We're Proud Of
- What We Learned
- What's Next
- Technical Architecture
- Built With
💡 Inspiration
Our journey began in the high-stakes world of Formula 1, where millimeter-level design changes can mean the difference between podium and pit lane. We observed how technical delegates spend countless hours comparing car photographs to ensure regulatory compliance, while teams struggle to track competitor innovations across race weekends.
This challenge isn't unique to motorsports:
- Semiconductor manufacturing: Defects cost billions annually
- Infrastructure monitoring: Missed cracks can be catastrophic
- Quality control: Manual inspection is slow, error-prone, and doesn't scale
We were inspired by:
- F1's 3D laser scanning protocols for car verification – what if visual analysis could achieve similar precision without expensive hardware?
- Google's Visual Inspection AI proving ML can match or surpass human inspectors
- Research showing time-series visual analysis captures temporal dynamics that single-frame methods miss entirely
The MoneyGram Haas F1 Hackathon crystallized our vision: build a universal visual comparison engine that doesn't just detect changes, but understands them contextually.
🎯 What It Does
FrameShift provides intelligent, automated visual difference detection across image sequences with:
Core Capabilities
🔍 Multi-Scale Change Detection
- Pixel-level differences for precise localization
- Structural similarity analysis for subtle degradation detection
- Optical flow tracking for motion and deformation patterns
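As a minimal illustration of the pixel-level stage alone, the sketch below computes a normalized absolute-difference heatmap with NumPy. It is a simplified stand-in, not the full FrameShift pipeline, which fuses this map with SSIM and optical-flow terms:

```python
import numpy as np

def pixel_diff_map(img1: np.ndarray, img2: np.ndarray) -> np.ndarray:
    """Absolute per-pixel difference, normalized to [0, 1].

    Sketch of the pixel-level stage only; structural (SSIM) and
    flow-based terms are fused with this map in later stages.
    """
    d = np.abs(img2.astype(np.float32) - img1.astype(np.float32))
    if d.ndim == 3:          # collapse color channels to one intensity map
        d = d.mean(axis=2)
    peak = d.max()
    return d / peak if peak > 0 else d

# Toy 8x8 frames differing in one 2x2 patch
a = np.zeros((8, 8), dtype=np.uint8)
b = a.copy()
b[2:4, 2:4] = 200
heat = pixel_diff_map(a, b)
```

The changed patch lights up at intensity 1.0 while untouched pixels stay at 0, which is what the heatmap overlay visualizes at higher resolution.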
🧠 AI-Powered Classification
- Automatic change type identification (addition, removal, deformation, discoloration, displacement)
- Confidence scoring and severity assessment
- Temporal correlation analysis across image sequences
📊 Interactive Visualization
- Real-time heatmap overlays showing change intensity
- Bounding box annotations with classification labels
- Adjustable sensitivity with instant visual feedback
- Timeline view for tracking changes across sequences
⚡ Production-Ready Design
- Async processing architecture for concurrent analysis
- RESTful API for easy integration
- WebSocket support for real-time updates
- Scalable cloud deployment ready
Target Use Cases
| Industry | Application | Expected Impact |
|---|---|---|
| Formula 1 | Technical regulation compliance | Automate hours of manual inspection |
| Manufacturing | Production line defect detection | Real-time quality control |
| Infrastructure | Bridge/building degradation monitoring | Early warning system for maintenance |
| Brand Protection | Logo/packaging compliance verification | Ensure brand consistency |
| Medical Imaging | Tumor growth tracking over time | Support clinical decision-making |
🛠️ How We Built It
Development Approach: Iterative 48-Hour Sprint
Phase 1: Core CV Pipeline
- Implemented SIFT-based feature alignment with RANSAC outlier rejection
- Built multi-scale differencing engine combining pixel, structural, and flow-based methods
- Validated approach on synthetic test sequences with known changes
Phase 2: Neural Integration
- Selected EfficientNet-B3 for defect classification (balance of speed/accuracy)
- Integrated YOLOv8 for object-level change detection
- Fine-tuned on publicly available defect datasets:
- NEU Surface Defect Database (steel surfaces)
- MVTec Anomaly Detection (15 object categories)
- Custom F1 technical image collection
Phase 3: Web Application
- Architected FastAPI backend with Celery for async processing
- Built React frontend with real-time WebSocket updates
- Implemented interactive sensitivity adjustment with instant visual feedback
Phase 4: Optimization & Testing
- Implemented multi-resolution cascade for speed optimization
- Added illumination-invariant preprocessing pipeline
- Stress-tested with various image types and lighting conditions
Technical Architecture
┌─────────────────────────────────────────────────────────────┐
│ IMAGE SEQUENCE INPUT │
└───────────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ LAYER 1: Intelligent Preprocessing │
│ • Adaptive Histogram Equalization (CLAHE) │
│ • SIFT-based Registration & Homography Warping │
│ • Bilateral Filtering (noise reduction) │
└───────────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ LAYER 2: Multi-Scale Difference Detection │
│ ┌─────────────┬──────────────────┬────────────────────┐ │
│ │ Pixel Diff │ SSIM Analysis │ Optical Flow │ │
│ │ |I₂ - I₁| │ Structural │ Motion Vectors │ │
│ └─────────────┴──────────────────┴────────────────────┘ │
│ D_final = α·D_pixel + β·(1-SSIM) + γ·||v|| │
└───────────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ LAYER 3: Neural Classification & Semantic Understanding │
│ ┌──────────────────┐ ┌─────────────────────┐ │
│ │ YOLOv8 (large) │ │ EfficientNet-B3 │ │
│ │ Object Detect │ │ Defect Classify │ │
│ └──────────────────┘ └─────────────────────┘ │
│ Change Taxonomy & Temporal Correlation │
└───────────────────────────┬─────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ VISUAL OUTPUT GENERATION │
│ • Heatmap Overlays • Bounding Boxes • Difference Reports │
└─────────────────────────────────────────────────────────────┘
Key Algorithms
1. Hybrid Alignment (SIFT + ECC): Combines feature-based and intensity-based methods for robust alignment even with camera movement, rotation, and scale changes.
2. Multi-Scale Fusion
- Pixel Difference: Direct intensity comparison
- SSIM: Structural similarity to catch subtle degradation
- Optical Flow: Motion tracking for deformation detection
- Weighted Combination: Adaptive fusion based on image characteristics
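The fusion rule from the architecture diagram, D_final = α·D_pixel + β·(1−SSIM) + γ·||v||, can be written directly in NumPy. The weights below are illustrative defaults, not the adaptively tuned values the pipeline would choose:

```python
import numpy as np

def fuse_difference_maps(d_pixel, ssim_map, flow, alpha=0.4, beta=0.4, gamma=0.2):
    """Weighted fusion D = alpha*D_pixel + beta*(1-SSIM) + gamma*||v||.

    d_pixel:  normalized pixel difference, shape (H, W)
    ssim_map: per-pixel structural similarity in [0, 1], shape (H, W)
    flow:     optical-flow vectors, shape (H, W, 2)
    Weights here are illustrative, not tuned values.
    """
    flow_mag = np.linalg.norm(flow, axis=2)
    peak = flow_mag.max()
    if peak > 0:
        flow_mag = flow_mag / peak        # normalize so terms are comparable
    return alpha * d_pixel + beta * (1.0 - ssim_map) + gamma * flow_mag

h, w = 4, 4
d = np.zeros((h, w)); d[1, 1] = 1.0       # pixel diff fires at the change
s = np.ones((h, w));  s[1, 1] = 0.2       # low structural similarity there
v = np.zeros((h, w, 2)); v[1, 1] = [3.0, 4.0]   # local motion vector
fused = fuse_difference_maps(d, s, v)
```

Because all three cues agree at (1, 1), the fused score there (0.4·1 + 0.4·0.8 + 0.2·1 = 0.92) dominates the map, while quiet regions stay at zero.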
3. Adaptive Thresholding: Automatically determines the optimal threshold based on image content distribution, eliminating manual tuning across different use cases.
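One standard way to realize content-driven thresholding is Otsu's method, which picks the cut that maximizes between-class variance of the difference-map histogram. This is a sketch of the idea, not necessarily the exact statistic FrameShift uses:

```python
import numpy as np

def otsu_threshold(diff_map: np.ndarray, bins: int = 256) -> float:
    """Otsu's method on a [0, 1] difference map: choose the threshold
    maximizing between-class variance, so no manual tuning is needed."""
    hist, edges = np.histogram(diff_map.ravel(), bins=bins, range=(0.0, 1.0))
    p = hist.astype(np.float64) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    w0 = np.cumsum(p)                  # probability mass of the low class
    mu = np.cumsum(p * centers)        # cumulative mean
    mu_t = mu[-1]
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)
    sigma_b = np.zeros_like(w0)
    sigma_b[valid] = (mu_t * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return float(centers[np.argmax(sigma_b)])

# Bimodal map: mostly near 0 (background), a small cluster near 0.9 (changes)
rng = np.random.default_rng(0)
dmap = np.clip(rng.normal(0.05, 0.02, 1000), 0, 1)
dmap[:50] = np.clip(rng.normal(0.9, 0.03, 50), 0, 1)
t = otsu_threshold(dmap)
```

On this synthetic bimodal map the threshold lands cleanly between the two clusters, isolating exactly the injected change pixels.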
🚧 Challenges We Ran Into
Challenge 1: The Alignment Problem
Problem: Even tripod-mounted cameras have micro-vibrations causing pixel misalignment, leading to false positive "changes" everywhere.
Solution: Developed hybrid SIFT + ECC alignment pipeline
- Coarse alignment with feature matching (handles rotation/scale)
- Fine alignment with intensity-based optimization (sub-pixel accuracy)
- RANSAC outlier rejection for robustness
Expected Impact: Significant reduction in false positives while maintaining true change detection
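The RANSAC step can be illustrated with a toy robust estimator. To stay dependency-free, this sketch fits only a 2-D translation; the real pipeline fits a full homography (e.g. via OpenCV's `cv2.findHomography` with the RANSAC flag):

```python
import numpy as np

def ransac_translation(src, dst, iters=200, tol=1.0, seed=0):
    """Toy RANSAC: estimate a 2-D translation between matched keypoints
    while rejecting outlier matches. A translation model keeps the
    sketch simple; the production pipeline estimates a homography."""
    rng = np.random.default_rng(seed)
    best_t, best_inliers = np.zeros(2), np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(src))             # minimal sample: one match
        t = dst[i] - src[i]
        inliers = np.linalg.norm(dst - (src + t), axis=1) < tol
        if inliers.sum() > best_inliers.sum():
            best_t, best_inliers = t, inliers
    # refine the estimate on the consensus (inlier) set
    best_t = (dst[best_inliers] - src[best_inliers]).mean(axis=0)
    return best_t, best_inliers

rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(40, 2))
dst = src + np.array([5.0, -3.0])              # true camera shift
dst[:8] += rng.uniform(20, 40, size=(8, 2))    # 8 gross mismatches
t, inliers = ransac_translation(src, dst)
```

Despite 20% of the matches being wrong, the consensus step recovers the true shift and flags every bad match as an outlier, which is exactly why RANSAC suppresses the false-positive "changes" a contaminated fit would cause.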
Challenge 2: Illumination Variations
Problem: Outdoor monitoring faces dramatic lighting changes (clouds, time-of-day, seasons) that naive methods flag as "changes."
Solution: Illumination-Invariant Differencing
- Convert to LAB color space (separates luminance from chrominance)
- Apply Retinex algorithm to separate reflectance from illumination
- Compare only reflectance components
Expected Impact: Robust operation across diverse lighting conditions
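The Retinex idea can be sketched in a few lines: treat the low-frequency component of log-intensity as illumination and subtract it, keeping only reflectance. This toy version works on a grayscale image (the real pipeline operates on the luminance channel after LAB conversion), and the sigma/epsilon values are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def log_reflectance(img: np.ndarray, sigma: float = 15.0) -> np.ndarray:
    """Single-scale Retinex sketch: log-intensity minus its Gaussian
    blur approximates reflectance, discarding smooth illumination."""
    log_i = np.log(img.astype(np.float64) + 1e-6)   # epsilon avoids log(0)
    illumination = gaussian_filter(log_i, sigma)
    return log_i - illumination

rng = np.random.default_rng(0)
scene = rng.uniform(0.2, 1.0, size=(64, 64))
darker = scene * 0.5                      # same scene, cloud passes over
r1 = log_reflectance(scene)
r2 = log_reflectance(darker)
naive_diff = np.abs(darker - scene).mean()
retinex_diff = np.abs(r2 - r1).mean()
```

A global illumination drop that a naive pixel difference flags everywhere vanishes almost entirely in the reflectance domain, since a multiplicative lighting change becomes an additive constant in log space and the blur subtraction cancels it.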
Challenge 3: Speed vs Accuracy Trade-off
Problem: High-resolution processing can be slow, but real-time QC applications need fast feedback.
Solution: Multi-resolution cascade
- Fast pass (downsampled): Quick scan to identify suspicious regions
- Detailed pass (full resolution, ROI only): Deep analysis on flagged areas
Expected Impact: Near real-time performance with minimal accuracy compromise
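The cascade can be sketched as a two-pass NumPy routine: a cheap scan on a downsampled pair flags suspicious tiles, then only those tiles get a full-resolution diff. The strided downsample and thresholds here are illustrative; a production version would average-pool so sub-tile changes cannot slip between samples:

```python
import numpy as np

def cascade_detect(prev, curr, factor=4, coarse_thresh=0.1):
    """Two-pass cascade: coarse scan flags tiles, detailed pass runs
    only on flagged tiles. Factor/threshold values are illustrative."""
    # --- fast pass on a strided downsample ---
    small = np.abs(curr[::factor, ::factor].astype(np.float32)
                   - prev[::factor, ::factor].astype(np.float32)) / 255.0
    flagged = np.argwhere(small > coarse_thresh)
    # --- detailed pass: full-resolution diff, flagged tiles only ---
    detail = np.zeros(prev.shape, dtype=np.float32)
    for (i, j) in flagged:
        r, c = i * factor, j * factor
        tile_prev = prev[r:r + factor, c:c + factor].astype(np.float32)
        tile_curr = curr[r:r + factor, c:c + factor].astype(np.float32)
        detail[r:r + factor, c:c + factor] = np.abs(tile_curr - tile_prev) / 255.0
    return detail, len(flagged)

a = np.zeros((64, 64), dtype=np.uint8)
b = a.copy()
b[8:12, 8:12] = 255                       # one small localized change
detail, n_tiles = cascade_detect(a, b)
```

Only 1 of 256 tiles is examined at full resolution here, which is where the cascade's speedup comes from on mostly-static sequences.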
Challenge 4: Domain Adaptation
Problem: Optimal settings vary wildly across use cases (manufacturing defects vs brand compliance).
Solution: Adaptive thresholding + region weighting
- Automatic threshold selection based on image statistics
- Configurable region importance weights for domain-specific applications
Expected Impact: Minimal configuration needed for new use cases
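Region weighting reduces to multiplying the difference map by a per-pixel importance mask before scoring, so equal-magnitude changes contribute unequally. The mask below is a hypothetical example (a "critical" top-left quadrant), not a real F1 configuration:

```python
import numpy as np

def weighted_score(diff_map: np.ndarray, weights: np.ndarray) -> float:
    """Region-weighted change score: a change in a high-weight region
    (e.g. a front-wing ROI) dominates one in the background."""
    return float((diff_map * weights).sum() / weights.sum())

diff = np.zeros((10, 10))
diff[1, 1] = 1.0       # change inside the critical region
diff[8, 8] = 1.0       # equal-magnitude change in the background
w = np.ones((10, 10))
w[:5, :5] = 10.0       # top-left quadrant deemed critical (illustrative)
score = weighted_score(diff, w)
```

With this mask the critical-region change contributes 10x the background one, which is how domain knowledge (e.g. aerodynamic surfaces vs. livery) is injected without retraining anything.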
🏆 Accomplishments We're Proud Of
✨ Complete Three-Layer Architecture - Designed and implemented end-to-end pipeline from preprocessing to visualization
🏎️ F1-Specific Innovation - Created region-weighted detection system tailored for aerodynamic component monitoring
🎨 Interactive UX - Built real-time sensitivity adjustment that updates visualizations instantly
🔬 Robust Preprocessing - Solved alignment and illumination challenges that plague naive difference methods
📦 Production Architecture - Designed async, scalable system ready for high-volume deployment
🧪 Synthetic Data Pipeline - Created automated test data generation for rapid validation
Formula 1 Use Case Demo Plan
Track changes across race weekends in areas like:
- Front wing endplate geometry modifications
- Rear wing flap angle adjustments
- Floor edge wing element additions
- Sensor/camera mount relocations
- Livery sponsor logo updates
📚 What We Learned
1. Preprocessing is Critical
Initial prototyping revealed that robust preprocessing (alignment, normalization, noise reduction) is more impactful than complex models. Getting the inputs right enables simpler downstream processing.
2. Hybrid Approaches Win
Pure classical CV is fast but brittle. Pure deep learning requires massive datasets. Combining both gives us interpretability, speed, and the ability to handle complex patterns.
3. Domain Knowledge Matters
Understanding F1 technical regulations and aerodynamic design principles helped us focus detection on regions that actually matter, rather than treating all image areas equally.
4. Real-Time Feedback is Essential
Users don't want to wait minutes for results, then adjust settings and wait again. Instant visual feedback from sensitivity changes dramatically improves the user experience.
5. Synthetic Data Accelerates Development
Programmatically generating test cases with known changes let us validate algorithms quickly without waiting for real-world data collection.
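A minimal version of that generator: paste a known patch into a copy of a random "before" frame, add mild sensor noise, and return a ground-truth mask so any detector's output can be scored automatically. Patch size, intensity, and noise range are illustrative choices:

```python
import numpy as np

def make_change_pair(size=64, patch=6, seed=0):
    """Generate a before/after pair with one known synthetic change
    (a pasted bright patch) plus mild noise, and its truth mask."""
    rng = np.random.default_rng(seed)
    before = rng.integers(0, 120, size=(size, size)).astype(np.uint8)
    after = before.copy()
    r, c = rng.integers(0, size - patch, size=2)
    after[r:r + patch, c:c + patch] = 230          # the injected change
    noise = rng.integers(-3, 4, size=after.shape)  # mild sensor noise
    after = np.clip(after.astype(int) + noise, 0, 255).astype(np.uint8)
    truth = np.zeros((size, size), dtype=bool)
    truth[r:r + patch, c:c + patch] = True
    return before, after, truth

before, after, truth = make_change_pair()
# Score a trivial detector against the known ground truth
detected = np.abs(after.astype(int) - before.astype(int)) > 50
iou = (detected & truth).sum() / (detected | truth).sum()
```

Because the injected change is far above the noise floor, even a plain threshold recovers the truth mask perfectly here; harder synthetic cases (lower contrast, alignment jitter) exercise the full pipeline.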
🚀 What's Next for FrameShift
Immediate Priorities (Post-Hackathon)
🎥 Video Stream Processing
- Extend to real-time video analysis
- Temporal smoothing across frame sequences
- Live camera feed integration
🤖 Model Refinement
- Collect real F1 technical images for fine-tuning
- Train custom models for specific change types
- Implement active learning pipeline
📱 Mobile Deployment
- On-device inference for field inspections
- Offline-first architecture
- Lightweight model variants
Medium-Term Goals
🌐 3D Change Detection
- Stereo camera support
- Depth-aware differencing
- Volumetric change quantification
🏗️ Enterprise Features
- Multi-tenant SaaS deployment
- Role-based access control
- Audit trails and compliance reporting
🔌 API Ecosystem
- Pre-built integrations (QC systems, PLM software)
- Webhook notifications
- Batch processing capabilities
Long-Term Vision
🔮 Predictive Analytics
- Time-series forecasting of degradation
- Failure probability estimation
- Maintenance scheduling optimization
🌍 New Domains
- Satellite imagery analysis
- Medical imaging applications
- Security and surveillance
🏗️ Technical Architecture
System Components
Frontend (React + TypeScript)
↓ WebSocket + REST API
Backend (FastAPI)
├── NGINX (Reverse Proxy)
├── Uvicorn (ASGI Server)
└── Celery Workers (Async Processing)
↓
Processing Engine
├── OpenCV (Computer Vision)
├── PyTorch (Neural Networks)
└── NumPy/SciPy (Numerical Computing)
↓
Data Layer
├── PostgreSQL (Metadata)
├── MinIO/S3 (Image Storage)
└── Redis (Task Queue)
Key Design Decisions
1. Async Architecture
- Celery + Redis for distributed task processing
- WebSocket for real-time progress updates
- Non-blocking API design
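The non-blocking pattern can be shown without any infrastructure using plain `asyncio`. In the deployed system the CPU-bound comparison is handed to a Celery worker over Redis; this dependency-free sketch only demonstrates why the API stays responsive while jobs run:

```python
import asyncio

async def analyze_pair(job_id: int) -> dict:
    """Stand-in for the image comparison; in production this work is
    dispatched to a Celery worker rather than run in-process."""
    await asyncio.sleep(0.01)              # simulate work without blocking
    return {"job": job_id, "status": "done"}

async def main():
    # Submit several comparisons concurrently: the event loop stays free
    # to accept new requests while pending jobs make progress.
    jobs = [analyze_pair(i) for i in range(5)]
    return await asyncio.gather(*jobs)

results = asyncio.run(main())
```

All five jobs overlap instead of running back-to-back, which is the same property the WebSocket progress updates rely on.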
2. Microservices Approach
- Preprocessing service
- Detection service
- Classification service
- Visualization service
3. Containerization
- Docker for consistent deployment
- Docker Compose for local development
- Kubernetes-ready design
🛠️ Built With
Core Stack
| Category | Technology | Purpose |
|---|---|---|
| Language | Python 3.11 | Core processing logic |
| CV Framework | OpenCV 4.8+ | Image processing, alignment |
| ML Framework | PyTorch 2.1 | Neural network inference |
| Numerical | NumPy, SciPy | Mathematical operations |
| Frontend | React 18 + TypeScript | Interactive web UI |
| Backend | FastAPI | Async REST API |
| Task Queue | Celery + Redis | Distributed processing |
| Database | PostgreSQL | Metadata storage |
| Storage | MinIO (S3-compatible) | Image storage |
ML Models (Planned)
- EfficientNet-B3: Defect classification (good speed/accuracy balance)
- YOLOv8: Real-time object detection for large changes
- Custom fine-tuning: On F1-specific datasets
Infrastructure
- Docker + Compose: Containerized services
- NGINX: Reverse proxy, load balancing
- Cloud Platform: AWS/GCP/Azure agnostic design
Key Datasets for Training
- MVTec Anomaly Detection: 5,354 high-res images, 15 categories
- NEU Surface Defect: 1,800 images of steel defects
- COCO 2017: Pre-training for object detection
- Custom F1 Collection: Technical documentation images
📊 Expected Performance Profile
Based on preliminary testing and similar systems:
| Metric | Target | Notes |
|---|---|---|
| Latency | <100ms per pair | For real-time QC applications |
| Throughput | 10+ FPS | Concurrent processing |
| Accuracy | Competitive with manual inspection | Human-level on clear cases |
| False Positives | Minimize with adaptive thresholding | Context-dependent |
These are design targets, not validated measurements.
📜 License
This project is licensed under the MIT License.
🤝 Contributing
Built for the MoneyGram Haas F1 Hackathon. Future contributions welcome post-hackathon!
📞 Contact
Team: FrameShift
Hackathon: TrackShift Innovation Challenge
🙏 Acknowledgments
- MoneyGram Haas F1 Team for inspiring this challenge
- F1 Technical Working Group for domain insights
- Open-source computer vision community
- OpenCV, PyTorch, and FastAPI maintainers
FrameShift – Where vision meets precision. Every frame. Every change. Instantly. 🏎️✨