couchbaselabs/pm_apps_celebtwin

🎬 Bollywood CelebLookalike

Find your Bollywood celebrity doppelgänger using AI-powered facial recognition and vector search!


🌟 What It Does

Upload a photo of yourself, and the app finds the Bollywood celebrities you most resemble. It converts your face into an embedding with a face recognition model, then uses vector similarity search to compare that embedding against 12,000+ celebrity images and return your top matches.

Key Features:

  • 📸 Upload any photo with a visible face
  • 🎯 Get top 3 celebrity matches with similarity scores
  • 👨/👩 Filter results by gender (Male/Female)
  • ⚡ Real-time results powered by vector search
  • 🎬 100 Bollywood celebrities including Shah Rukh Khan, Deepika Padukone, Ranveer Singh, and more!

🧠 How It Works

The Logic Flow

┌─────────────────────────────────────────────────────────────────────────────┐
│                              USER UPLOADS PHOTO                              │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  STEP 1: FACE DETECTION & EMBEDDING (Local - Your Machine)                  │
│  ─────────────────────────────────────────────────────────────────────────  │
│  • InsightFace AI model detects the face in your photo                      │
│  • Extracts facial landmarks to align the face crop                         │
│  • Converts face into a 512-dimensional vector (embedding)                  │
│  • This vector is a mathematical representation of your facial features     │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  STEP 2: VECTOR SIMILARITY SEARCH (Couchbase Capella)                       │
│  ─────────────────────────────────────────────────────────────────────────  │
│  • Your face embedding is sent to Couchbase                                 │
│  • Vector Search finds the most similar celebrity embeddings                │
│  • Uses dot product similarity to calculate match scores                    │
│  • Returns top matches ranked by similarity                                 │
└─────────────────────────────────────────────────────────────────────────────┘
                                      │
                                      ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│  STEP 3: DISPLAY RESULTS                                                    │
│  ─────────────────────────────────────────────────────────────────────────  │
│  • Shows your photo alongside top 3 celebrity matches                       │
│  • Displays similarity percentage for each match                            │
│  • Celebrity name and gender badge shown                                    │
└─────────────────────────────────────────────────────────────────────────────┘

What is a Face Embedding?

A face embedding is a numerical representation of facial features. The AI model analyzes:

  • Face shape and structure
  • Eye position, size, and shape
  • Nose characteristics
  • Mouth and lip features
  • Jawline contours
  • Overall facial proportions

These features are encoded into a 512-number array (vector). Similar-looking faces have similar vectors, which allows us to find matches using mathematical distance calculations.
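To make "similar faces have similar vectors" concrete, here is a minimal sketch that uses made-up 4-dimensional vectors as stand-ins for real 512-dimensional embeddings. The values and the names `you`, `lookalike`, and `stranger` are illustrative assumptions, not output from the app.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy 4-dimensional stand-ins for 512-dimensional face embeddings (made-up values).
you = [0.50, 0.10, -0.30, 0.80]
lookalike = [0.48, 0.12, -0.28, 0.79]
stranger = [-0.60, 0.70, 0.40, -0.10]

near = euclidean_distance(you, lookalike)  # small: the vectors almost coincide
far = euclidean_distance(you, stranger)    # large: the vectors point elsewhere
```

A smaller distance means a closer match, so `lookalike` would rank above `stranger` here.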

Similarity Score Explained

The match percentage represents how similar your facial features are to a celebrity:

  • 90%+ = Very strong resemblance
  • 70-90% = Notable similarity
  • 50-70% = Some shared features
  • Below 50% = Minimal similarity

The score is calculated from the dot product between your face embedding and each celebrity embedding. When the embeddings are L2-normalized (unit length), the dot product is equal to cosine similarity.
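The dot product and cosine similarity only agree when both vectors have unit length, which the sketch below illustrates with two small vectors. The `to_percent` helper is a hypothetical display mapping; this README does not say how the app converts a raw score into a percentage.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def cosine_similarity(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def to_percent(score):
    """Hypothetical mapping: clamp the score to [0, 1] and scale to 0-100."""
    return round(max(0.0, min(1.0, score)) * 100, 1)

a = [3.0, 4.0]
b = [4.0, 3.0]

raw_dot = dot(a, b)                                # depends on vector length
unit_dot = dot(l2_normalize(a), l2_normalize(b))   # matches cosine_similarity(a, b)
```

If the stored embeddings were not normalized, a dot-product index would favor longer vectors rather than more similar ones, so normalizing before indexing and before querying is the usual precaution.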


🗄️ Couchbase Services Used

1. Couchbase Capella (Database-as-a-Service)

The cloud-hosted NoSQL database that stores all celebrity data.

2. Document Storage (Key-Value)

Each celebrity image is stored as a JSON document:

{
  "type": "celebrity_face",
  "celebrity_id": 4,
  "celebrity_name": "Shah Rukh Khan",
  "filename": "0004_srk_42.jpg",
  "gender": "male",
  "embedding": [0.023, -0.045, 0.089, ... ] // 512 numbers
}
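As a rough sketch of how such a document could be assembled and checked before it is written to the collection, the helper below mirrors the fields shown above. The function name `make_celebrity_doc`, the validation rules, and the assumption that `gender` is either `"male"` or `"female"` are illustrative and not taken from the project's code.

```python
import json

EMBEDDING_DIM = 512  # dimension stated in this README

def make_celebrity_doc(celebrity_id, celebrity_name, filename, gender, embedding):
    """Build a celebrity_face document, rejecting obviously malformed input."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} values, got {len(embedding)}")
    if gender not in ("male", "female"):  # assumed value set
        raise ValueError(f"unexpected gender: {gender!r}")
    return {
        "type": "celebrity_face",
        "celebrity_id": celebrity_id,
        "celebrity_name": celebrity_name,
        "filename": filename,
        "gender": gender,
        "embedding": [float(x) for x in embedding],
    }

# Placeholder embedding of zeros; a real one would come from the face model.
doc = make_celebrity_doc(4, "Shah Rukh Khan", "0004_srk_42.jpg", "male", [0.0] * EMBEDDING_DIM)
payload = json.dumps(doc)  # JSON text ready to store as a document
```

Checking the dimension up front matters because a vector index built for 512 dimensions cannot meaningfully score an embedding of a different length.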

3. Vector Search (Search Service)

Couchbase's vector search capability enables:

  • Index: celebrity_face_index - indexes the embedding field
  • Dimensions: 512 (matching InsightFace output)
  • Similarity Metric: Dot Product
  • Optimized for: Recall (finding best matches)

The vector search performs Approximate Nearest Neighbor (ANN) search to efficiently find similar faces among 12,000+ embeddings in milliseconds.
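ANN search trades a little exactness for speed. The exact result it approximates is a full scan that scores every document by dot product and keeps the best k. The sketch below shows that baseline in memory, with an optional gender filter; it is not the Couchbase API, and the function name, document contents, and tiny 2-dimensional vectors are assumptions for illustration.

```python
import heapq

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_k_matches(query, docs, k=3, gender=None):
    """Exact (brute-force) top-k by dot product.

    docs: iterable of dicts with "celebrity_name", "gender", "embedding".
    gender: optional "male"/"female" filter applied before scoring.
    Returns (score, celebrity_name) tuples, best first.
    """
    candidates = (d for d in docs if gender is None or d["gender"] == gender)
    scored = ((dot(query, d["embedding"]), d["celebrity_name"]) for d in candidates)
    return heapq.nlargest(k, scored)

# Made-up documents with 2-dimensional embeddings for readability.
docs = [
    {"celebrity_name": "Celebrity A", "gender": "male", "embedding": [1.0, 0.0]},
    {"celebrity_name": "Celebrity B", "gender": "female", "embedding": [0.8, 0.6]},
    {"celebrity_name": "Celebrity C", "gender": "male", "embedding": [0.0, 1.0]},
]

all_results = top_k_matches([1.0, 0.0], docs, k=2)
male_results = top_k_matches([1.0, 0.0], docs, k=2, gender="male")
```

Because the dataset holds many images per celebrity, the top results of a real query could contain the same person more than once; whether the app deduplicates by celebrity is not stated here.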

4. Scopes & Collections

Organized data structure:

Bucket: celebrities
  └── Scope: scope
        └── Collection: celeb (12,094 documents)

🛠️ Tech Stack

| Component | Technology | Purpose |
| --- | --- | --- |
| Frontend | HTML, CSS, JavaScript | User interface |
| Backend | FastAPI (Python) | REST API server |
| Face AI | InsightFace (buffalo_l) | Face detection & embeddings |
| Database | Couchbase Capella | Document & vector storage |
| Search | Couchbase Vector Search | Similarity matching |
| Runtime | Apple M4 Mac | Local embedding generation |

📊 Dataset

Bollywood Celebrity Faces Dataset

| Metric | Value |
| --- | --- |
| Total Images | 12,094 |
| Celebrities | 100 |
| Male Celebrities | 49 (~5,085 images) |
| Female Celebrities | 51 (~6,874 images) |
| Embedding Dimensions | 512 |

Sample Celebrities:

  • Shah Rukh Khan, Salman Khan, Aamir Khan
  • Deepika Padukone, Priyanka Chopra, Alia Bhatt
  • Ranveer Singh, Hrithik Roshan, Ranbir Kapoor
  • Kareena Kapoor, Katrina Kaif, Anushka Sharma
  • ...and 88 more!

🏗️ Architecture Diagram

┌──────────────────────────────────────────────────────────────────────────┐
│                              CLIENT (Browser)                             │
│  ┌─────────────────────────────────────────────────────────────────────┐ │
│  │  • Upload photo                                                      │ │
│  │  • Select gender filter                                              │ │
│  │  • View results                                                      │ │
│  └─────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼ HTTP POST /api/find-lookalike
┌──────────────────────────────────────────────────────────────────────────┐
│                         FASTAPI SERVER (Local)                            │
│  ┌─────────────────────────────────────────────────────────────────────┐ │
│  │  1. Receive uploaded image                                           │ │
│  │  2. Pass to InsightFace model                                        │ │
│  │  3. Get 512-dim face embedding                                       │ │
│  │  4. Query Couchbase vector search                                    │ │
│  │  5. Return matched celebrities                                       │ │
│  └─────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────┘
           │                                          │
           ▼                                          ▼
┌─────────────────────────┐              ┌─────────────────────────────────┐
│   INSIGHTFACE (Local)   │              │     COUCHBASE CAPELLA (Cloud)   │
│  ┌───────────────────┐  │              │  ┌───────────────────────────┐  │
│  │ buffalo_l model   │  │              │  │  Vector Search Index      │  │
│  │ Face detection    │  │              │  │  12,094 celebrity docs    │  │
│  │ 512-dim embedding │  │              │  │  Dot product similarity   │  │
│  └───────────────────┘  │              │  └───────────────────────────┘  │
└─────────────────────────┘              └─────────────────────────────────┘
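The five server-side steps in the diagram can be sketched as one function that takes the embedding and search steps as injected callables. This is a hedged outline only: `find_lookalikes`, `embed_fn`, `search_fn`, and the response shape are hypothetical names chosen for illustration, and the stubs stand in for InsightFace and Couchbase.

```python
def find_lookalikes(image_bytes, embed_fn, search_fn, gender=None, k=3):
    """Receive an image, embed it, search, and return matches.

    embed_fn(image_bytes) -> list of floats, or None if no face is found.
    search_fn(embedding, k, gender) -> list of (celebrity_name, score).
    """
    if not image_bytes:
        raise ValueError("empty upload")
    embedding = embed_fn(image_bytes)        # steps 2-3: model produces the embedding
    if embedding is None:
        return {"matches": [], "error": "no face detected"}
    hits = search_fn(embedding, k, gender)   # step 4: vector search
    matches = [{"celebrity_name": name, "similarity": score} for name, score in hits]
    return {"matches": matches, "error": None}  # step 5: response for the browser

# Stubs standing in for the real model and database (assumptions).
def fake_embed(image_bytes):
    return [1.0, 0.0]

def fake_search(embedding, k, gender):
    return [("Celebrity A", 0.91), ("Celebrity B", 0.84)][:k]

response = find_lookalikes(b"raw-image-bytes", fake_embed, fake_search, gender="male")
```

Passing the two steps in as functions keeps the flow testable without a model download or a database connection.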

🔐 Privacy

  • ✅ Your photo never leaves your machine for AI processing
  • ✅ Only the numerical embedding (512 numbers) is sent to the cloud
  • ✅ Embeddings are not images and are difficult to reverse into a recognizable face
  • ✅ No photos are stored on servers

🚀 Quick Start

# 1. Activate environment
source venv/bin/activate

# 2. Start the server
uvicorn app.main:app --reload --port 8000

# 3. Open browser
open http://localhost:8000

📁 Project Structure

celebapp/
├── app/
│   ├── main.py              # FastAPI application
│   ├── embedding.py         # InsightFace face embedding
│   └── couchbase_client.py  # Couchbase vector search
├── static/
│   └── index.html           # Web UI
├── scripts/
│   ├── process_all_bollywood.py    # Dataset processing
│   └── add_bollywood_gender.py     # Gender labeling
├── data/
│   ├── bollywood_full_processed/   # Celebrity images
│   └── bollywood_full_embeddings/  # Pre-computed embeddings
├── requirements.txt
└── README.md

🎯 API Endpoints

| Endpoint | Method | Description |
| --- | --- | --- |
| / | GET | Web UI |
| /health | GET | Health check |
| /api/find-lookalike | POST | Find celebrity matches |
| /api/celebrity-image/{filename} | GET | Get celebrity image |
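For calling the lookalike endpoint from a script rather than the web UI, the sketch below builds a multipart upload request with the standard library. The form field names `file` and `gender` and the helper name `build_lookalike_request` are guesses, since the request schema is not documented here; check `app/main.py` for the actual parameter names.

```python
import uuid
import urllib.request

def build_lookalike_request(image_bytes, filename, gender=None,
                            base_url="http://localhost:8000"):
    """Build (but do not send) a multipart POST to /api/find-lookalike."""
    boundary = uuid.uuid4().hex
    parts = []
    if gender is not None:
        # Assumed field name "gender".
        parts.append(
            (f'--{boundary}\r\n'
             'Content-Disposition: form-data; name="gender"\r\n\r\n'
             f'{gender}\r\n').encode()
        )
    # Assumed field name "file".
    parts.append(
        (f'--{boundary}\r\n'
         f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
         'Content-Type: application/octet-stream\r\n\r\n').encode()
        + image_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{base_url}/api/find-lookalike",
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )

request = build_lookalike_request(b"\xff\xd8fake-jpeg-bytes", "me.jpg", gender="female")
# With the server running, send it with: urllib.request.urlopen(request)
```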

👨‍💻 Built With

  • Couchbase Capella - Cloud database with vector search
  • InsightFace - State-of-the-art face recognition
  • FastAPI - Modern Python web framework
  • Apple Silicon - Optimized for M-series Macs

Made with ❤️ using Couchbase Vector Search
