Shop Sentinel: The Story Behind the Shield 🛡️

Our Inspiration

Online shopping scams cost consumers billions of dollars annually. We've all been there—browsing a "too good to be true" deal, seeing countdown timers flashing "Only 2 left!", or discovering hidden shipping fees at checkout. The worst part? Many of these deceptive tactics are legal, leaving consumers feeling helpless.

The turning point came when one of our team members almost fell victim to a sophisticated phishing site that perfectly mimicked a well-known brand. The site looked legitimate, had HTTPS, and even had social media links (that turned out to be broken). The red flags only became obvious after entering payment details, and by then it was almost too late.

That's when we realized: What if there were a tool that could analyze websites in real time, using AI to spot what humans might miss? A shield that protects you before you click "Buy Now."

Shop Sentinel was born from the frustration of navigating an increasingly treacherous online shopping landscape. We wanted to democratize security analysis—making enterprise-grade threat detection accessible to every shopper, powered by AI that runs entirely on-device, ensuring privacy and speed.

What We Built

Shop Sentinel is a Chrome extension that acts as your personal shopping security guard. It analyzes e-commerce websites in real time, detecting:

Dark Patterns: Manipulative UI designs that pressure you into purchases
Phishing & Impersonation: Fake sites mimicking legitimate brands
Security Vulnerabilities: Missing HTTPS, weak domain security, suspicious registrations
Policy Red Flags: Hidden fees, restrictive return policies, vague terms

The AI analysis uses Chrome's built-in Gemini Nano model, which runs entirely on your device, so page content is not sent to a cloud AI service.

How We Built It

The Architecture

Building Shop Sentinel was like creating a multi-layered security system. Here's how it all came together:

Frontend (The Brain):

React 19 with TypeScript for type safety and modern UI
Vite for lightning-fast development and optimized builds
TailwindCSS for a beautiful, responsive interface
Zustand for elegant state management

The AI Layer (The Intelligence):

Chrome's Built-in Gemini Nano via the LanguageModel API (Prompt API)
Custom RAG (Retrieval Augmented Generation) system using vector embeddings
Summarizer API for condensing complex policy documents
Smart caching to avoid redundant AI calls

Backend (The Orchestra):

Node.js with Express for job coordination
PostgreSQL for persistent result storage
WebSockets for real-time progress updates
WHOIS API integration for domain verification

Extension Architecture:

Manifest V3 for modern Chrome extension standards
Content scripts for page analysis
Service workers for background processing
Cross-tab synchronization for seamless experience

The Development Journey

Phase 1: Proof of Concept

We started by building a simple heuristic analyzer that checked for basic red flags: missing HTTPS, no contact info, suspicious URLs. It worked, but we knew we could do better with AI.
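A minimal sketch of what such a heuristic pass could look like. The `PageSignals` shape, the `runHeuristics` name, and the specific URL rules (IP-address hosts, three or more hyphens, punycode prefixes) are assumptions for illustration, not the checks Shop Sentinel actually ships.

```typescript
// Hypothetical shape of the signals a content script might collect from a page.
interface PageSignals {
  url: string;
  hasContactInfo: boolean;
}

interface HeuristicFlag {
  id: string;
  reason: string;
}

// Illustrative rules only; the patterns and thresholds are assumptions.
function runHeuristics(page: PageSignals): HeuristicFlag[] {
  const flags: HeuristicFlag[] = [];
  const url = page.url.toLowerCase();

  if (!url.startsWith("https://")) {
    flags.push({ id: "no-https", reason: "Page is not served over HTTPS" });
  }
  if (!page.hasContactInfo) {
    flags.push({ id: "no-contact", reason: "No contact information found" });
  }

  // Host = text between the scheme and the first "/", "?" or "#".
  const host = url.replace(/^[a-z]+:\/\//, "").split(/[/?#]/)[0] ?? "";
  const isIpAddress = /^\d{1,3}(\.\d{1,3}){3}(:\d+)?$/.test(host);
  const hyphenCount = (host.match(/-/g) ?? []).length;
  if (isIpAddress || hyphenCount >= 3 || host.startsWith("xn--")) {
    flags.push({ id: "suspicious-url", reason: "Hostname looks unusual" });
  }
  return flags;
}
```

Each rule is cheap and synchronous, which is why checks of this kind can run before any AI call is made.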

Phase 2: AI Integration

This is where things got exciting. Integrating Chrome's built-in AI meant learning a completely new API ecosystem. We spent days experimenting with prompts, tuning temperature settings, and building a robust session management system. The "aha!" moment came when we realized we could use vector embeddings to create a lightweight knowledge base for dark pattern detection.

Phase 3: The RAG Revolution

Instead of asking AI to analyze every page from scratch, we built a retrieval system that feeds AI relevant examples. This reduced processing time by 40% and improved accuracy. The math behind it?

Generate embeddings: $$E = \text{embed}(\mathrm{page\_content})$$
Retrieve top-k similar cases: $$R = \text{search}(E, k=4)$$
Construct prompt: $$P = \text{combine}(R, \mathrm{page\_content})$$
Get AI analysis: $$A = \text{LanguageModel.prompt}(P)$$
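The four steps above can be sketched in TypeScript as follows. This is an illustration under stated assumptions: the `embed` function here is a toy hashed bag-of-words stand-in (the project's real embedding model is not described), cosine similarity is assumed as the ranking metric, and the `KnownCase` type and prompt wording are hypothetical.

```typescript
// A known example stored in the knowledge base, embedded ahead of time.
interface KnownCase {
  label: string;
  text: string;
  embedding: number[];
}

// E = embed(page_content). Toy stand-in for a real embedding model:
// hashed bag-of-words counts. An assumption for illustration only.
function embed(text: string, dims = 64): number[] {
  const vec = new Array<number>(dims).fill(0);
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    let hash = 0;
    for (let i = 0; i < word.length; i++) {
      hash = (hash * 31 + word.charCodeAt(i)) % dims;
    }
    vec[hash] += 1;
  }
  return vec;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// R = search(E, k): rank stored cases by similarity to the query embedding.
function search(query: number[], cases: KnownCase[], k = 4): KnownCase[] {
  return cases
    .map((c) => ({ c, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.c);
}

// P = combine(R, page_content): put retrieved examples ahead of the page text.
function combine(retrieved: KnownCase[], pageContent: string): string {
  const examples = retrieved
    .map((c, i) => `Example ${i + 1} (${c.label}): ${c.text}`)
    .join("\n");
  return `Known patterns:\n${examples}\n\nPage content:\n${pageContent}\n\nList any dark patterns found.`;
}

// A = LanguageModel.prompt(P): the string returned by combine() would then be
// passed to a Prompt API session's prompt() call (not shown here).
```

Because the knowledge-base embeddings are computed once and stored, only the page itself needs to be embedded at analysis time.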

Phase 4: Real-World Testing

We tested on 100+ e-commerce sites, from Amazon to sketchy dropshipping stores. Each test revealed new edge cases: sites with legitimate HTTPS but fake reviews, domains registered yesterday claiming to be "established since 1999," policy pages so vague they were essentially useless.

Phase 5: Performance & UX

With AI processing taking 15-30 seconds, we needed intelligent caching, progress indicators, and graceful fallbacks. We implemented:

30-minute cache for AI results
Real-time progress updates via WebSockets
Cross-tab synchronization so analysis persists across browser tabs
Heuristic-only fallback mode when the AI model isn't available
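The 30-minute result cache from the list above could be sketched as a small time-to-live map. The `TtlCache` class and its injectable clock are hypothetical names chosen for this example; where the real extension stores its cache is not stated in the write-up.

```typescript
// Minimal time-to-live cache. The clock is injectable so expiry can be tested
// without waiting; by default it uses the system time.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      // Expired: drop the entry so a fresh analysis is triggered.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// 30 minutes, matching the cache lifetime described above.
const THIRTY_MINUTES_MS = 30 * 60 * 1000;
```

A caller would check `cache.get(url)` first and only start a new AI analysis on a miss.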

Challenges We Faced (And How We Overcame Them)

Challenge 1: Chrome AI API Availability

Problem: Chrome's built-in AI requires specific flags and model downloads. Users might not have it enabled.

Solution: We built in graceful degradation: the extension falls back to heuristic analysis when AI is unavailable. We also added clear setup instructions and helpful console messages for users.
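A rough sketch of this fallback logic, assuming the Prompt API's `LanguageModel.availability()` and `LanguageModel.create()` calls and a session `prompt()` method. The `analyze` and `heuristicSummary` functions, the `Verdict` type, and the prompt text are hypothetical; the model is passed in as a parameter so the same code runs where the global `LanguageModel` does not exist.

```typescript
// Minimal structural types for the parts of the Prompt API used here.
interface PromptSession {
  prompt(input: string): Promise<string>;
}
interface LanguageModelLike {
  availability(): Promise<string>;
  create(): Promise<PromptSession>;
}

interface Verdict {
  source: "ai" | "heuristic";
  summary: string;
}

// Hypothetical stand-in for the rule-based checks.
function heuristicSummary(pageText: string): string {
  return /only \d+ left/i.test(pageText)
    ? "Possible scarcity pressure"
    : "No obvious red flags";
}

// Look up the global LanguageModel if the browser provides one.
function detectLanguageModel(): LanguageModelLike | undefined {
  return (globalThis as unknown as { LanguageModel?: LanguageModelLike }).LanguageModel;
}

async function analyze(
  pageText: string,
  model: LanguageModelLike | undefined = detectLanguageModel(),
): Promise<Verdict> {
  try {
    // Only use the model when it is already downloaded and ready.
    if (model && (await model.availability()) === "available") {
      const session = await model.create();
      const summary = await session.prompt(`Assess this shop page:\n${pageText}`);
      return { source: "ai", summary };
    }
  } catch (err) {
    console.warn("Built-in AI failed, using heuristics instead", err);
  }
  return { source: "heuristic", summary: heuristicSummary(pageText) };
}
```

Treating every state other than "available" as a reason to fall back keeps the first analysis fast; a fuller version might also offer to trigger the model download.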

Challenge 2: False Positives

Problem: AI initially flagged every countdown timer as a dark pattern, but legitimate retailers like Amazon use flash sales too!

Solution: Implemented context-aware scoring with a trust factor T based on domain age:

$$T = \min\left(1.0, \frac{\mathrm{domain\_age\_days}}{1095}\right)$$

A countdown timer on a 10-year-old domain (T = 1.0) with verified social media is low risk. The same timer on a 3-day-old domain (T ≈ 0.003) is high risk. The trust factor dampens or amplifies risk signals accordingly.
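As a worked illustration of the trust factor, the sketch below computes T from the formula and applies it to a base risk score. The formula for T comes from the text; the clamp for negative ages and the `adjustedRisk` weighting (a multiplier of 1.5 − T) are assumptions made for this example, since the write-up does not say exactly how T is applied.

```typescript
// 1095 days is three years, the age at which a domain reaches full trust.
const FULL_TRUST_AGE_DAYS = 1095;

// T = min(1.0, domain_age_days / 1095). Negative ages are clamped to 0,
// which is an addition to the formula for robustness.
function trustFactor(domainAgeDays: number): number {
  return Math.min(1.0, Math.max(0, domainAgeDays) / FULL_TRUST_AGE_DAYS);
}

// Assumed weighting: a fully trusted domain (T = 1) halves the signal,
// a brand-new domain (T = 0) amplifies it by 1.5x. The result is capped at 1.
function adjustedRisk(baseRisk: number, domainAgeDays: number): number {
  const multiplier = 1.5 - trustFactor(domainAgeDays);
  return Math.min(1, baseRisk * multiplier);
}
```

With these assumed weights, a countdown-timer signal of 0.4 drops to 0.2 on a 10-year-old domain and rises to roughly 0.6 on a 3-day-old one.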

Challenge 3: Policy Document Parsing

Problem: Policy pages contain thousands of words of legal jargon. How to extract what matters?

Solution: Combined Summarizer API with custom extraction: detect policy pages → Summarizer creates key-point summaries → parse to extract return windows, fees, and restrictions. Result: complex policies condensed into 3-5 bullet points.
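The final parsing step might look something like the sketch below, which pulls a return window, a restocking fee, and a final-sale flag out of summary text. The `PolicyFacts` shape, the `extractPolicyFacts` name, and the regular expressions are assumptions for illustration; the input is assumed to be the plain-text key-point summary produced earlier in the pipeline.

```typescript
interface PolicyFacts {
  returnWindowDays: number | null;
  restockingFeePercent: number | null;
  finalSale: boolean;
}

// Pull a few structured facts out of a short policy summary.
function extractPolicyFacts(summary: string): PolicyFacts {
  // Accept both "returns within 30 days" and "30-day return window".
  const windowMatch =
    summary.match(/returns?[^.\n]*?(\d+)[\s-]*(day|week)s?/i) ??
    summary.match(/(\d+)[\s-]*(day|week)s?[^.\n]*?returns?/i);
  const feeMatch = summary.match(/(\d+(?:\.\d+)?)\s*%\s*restocking/i);

  let returnWindowDays: number | null = null;
  if (windowMatch) {
    const amount = Number(windowMatch[1]);
    const unit = (windowMatch[2] ?? "").toLowerCase();
    returnWindowDays = unit === "week" ? amount * 7 : amount;
  }

  return {
    returnWindowDays,
    restockingFeePercent: feeMatch ? Number(feeMatch[1]) : null,
    finalSale: /final sale|non-?refundable|no returns/i.test(summary),
  };
}
```

Pattern matching like this is brittle on raw legal text, which is one reason to run it on a short summary rather than the full policy page.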

Challenge 4: Real-Time Performance

Problem: AI analysis takes 15-30 seconds. Users don't want to wait staring at a spinner.

Solution: Implemented streaming progress updates showing analysis phases:

Heuristic checks (instant)
AI initialization (2-3s)
Dark pattern detection (5-8s)
Domain analysis (3-5s)
Legitimacy assessment (5-8s)

Progress bars and phase descriptions make the wait feel shorter. Aggressive caching: revisiting a page within 30 minutes shows results instantly.

Challenge 5: Cross-Tab Coordination

Problem: Users might analyze the same site in multiple tabs, causing duplicate analyses.

Solution: We built cross-tab synchronization using the BroadcastChannel API, a web standard that Chrome supports. When one tab completes analysis, all other tabs instantly receive results. The backend coordinates jobs to prevent race conditions.
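One way to structure the tab-side part of this is sketched below. The `AnalysisSync` class, the `AnalysisResult` shape, and the channel name in the wiring comment are hypothetical. The transport is injected as a plain function so the logic does not depend on a browser, and the backend's job coordination is not shown.

```typescript
interface AnalysisResult {
  domain: string;
  riskScore: number;
}

// Validate incoming messages, since anything can arrive on a shared channel.
function isAnalysisResult(data: unknown): data is AnalysisResult {
  if (typeof data !== "object" || data === null) return false;
  const candidate = data as Record<string, unknown>;
  return typeof candidate["domain"] === "string" && typeof candidate["riskScore"] === "number";
}

class AnalysisSync {
  private results = new Map<string, AnalysisResult>();
  private broadcast: (result: AnalysisResult) => void;

  constructor(broadcast: (result: AnalysisResult) => void) {
    this.broadcast = broadcast;
  }

  // Called by the tab that finished an analysis.
  publish(result: AnalysisResult): void {
    this.results.set(result.domain, result);
    this.broadcast(result);
  }

  // Called when a message arrives from another tab.
  receive(data: unknown): void {
    if (isAnalysisResult(data)) {
      this.results.set(data.domain, data);
    }
  }

  get(domain: string): AnalysisResult | undefined {
    return this.results.get(domain);
  }
}

// Possible browser wiring (channel name is hypothetical):
//   const channel = new BroadcastChannel("shop-sentinel-analysis");
//   const sync = new AnalysisSync((result) => channel.postMessage(result));
//   channel.onmessage = (event) => sync.receive(event.data);
```

Note that BroadcastChannel only reaches contexts on the same origin, so wiring like this would connect pages that share the extension's origin rather than arbitrary web pages.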

What We Learned

This project was a crash course in modern web security, AI integration, and user experience design.

Technical Learnings

Chrome AI APIs are powerful but nuanced. The LanguageModel API behaves differently from cloud APIs: it requires careful session management, keeps conversation context within a session, and avoids network latency because inference runs locally. However, availability varies by device, and the first-time model download requires user interaction.
Vector embeddings are a game-changer for RAG. By pre-embedding known dark patterns and policy examples, we reduced token usage by 60% and improved consistency.
Manifest V3 has sharp edges. Migrating from V2 meant rethinking background scripts, message passing, and storage strategies. Service workers are more constrained but more performant.
Real-time UX requires careful state management. We learned to design for partial results, loading states, and error recovery. Users appreciate seeing progress over silence.

Domain Learnings

Dark patterns are everywhere—even on trusted sites. The line between "aggressive marketing" and "deceptive design" is blurry. Our AI uses industry-specific rules: what's acceptable for an established retailer might be a red flag for a new store.
Policy analysis reveals surprising insights. Many "legitimate" sites have policies that heavily favor the seller. Our legitimacy scoring helps users understand what "standard" really means.
Phishing sites are getting smarter. They now use HTTPS, register domains for months (not days), and create social media profiles. Our multi-signal approach catches what single checks miss.

Updates

Rajeev Ranjan Chaurasia started this project — Nov 01, 2025 01:01 AM EDT
