Inspiration

We all know the feeling: you open a short‑video app for “just five minutes,” and two hours later you’re still doom‑scrolling. The platform knows you better than you know yourself—but it keeps that knowledge in a black box.

We were inspired by a simple question: What if you could use AI to turn the algorithm’s own weapon against it?

Instead of being passively shaped by recommendation engines, we wanted to build a tool that helps people see themselves through the algorithm’s eyes—and then break free from the filter bubble. That’s how Candy was born: a self‑awareness and anti‑algorithm toolkit that turns your personal viewing history into a mirror for your mind.


What it does

Candy is a local‑first, privacy‑safe web automation + AI analysis tool.

You grant it access to your own watch history from the last 7–30 days on a short‑video platform (e.g., TikTok, Douyin). Candy then:

  • Parses every video – title, caption, creator, watch time, session patterns.
  • Uses LLMs to decode – content themes, emotional triggers, narrative patterns, clickbait tactics.
  • Generates a deep personal report that includes:
    • Temporal heatmaps (when you binge, when you scroll away fast).
    • Genre drift analysis (how your interests shift within a single session).
    • Psychological profile – anxiety vs. comfort, escapism vs. FOMO, identity fantasies.
    • Algorithm reverse‑engineering – what labels the platform probably assigned to you, and how it keeps you hooked.
    • Actionable advice – how to break the filter bubble and retrain your feed.

All analysis runs entirely on your machine – no passwords uploaded, no cloud storage. You get a beautiful local webpage report that only you can see.
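The report described above can be pictured as a single structured object. Below is a minimal sketch of that shape; the class and field names are hypothetical illustrations, not Candy's actual schema.

```python
from dataclasses import dataclass

# Hypothetical shape of the per-user report; every field name here is
# illustrative, chosen to mirror the report sections described above.
@dataclass
class CandyReport:
    heatmap: dict[str, list[int]]    # weekday -> 24 hourly watch-minute buckets
    genre_drift: list[str]           # ordered genres within a single session
    psych_profile: dict[str, float]  # e.g. {"escapism": 0.7, "fomo": 0.4}
    inferred_labels: list[str]       # labels the platform likely assigned to you
    advice: list[str]                # concrete feed-retraining steps

report = CandyReport(
    heatmap={"Mon": [0] * 24},
    genre_drift=["cooking", "ASMR"],
    psych_profile={"escapism": 0.7},
    inferred_labels=["late-night comfort seeker"],
    advice=["Follow three creators outside your top genre."],
)
print(report.inferred_labels[0])
```

Keeping the whole report as one plain dataclass is what makes the "self-contained HTML" output easy: it serializes to JSON once and is rendered entirely client-side.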


How we built it

  • Browser automation – Playwright (Python) to log in (with user consent) and scrape watch history. Only reads data the user already sees.
  • Data pipeline – Pandas for cleaning, structuring, and feature extraction (session detection, watch duration, scrolling proxy).
  • Content understanding – GPT‑4o, called directly from the user's machine with their own API key, to classify topics, detect emotional tone, narrative patterns, and platform "addiction loops."
  • Temporal analysis – Custom heuristics to identify binge sessions, attention decay, and topic migration.
  • Visualization – HTML/CSS/Chart.js based local dashboard with interactive heatmaps, pie charts, and narrative report.
  • Privacy layer – No telemetry, no external database; all intermediate data deleted after report generation.

The entire pipeline is orchestrated as a command‑line tool that outputs a self‑contained HTML report.
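The session-detection step of the pipeline can be sketched with Pandas. This is a minimal illustration under assumed column names (`watched_at`, `dwell_seconds`), not Candy's actual code: a gap above a threshold starts a new session, and very short dwell times serve as the "scrolled away" proxy mentioned above.

```python
import pandas as pd

# Hypothetical input: one row per viewed video, with a timestamp and
# dwell time in seconds (column names are illustrative).
events = pd.DataFrame({
    "watched_at": pd.to_datetime([
        "2024-05-01 22:00", "2024-05-01 22:02", "2024-05-01 22:03",
        "2024-05-02 08:30",
    ]),
    "dwell_seconds": [45, 8, 120, 30],
})

SESSION_GAP = pd.Timedelta(minutes=30)  # gap that starts a new session

events = events.sort_values("watched_at").reset_index(drop=True)
# A new session begins whenever the gap to the previous event exceeds the threshold.
gap = events["watched_at"].diff()
events["session_id"] = (gap > SESSION_GAP).cumsum()

# Flag likely "scrolled away" videos via very short dwell times.
events["disengaged"] = events["dwell_seconds"] < 10

per_session = events.groupby("session_id").agg(
    videos=("watched_at", "size"),
    total_seconds=("dwell_seconds", "sum"),
    skip_rate=("disengaged", "mean"),
)
print(per_session)
```

The 30-minute gap and 10-second dwell cutoff are placeholder thresholds; in practice they would be tuned per platform and per user.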


Challenges we ran into

  1. Anti‑scraping measures – Platforms detect Playwright. We had to add random delays, realistic mouse movements, and use undetected‑chromedriver.
  2. Session vs. scroll detection – No direct “scroll away” event. We inferred disengagement from very short dwell times and irregular session breaks.
  3. LLM cost & latency – Analyzing hundreds of videos per user. We batched content and used smaller, faster models for initial tagging, reserving GPT‑4o for the deeper psychological analysis.
  4. Emotional ambiguity – A “satisfying” video might be relaxing or ironic. We built multi‑prompt verification to reduce false sentiment labels.
  5. Keeping it local – No backend means the user must have Python + Playwright + API keys. We created an automated installer script to lower the barrier.
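The batching strategy from challenge 3 comes down to grouping many captions into one prompt per LLM call. Here is a minimal sketch; the helper names and prompt wording are illustrative assumptions, not Candy's actual code, and the model call itself is omitted.

```python
from typing import Iterator

def batch(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield successive fixed-size chunks so each LLM call tags many videos at once."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def build_tagging_prompt(captions: list[str]) -> str:
    """One prompt per batch: far cheaper and faster than one call per video."""
    numbered = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(captions))
    return (
        "Assign a topic label and an emotional tone to each caption below.\n"
        f"{numbered}"
    )

captions = [f"video {n}" for n in range(1, 8)]  # 7 dummy captions
prompts = [build_tagging_prompt(chunk) for chunk in batch(captions, size=3)]
print(len(prompts))  # 7 captions in batches of 3 -> 3 prompts
```

A cheap model would consume these batched prompts for the first tagging pass; only the aggregated tags, not every raw caption, would then go to the larger model.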

Accomplishments that we're proud of

First working “algorithm mirror” – We successfully reverse‑engineered plausible platform labels (e.g., “curiosity‑driven knowledge seeker + luxury goods susceptible”) from pure watch history.

Psychology‑grade analysis – The tool doesn’t just say “you watched 3 hours of cooking videos.” It tells you why: “You turn to ASMR cooking after stressful work hours – likely a self‑soothing mechanism.”

100% local, zero trust required – Many hackathon projects promise privacy; Candy actually delivers it. No signup, no data leaves your machine.

Hooked a real user – One team member cut his daily short‑video time by 40% after seeing his “binge‑trigger pattern” in Candy’s report.


What we learned

  • The algorithm is a mirror – Your watch history reveals more about your subconscious needs than your conscious choices.
  • Batching & caching are critical – LLM pipelines need smart chunking to stay fast and cheap.
  • Emotion detection from text is messy – A title like “I’m fine” can be sarcastic or sincere. Context matters. We now also look at emojis, hashtags, and engagement bait patterns.
  • User control changes behavior – Just seeing your own pattern reduces mindless scrolling. Knowledge really is power.
  • Hackathons love cross‑disciplinary work – Our mix of engineering, psychology, and social commentary resonated deeply with judges.

What's next for Candy

🍬 Browser extension version – No Python install. A Chrome extension that tracks your watch history in real time and gives weekly insights.

🧠 Personalized de‑biasing agent – An LLM bot that suggests 3 videos outside your bubble each day, based on your drift analysis.

📊 Community benchmark (opt‑in, anonymized) – “How does your anxiety score compare to 1000 other users?” – but only with explicit, aggregated consent.

🎓 Academic paper – We want to formally study how short‑video algorithms induce attentional tunneling. Candy could become a research instrument for media psychology.

🛡️ Open‑source the core pipeline – So other people can build their own “algorithm mirrors” for YouTube Shorts, Instagram Reels, etc.

Candy is not just a hack – it’s a statement: your attention is yours. Let’s take it back.
