Inspiration

Short videos spread fast, but so do bad claims. Friends and family send clips every single day, with confident statements but no sources. We wanted a “pause → check → show sources” system that works in seconds, not hours.

What it does

A user uploads a video; ReelGuardian transcribes it, extracts 3–6 fact-checkable claims, and returns a verdict for each: supported, refuted, or needs context. Each verdict includes a brief rationale, a confidence score, and up to three citations, biased toward primary sources (.gov/.edu, for example). ReelGuardian also builds a visual summary from sampled frames of the content, so it fact-checks what was on screen, not just what was said.
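For illustration, a single verdict in the response can be sketched like this in Python. The field names and example values are our assumptions based on the description above, not the project's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Verdict:
    claim: str                 # the fact-checkable statement pulled from the transcript
    label: str                 # "supported" | "refuted" | "needs context"
    rationale: str             # brief explanation of the ruling
    confidence: float          # 0.0–1.0
    citations: list[str] = field(default_factory=list)  # up to 3 URLs, primary sources preferred

# Hypothetical example of what one verdict might contain.
example = Verdict(
    claim="The city banned plastic bags in 2021.",
    label="needs context",
    rationale="The ban passed in 2021, but enforcement did not begin until 2022.",
    confidence=0.72,
    citations=["https://example.gov/ordinance-2021"],
)
```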

How we built it

Challenges we ran into

The problems, and how we solved them:

  • Cold starts & CUDA problems: fixed with a cron-driven GPU warmer.
  • LLM JSON chaos: added a tolerant parser plus a strict schema to keep the pipeline flowing.
  • Search noise: biased results toward primary sources and deduplicated URLs.
  • Timestamp alignment: mapped loosely worded claims back to exact ASR segments for context.
  • API rate/latency spikes: rotated models on OpenRouter and fell back to CPU to avoid user-facing failures.
  • Lack of visual information: passed sampled frames as supplementary context.
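The tolerant-parser-plus-strict-schema fix from the second bullet can be sketched roughly as follows: strip any markdown fences the model wrapped around its reply, fall back to the outermost `{...}` span, and then validate required keys before anything reaches the next stage. The helper names and key set are ours, not the project's:

```python
import json
import re

def tolerant_parse(raw: str) -> dict:
    """Best-effort extraction of a JSON object from a noisy LLM reply."""
    # Strip markdown code fences the model may have wrapped around the JSON.
    raw = re.sub(r"```(?:json)?", "", raw).strip()
    # Try the whole string first, then fall back to the outermost {...} span.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        start, end = raw.find("{"), raw.rfind("}")
        if start == -1 or end <= start:
            raise ValueError("no JSON object found in model output")
        return json.loads(raw[start : end + 1])

# Illustrative schema; the real pipeline's required fields may differ.
REQUIRED_KEYS = {"claim", "label", "rationale", "confidence", "citations"}

def validate(verdict: dict) -> dict:
    """Enforce the strict schema so downstream stages never see partial objects."""
    missing = REQUIRED_KEYS - verdict.keys()
    if missing:
        raise ValueError(f"verdict missing keys: {sorted(missing)}")
    return verdict
```

The loose front end keeps one malformed reply from stalling the whole batch, while the strict back end keeps garbage from propagating.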

Accomplishments that we're proud of

  • End-to-end upload → transcript → claims → verdicts with sources in one pass.

  • Solid Modal stack: cache-hot models, warmed GPU, identical behavior on web/worker.

  • Resilient LLM layer (multi-model + loose parsing) that doesn’t derail on imperfect outputs.

  • Clean, human-readable output with verdicts, confidence, and citations that people can trust.
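The multi-model rotation in that resilient layer amounts to trying a ranked list of models and moving on when one fails or times out. A minimal sketch, assuming an injected `call_model` function; the model IDs are placeholders, not the real OpenRouter configuration:

```python
# Ordered preference list; any OpenRouter-style model IDs would slot in here.
MODELS = ["provider-a/large", "provider-b/medium", "provider-c/small"]

def call_with_rotation(prompt: str, call_model) -> str:
    """Try each model in order; raise only if every model fails."""
    errors = []
    for model in MODELS:
        try:
            return call_model(model, prompt)
        except Exception as exc:  # rate limit, timeout, malformed reply, ...
            errors.append((model, exc))
    raise RuntimeError(f"all models failed: {errors}")
```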

What we learned

  • Infra simplicity wins: one pinned image + a shared cache delivers amazing speed.

  • Warmers and caching matter more for UX than fancy prompts.

  • A primary-source bias (and simply citing sources at all) immediately improves trust (and judge confidence), which compensates for occasional LLM slip-ups.

What's next for ReelGuardian

  • Browser extension: inline overlays on TikTok, YouTube Shorts, Reels.

  • Real-time mode for livestreams/spaces; incremental ASR + rolling verdicts.

  • Richer vision: reliable OCR for on-screen text, logo/channel provenance.

  • Proper dialogue isolation and voice detection (narrator vs. dialogue, etc.).

  • Multilingual ASR + diarization.

  • Shareable reports and an API for moderators, creators, and newsrooms.

Built With

  • github
  • modal
  • nextjs
  • openrouter
  • tavily
  • whisper