Inspiration
Short videos spread fast, but so do bad claims. Friends and family send clips every single day, with confident statements but no sources. We wanted a “pause → check → show sources” system that works in seconds, not hours.
What it does
A user uploads a video; ReelGuardian transcribes it, extracts 3–6 fact-checkable claims, and returns a verdict for each: supported, refuted, or needs context. Every verdict includes a brief rationale, a confidence score, and up to 3 citations, biased toward primary sources (.gov and .edu domains, for example). ReelGuardian also builds a visual summary from sampled frames, so it fact-checks what was on screen, not just what was said.
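A single verdict record might look something like the sketch below (the field names are illustrative, not the project's exact schema):

```python
# Hypothetical per-claim verdict record; field names are illustrative.
verdict = {
    "claim": "The city banned gas stoves in all new buildings in 2023.",
    "verdict": "needs context",  # one of: supported, refuted, needs context
    "confidence": 0.72,          # 0..1
    "rationale": "A ban passed, but it covers only new low-rise construction.",
    "citations": [               # up to 3, biased toward primary sources
        "https://www.example.gov/press-release",
        "https://www.example.edu/policy-brief",
    ],
}
```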
How we built it
The pipeline runs on Modal: Whisper handles transcription, LLMs via OpenRouter extract and verify claims, Tavily powers the source search, and a Next.js frontend ties it together.
Challenges we ran into
The problems we hit, and how we solved them:
- Cold starts & CUDA problems: fixed with a cron GPU warmer.
- LLM JSON chaos: added a tolerant parser and a strict schema to keep the pipeline flowing.
- Search noise: biased to primary sources, URL dedupe.
- Timestamp alignment: mapping loose claims back to exact ASR segments for context.
- API rate/latency spikes: model rotation via OpenRouter, plus a CPU fallback, to avoid user-facing failures.
- Lack of visual information: passing sampled frames as supplementary context.
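The tolerant-parser idea above can be sketched like this (a minimal version; the project's actual parser and schema validation are more involved):

```python
import json
import re

def parse_llm_json(text: str) -> dict:
    """Extract the first JSON object from LLM output that may be wrapped
    in markdown fences or surrounded by chatty prose."""
    # Strip ```json ... ``` fences if present.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    # Fall back to the outermost {...} span.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    raw = text[start : end + 1]
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Common LLM slip: trailing commas before } or ].
        return json.loads(re.sub(r",\s*([}\]])", r"\1", raw))
```

A strict schema check then runs on the parsed dict, so malformed fields fail loudly instead of propagating downstream.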
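The search-noise fix (primary-source bias plus URL dedupe) can be approximated as follows; the suffix list and normalization rules here are assumptions, not the exact ones we shipped:

```python
from urllib.parse import urlparse

# Hypothetical list of suffixes treated as "primary" sources.
PRIMARY_SUFFIXES = (".gov", ".edu")

def normalize(url: str) -> str:
    """Canonical key for dedupe: host (minus www.) plus path."""
    p = urlparse(url)
    return p.netloc.lower().removeprefix("www.") + p.path.rstrip("/")

def rank_sources(urls: list[str]) -> list[str]:
    """Dedupe URLs, then sort primary sources (.gov/.edu) first."""
    seen, unique = set(), []
    for url in urls:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)

    def score(url: str) -> int:
        host = urlparse(url).netloc.lower()
        return 0 if host.endswith(PRIMARY_SUFFIXES) else 1

    return sorted(unique, key=score)  # stable sort keeps search order within tiers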
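Timestamp alignment can be done crudely with word overlap against Whisper-style segments, as in this sketch (the real pipeline could just as well use embeddings; this is the simplest version):

```python
def best_segment(claim: str, segments: list[dict]) -> dict:
    """Map a paraphrased claim back to the ASR segment it most likely
    came from, by bag-of-words overlap. Each segment is
    {"start": float, "end": float, "text": str}, the shape Whisper emits.
    """
    claim_words = set(claim.lower().split())

    def overlap(seg: dict) -> int:
        return len(claim_words & set(seg["text"].lower().split()))

    return max(segments, key=overlap)
```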
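Model rotation with retries looks roughly like this (model IDs and retry counts are placeholders; `call_model` stands in for the real OpenRouter client):

```python
import time

# Hypothetical model list; in practice these would be OpenRouter model IDs.
MODELS = ["primary/model-a", "backup/model-b", "backup/model-c"]

def call_with_rotation(prompt, call_model, models=MODELS, retries_per_model=2):
    """Try each model in order, retrying transient failures, so a
    rate-limit or latency spike on one provider never surfaces as a
    user-facing error."""
    last_err = None
    for model in models:
        for attempt in range(retries_per_model):
            try:
                return call_model(model, prompt)
            except Exception as err:  # rate limit, timeout, 5xx, ...
                last_err = err
                time.sleep(0.1 * 2 ** attempt)  # brief exponential backoff
    raise RuntimeError("all models failed") from last_err
```

The real pipeline adds one more tier after this: a CPU fallback, so the request degrades to slower inference rather than failing outright.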
Accomplishments that we're proud of
End-to-end upload → transcript → claims → verdicts with sources in one pass.
Solid Modal stack: cache-hot models, warmed GPU, identical behavior on web/worker.
Resilient LLM layer (multi-model + loose parsing) that doesn’t derail on imperfect outputs.
Clean, human-readable output with verdicts, confidence, and citations that people can trust.
What we learned
Infra simplicity wins: one pinned image plus a shared cache delivers excellent speed.
Warmers and caching matter more for UX than fancy prompts.
Biasing toward primary sources (and simply citing sources at all) immediately improves trust (and judge confidence), which compensates for occasional LLM slip-ups.
What's next for ReelGuardian
Browser extension: inline overlays on TikTok, YouTube Shorts, Reels.
Real-time mode for livestreams/spaces; incremental ASR + rolling verdicts.
Richer vision: reliable OCR for on-screen text, logo/channel provenance.
Proper dialogue isolation and voice detection (narrator vs. dialogue, etc.).
Multilingual ASR + diarization.
Shareable reports and an API for moderators, creators, and newsrooms.
Built With
- github
- modal
- nextjs
- openrouter
- tavily
- whisper