Inspiration

Poor communication in product demos and technical presentations costs U.S. businesses over $1.2 trillion annually.

While watching demos and presentations, we noticed that feedback is often vague and overly focused on delivery, without pinpointing where clarity actually breaks down.

We wanted to build a system that could tell presenters exactly when their demo stopped making sense — and how to fix it — using concrete, timestamped feedback.

What it does

Given a recorded demo and user preferences, the system:

• Breaks the video into timestamped segments
• Identifies issues such as cognitive overload, unsupported claims, and poor structure
• Computes an overall clarity score
• Provides precise timestamps, actionable fixes, and a fully rephrased pitch
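To make the output concrete, here is a minimal sketch of the kind of per-segment feedback record the system produces and one simple way a clarity score could be derived from it. The field names, issue labels, and scoring rule are illustrative assumptions, not the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class SegmentFeedback:
    start_s: float    # segment start time in seconds
    end_s: float      # segment end time in seconds
    issues: list      # hypothetical labels, e.g. ["cognitive_overload"]
    suggestion: str   # actionable fix for this segment
    rephrased: str    # rewritten version of this part of the pitch


def clarity_score(segments):
    """Toy scoring rule: percentage of segments with no flagged issues."""
    if not segments:
        return 100.0
    clean = sum(1 for s in segments if not s.issues)
    return round(100.0 * clean / len(segments), 1)


demo = [
    SegmentFeedback(0.0, 12.5, [], "", ""),
    SegmentFeedback(12.5, 30.0, ["cognitive_overload"],
                    "Introduce one concept at a time.",
                    "Let's start with the core idea before the details."),
]
print(clarity_score(demo))  # → 50.0
```

Because every record carries its own timestamps, the UI can jump straight from a flagged issue to the exact moment in the video.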

Instead of vague advice, users receive clear, actionable guidance to improve their demos.

How we built it

• Used TwelveLabs to index demo videos and extract transcripts with precise timestamps
• Analyzed each transcript segment using LLMs to compute communication signals such as:
  • Concept density
  • Grounding gaps
  • Structure violations
  • Rambling
• Aggregated these signals into a risk score to flag problematic moments
• For flagged segments, generated:
  • Labeled issues
  • Actionable suggestions
  • Alternative rephrasings
• Built a frontend UI that allows users to:
  • Jump directly to problem timestamps
  • Review fixes and the rewritten pitch

Challenges we ran into

• Defining and quantifying clarity in a measurable way
• Avoiding over-flagging while still catching real communication issues
• Designing LLM prompts that produce consistent and explainable feedback
• Ensuring transcript timestamps align accurately with spoken content

Accomplishments that we’re proud of

• Built an end-to-end pipeline from video upload to actionable insights
• Designed interpretable signals for cognitive load and structure
• Delivered timestamped feedback instead of generic suggestions
• Generated a fully rephrased demo pitch, not just isolated comments

What we learned

• Communication clarity can be treated as an engineering problem
• LLMs are most effective when guided by structured signals
• Users value specific, timestamped feedback far more than abstract advice
• Combining video understanding with language models unlocks powerful feedback workflows

What’s next for WaitWhat.ai

• Visual–verbal alignment analysis (slides vs. speech)
• Team-level clarity analytics across multiple demos
• Real-time feedback during live presentations
• Deeper personalization based on audience type and context
