AI-powered tool that detects confusing moments in demo videos using timestamped clarity analysis — with optional roast mode.
Great ideas often fail not because they’re bad, but because the demo explanation collapses under pressure. Traditional feedback like “be clearer” or “slow down” is vague and hard to act on.
WaitWhat.ai analyzes demo videos and provides:
- ⏱️ Exact timestamps where clarity breaks
- 🧠 Detection of 6 clarity signals (rambling, grounding gaps, unsupported claims, etc.)
- 🔧 Actionable fixes for each flagged moment
- 🔥 Roast Mode (Kind → Honest → Brutal)
- 📈 A single clarity score summarizing the overall communication quality
How it works:

- Upload a demo or pitch video
- AI analyzes transcript and visual context
- View flagged segments with root-cause labels
- Jump to timestamps to see confusion in context
- Toggle Roast Mode to change feedback tone
- Review a rewritten pitch generated by AI
| Signal | Detects |
|---|---|
| Concept Spike | Too many buzzwords at once |
| Grounding Gap | Using terms before defining them |
| Trust-Me-Bro | Claims without proof or evidence |
| Visual Mismatch | Speech does not match what’s on screen |
| Structure Order | Disordered pitch structure |
| Ramble Ratio | Low information density & filler words |
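A signal like Ramble Ratio can be approximated as filler-word density. The sketch below illustrates the idea only; the filler list and the 0.15 threshold are assumptions, not the values WaitWhat.ai actually uses:

```python
# Illustrative Ramble Ratio heuristic. The filler set and the 0.15
# threshold are assumptions for this sketch, not production values.
FILLERS = {"um", "uh", "like", "basically", "actually", "so"}

def ramble_ratio(transcript: str) -> float:
    """Fraction of words in the transcript that are filler words."""
    words = transcript.lower().split()
    if not words:
        return 0.0
    return sum(w.strip(".,!?") in FILLERS for w in words) / len(words)

def is_rambling(transcript: str, threshold: float = 0.15) -> bool:
    """Flag a segment whose filler density exceeds the threshold."""
    return ramble_ratio(transcript) > threshold
```

In practice the real signal also factors in information density, but a density check like this is a useful first pass.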
Frontend (Next.js + Tailwind)
- Handles video upload

Backend (FastAPI + Python)
- TwelveLabs: transcript + timestamps
- Gemini: segment-level clarity analysis
- Risk scoring + issue generation

Frontend
- Video player with seek
- Issue list + Roast Mode
- Clarity score display
- User uploads video
- Backend stores + sends to TwelveLabs
- Transcript + timestamps returned
- Transcript windowed into 10s segments
- Gemini analyzes each segment (terms, claims vs evidence, grounding, ramble ratio)
- Risk score calculated per segment
- JSON results returned to frontend
```json
{
  "clarity_score": 63,
  "segments": [
    {
      "segment_id": 1,
      "start_sec": 10,
      "end_sec": 20,
      "label": "Buzzword Overdose",
      "fix": "Introduce acronyms before using them.",
      "tone": {
        "kind": "This part introduces multiple concepts quickly.",
        "honest": "Too many acronyms at once; define Marengo first.",
        "brutal": "Acronym speedrun detected. Confusion level: maximum."
      }
    }
  ]
}
```
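Given this response shape, the Roast Mode toggle reduces to selecting one of the `tone` keys per segment. A small sketch (`result` mirrors the example above; the `feedback` helper is illustrative, not part of the codebase):

```python
import json

# `result` mirrors the example response above; `feedback` is an
# illustrative helper, not part of the actual codebase.
result = json.loads("""
{"clarity_score": 63,
 "segments": [{"segment_id": 1, "start_sec": 10, "end_sec": 20,
               "label": "Buzzword Overdose",
               "fix": "Introduce acronyms before using them.",
               "tone": {"kind": "This part introduces multiple concepts quickly.",
                        "honest": "Too many acronyms at once; define Marengo first.",
                        "brutal": "Acronym speedrun detected. Confusion level: maximum."}}]}
""")

def feedback(segment: dict, roast_mode: str = "kind") -> str:
    """Return the feedback line for the selected roast level."""
    return f'{segment["start_sec"]}s: {segment["tone"][roast_mode]}'
```

Because all three tones ship in one response, switching Roast Mode requires no new API call.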
## 🛠️ Tech Stack

| Component | Technology |
|---|---|
| Frontend | Next.js + Tailwind CSS |
| Backend | FastAPI (Python) |
| Video Understanding | TwelveLabs API |
| LLM Analysis | Google Gemini |
| Deployment | Vercel + Render |
```bash
# Clone the repository, then:
cd backend

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate        # macOS / Linux
# .venv\Scripts\Activate.ps1     # Windows PowerShell

# Install dependencies inside the venv
pip install -r requirements.txt

# Set API keys (or add them to a .env file)
export GEMINI_API_KEY=your-gemini-key
export TWELVE_LABS_API_KEY=your-twelvelabs-key

# Run the backend server with Uvicorn
uvicorn main:app --reload
```

## 🚀 Future Enhancements

- Slide-to-speech alignment scoring
- Real-time demo feedback
- Team analytics dashboard
- Audience-specific clarity tuning
## 🎉 Summary

WaitWhat.ai turns vague demo feedback into precise, timestamped, and actionable clarity insights, wrapped in an optional humor layer. Ideal for improving pitches, onboarding demos, and tech presentations.