Inspiration
Every marketer ships content into a black box and waits days for engagement data to tell them it flopped. But the real signal — attention, emotion, memory, reward — happens in the brain in the first two seconds. Meta’s TRIBE v2 showed you can predict fMRI-style cortical response to multimodal stimuli. We asked: what if a creator could watch their audience’s brain react to a cut and get fix-it notes grounded in actual neuroscience — before publishing?
What it does
- Reads the cortex. Drop in a video, image, script, or YouTube link. Synapse predicts the per-frame response across 10 named cortical regions on the fsaverage5 surface (20,484 vertices), rendered as a live, drag-to-rotate 3D wireframe brain that animates in sync with the video scrubber.
- Three lenses, one run. Toggle between Creator (viral score, hook moments, retention and drop-off map), Editor (cut guidance: TRIM / B-ROLL / CTA with predicted % impact and exact timestamps), and Researcher (per-region activation tables, ±1σ sensitivity analysis, world-state “context rewiring,” full methods).
- Grounds every recommendation in real science. A RAG layer retrieves from 6,827 marketing-psychology, neuromarketing, and consumer-behavior papers (HyDE query expansion → pgvector → LLM cross-encoder rerank → MMR diversification), so the coach’s advice cites real literature instead of vibes.
- Stays honest. Every reading is tier-badged as real multimodal TRIBE v2, baked cached run, or linguistic fallback, so it never fakes a “brain reading.”
How we built it
- Frontend: Next.js 16 (App Router), React 19, Three.js / React-Three-Fiber for the hollow fsaverage5 brain mesh, Framer Motion, Tailwind v4. Fully themed light/dark editorial UI.
- Brain encoder: Meta TRIBE v2 (V-JEPA2 video + Wav2Vec-BERT audio + Llama-3.2 text backbones) served on Modal (A10G GPU, scale-to-zero), with predictions projected from the 20,484 fsaverage5 vertices onto named ROIs. A deterministic linguistic-feature encoder serves as a zero-dependency fallback so the brain always animates.
- RAG: OpenAI text-embedding-3-small → Supabase pgvector, HyDE + cross-encoder rerank + MMR. Corpus built from Semantic Scholar + arXiv.
- Pipelines: browser-side ffmpeg.wasm chunking + Whisper transcription (dodges serverless ffmpeg limits), YouTube transcript/thumbnail evidence extraction, FastAPI backend, Supabase history.
- Deploy: Vercel (Fluid Compute) + Supabase + Modal.
Challenges we ran into
- Serving a 10 GB multimodal GPU model serverlessly: cold starts, gated Llama-3.2 weights, scale-to-zero economics.
- Syncing the 3D brain heat map to the video playhead frame-perfectly without re-rendering React every tick.
- Keeping it honest: building a tiered source system so we never claim a real cortical reading when we fall back to heuristics.
- RAG signal vs. noise: academic abstracts are messy, so HyDE + rerank + MMR were essential to surface actually relevant marketing psychology.
Accomplishments we’re proud of
- A Results page that isn’t a mock: the production UI rendering against a real cached neural run, explorable with no sign-in.
- A 6,827-paper neuromarketing knowledge base that makes the AI coach cite its sources.
- A brain that genuinely moves with the cut, and editing notes that point to mm:ss and predicted lift.
What we learned
The neuroscience of attention and reward, multimodal encoder plumbing, production RAG (retrieval quality is everything), and serverless GPU orchestration.
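To make the retrieval-quality point concrete, here is a minimal sketch of MMR (maximal marginal relevance), the final diversification stage of the retrieval chain. It is an illustration under stated assumptions, not the project's actual code: the function names, the use of cosine similarity, and the `trade_off` default of 0.7 are all hypothetical. Each step picks the candidate that maximizes `trade_off * relevance - (1 - trade_off) * redundancy`, where redundancy is the highest similarity to anything already selected.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(query_vec, doc_vecs, k, trade_off=0.7):
    """Return indices of up to k documents chosen by maximal marginal relevance.

    trade_off (hypothetical default) weights relevance against redundancy:
    1.0 is pure relevance ranking, lower values favor diversity.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    remaining = list(range(len(doc_vecs)))

    while remaining and len(selected) < k:
        def score(i):
            # Redundancy is the closest match among already-selected documents.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance[i] - (1.0 - trade_off) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


# Toy embeddings: documents 0 and 1 are near-duplicates, document 2 covers a different angle.
query = [2.0, 1.0, 0.0]
docs = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.01], [0.0, 1.0, 0.0]]
diverse = mmr_select(query, docs, k=2, trade_off=0.5)
relevance_only = mmr_select(query, docs, k=2, trade_off=1.0)
```

In this toy example, the diversity-weighted call skips the near-duplicate in favor of the less similar document, while `trade_off=1.0` falls back to plain relevance ranking and returns both near-duplicates. In a real pipeline the input vectors would come from the embedding model and the reranked candidate list.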
What’s next
Real-time scoring while you edit, more audience personas, A/B testing at scale, and a fine-tuned encoder trained on first-party engagement data.
Built With
- ffmpeg.wasm
- huggingface
- llama-3.2
- meta-tribe-v2
- modal
- next.js
- openai
- pgvector
- python
- react
- react-three-fiber
- supabase
- tailwindcss
- three.js
- typescript
- v-jepa2
- vercel
- wav2vec-bert