Inspiration
Branding builds identity. Whether you're a high school athlete creating a recruitment tape or a UofTHacks organizer promoting sponsors and the next hackathon, that perfect video tells your story.
But here's the problem: professional video editing is intimidating and expensive. Athletes have game footage but don't know how to stitch it together and promote their brand sponsors. Organizations want to promote their events, but can't afford a videographer.
Meanwhile, professional sports broadcasts make multi-angle switching look seamless, but that requires expensive equipment and trained editors.
We asked ourselves: What if AI could do this automatically? What if anyone with footage could create broadcast-quality recaps? And what if those videos could actually generate revenue through seamless sponsorships?
That's why we built Anchor. We want to democratize professional video editing and help anyone amplify their identity through powerful visual storytelling.
What it does
Anchor transforms multi-angle footage into broadcast-quality reels with integrated advertisements—automatically, with your prompt.
Here's how it works:
Upload Your Footage: Capture your event from multiple angles. Upload videos to Anchor with different camera shots, and tell us what type of event it is.
Add Your Personal Touch: Upload your team anthem, graduation song, or favorite track. This is your identity—your music makes the reel uniquely yours.
Tell Us What You Want: Just describe your vision in plain English:
- "Show me my best moments" - Anchor finds YOU across all the footage
- "Create a high-energy highlight reel" - Get an intense, action-packed edit
- "Track player #23, focus on scoring plays" - Follow a specific person, emphasize key moments
- "Make me a 30-second teaser for YouTube" - Get the perfect length for social media
Watch the Magic Happen: Anchor's AI takes over:
- Automatically syncs all your videos to the same timeline (no more "which angle had the best shot?")
- Intelligently switches between angles like a professional broadcast director
- Syncs scene changes to the beat drops of your music
- Integrates sponsor ads seamlessly into the footage itself, not as annoying popups
How we built it
The system processes footage through a six-stage pipeline:
Upload & Sync: Users upload videos directly to AWS S3 using presigned URLs. Videos are time-aligned using device metadata, then refined with librosa audio fingerprinting.
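The audio-refinement step can be illustrated with a simplified sketch: slide one clip's audio energy envelope against another's and keep the lag that maximizes their cross-correlation. (This is a toy stand-in for the librosa fingerprinting used in the actual pipeline; `best_offset` and its frame-based envelopes are illustrative names, not our API.)

```python
def best_offset(ref: list[float], other: list[float], max_lag: int) -> int:
    """Return the lag (in frames) that best aligns `other` to `ref`
    by maximizing the cross-correlation of their energy envelopes."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_score, best_lag = score, lag
    return best_lag
```

A camera that started two frames late would yield `best_offset(ref, other, 3) == 2`, telling us to shift that clip back on the shared timeline.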
TwelveLabs Analysis: TwelveLabs indexes each video using Marengo 3.0 for visual/audio analysis and Pegasus 1.2 for embedding generation. Users query with natural language and TwelveLabs semantic search returns timestamps to relevant clips. To speed up TwelveLabs processing, we splice up the video clip and make multiple API calls to process all parts of the video in parallel.
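The splice-and-fan-out trick looks roughly like this (the `index_segment` body is a placeholder; the real call goes through the TwelveLabs SDK, whose parameters we omit here):

```python
from concurrent.futures import ThreadPoolExecutor

def index_segment(segment_id: int) -> dict:
    # Placeholder for one TwelveLabs indexing call on a spliced segment.
    # In production this uploads the segment and polls until indexing completes.
    return {"segment": segment_id, "status": "indexed"}

def index_in_parallel(num_segments: int, max_workers: int = 4) -> list[dict]:
    """Index all spliced segments concurrently; pool.map preserves order,
    so results line up with the original timeline."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(index_segment, range(num_segments)))
```

Because each segment is indexed independently, total wall-clock time approaches that of the longest segment rather than the whole video.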
Intelligent Editing: For every 2-second interval, the system scores each camera angle by combining embedding similarity to the user's desired vibe with event-specific rules. The highest-scoring angle is selected with minimum 4-second holds between switches. If music is uploaded, librosa detects beats and snaps cuts to the nearest beat drop.
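The core of the angle-switching logic can be sketched as a greedy pass with a hold constraint, plus a helper that snaps a proposed cut to the nearest beat (in production the scores come from embedding similarity and the beats from librosa; here both are plain lists for illustration):

```python
def pick_angles(scores: list[list[float]], interval_s: float = 2.0,
                min_hold_s: float = 4.0) -> list[int]:
    """scores[t][a] = score of camera angle a during interval t.
    Greedily pick the best-scoring angle, but refuse to switch away
    from the current angle until it has been held for min_hold_s."""
    timeline, current, held = [], None, 0.0
    for interval in scores:
        best = max(range(len(interval)), key=lambda a: interval[a])
        if current is None or (best != current and held >= min_hold_s):
            current, held = best, 0.0
        timeline.append(current)
        held += interval_s
    return timeline

def snap_cut(cut_time: float, beats: list[float]) -> float:
    """Snap a proposed cut time to the nearest detected beat."""
    return min(beats, key=lambda b: abs(b - cut_time))
```

With 2-second intervals and a 4-second hold, an angle must win at least two consecutive intervals' worth of time before the next switch is allowed, which is what keeps the edit from feeling twitchy.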
Async Processing: Celery workers with Redis handle long-running jobs. Supabase Realtime broadcasts processing status via WebSocket so users see live progress.
Video Assembly: FFmpeg renders the final output by cutting clips using TwelveLabs timestamps, applying zoom on high-action moments, concatenating with crossfades, and mixing audio with intelligent ducking during speech.
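A single crossfade join in that render step boils down to building an FFmpeg command with the `xfade` filter. This sketch shows just that one piece (the full pipeline also trims clips at TwelveLabs timestamps and handles audio ducking, which we omit):

```python
def crossfade_cmd(clip_a: str, clip_b: str, out: str,
                  a_duration: float, fade_s: float = 1.0) -> list[str]:
    """Build an ffmpeg command that crossfades clip_b onto the end of
    clip_a. The xfade offset is where, in clip_a's timeline, the fade
    begins: its duration minus the fade length."""
    offset = a_duration - fade_s
    return [
        "ffmpeg", "-i", clip_a, "-i", clip_b,
        "-filter_complex",
        f"[0:v][1:v]xfade=transition=fade:duration={fade_s}:offset={offset}[v]",
        "-map", "[v]", out,
    ]
```

For a 10-second first clip, `crossfade_cmd("a.mp4", "b.mp4", "out.mp4", 10.0)` starts the fade at the 9-second mark.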
Native Sponsorships: Google Veo generates product videos matching the footage's visual style. These are inserted at natural transition points using FFmpeg crossfades. Products are fetched from connected Shopify stores via OAuth.
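Choosing where a sponsor clip lands can be sketched as: spread the ads evenly across the runtime, then snap each target to the nearest natural transition so the ad sits on an existing cut. (This is an illustrative heuristic, not necessarily the exact placement logic we ship.)

```python
def ad_insert_points(transitions: list[float], num_ads: int,
                     total_s: float) -> list[float]:
    """Spread num_ads evenly over total_s seconds, then snap each
    target time to the nearest transition point from the edit."""
    targets = [total_s * (i + 1) / (num_ads + 1) for i in range(num_ads)]
    return [min(transitions, key=lambda t: abs(t - tgt)) for tgt in targets]
```

For a 40-second reel with cuts at 5, 12, 20, 28, and 35 seconds, two ads land at the 12s and 28s transitions, closest to the even thirds of the runtime.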
Challenges we ran into
- Parallelizing and batching video processing for TwelveLabs
- Compressing audio and optimizing upload speed to AWS S3 Bucket
- Making the editing and transitions look polished rather than jarring
- Configuring Celery workers and Redis for async video processing—managing worker memory limits for large FFmpeg jobs and preventing race conditions in real-time status updates
- Running into rate limit issues with Google Veo (we're capped at 10 video generations per day)
Accomplishments that we're proud of
We all tried pair programming, and it worked great: bouncing ideas off each other caught many bugs early and sped up development significantly.
We're also proud of building a cohesive development environment, resulting in no major merge conflicts and 100% developer uptime. Through this, we were able to ship an app that both looks great and works well.
What we learned
- TwelveLabs video understanding and semantic search
- Batching and parallelizing processing at scale
- FFmpeg for video editing and splicing clips together, combined with video generation from Veo
- Audio syncing using librosa audio fingerprinting
What's next for Anchor
- Scalability so we can launch to the public
- Optimizing editing and video processing