Create adaptive cinematic soundtracks powered by Google Lyria.
Watch Live Demo • Try it Now
Timbre is a multimodal, adaptive scoring engine that creates real-time, context-aware music for your videos. By reading visual cues and spoken dialogue, Timbre infers the mood of each scene and uses Google Lyria to generate a synchronized soundtrack that evolves with your story.
- 🎵 Real-time music generation using Google Lyria's streaming API
- 📽️ Automatic scene segmentation (PySceneDetect + OpenCV)
- 🧠 Multimodal LLM analysis for mood, emotion, pacing
- ⚡ Low-latency WebSocket audio streaming with custom buffering layer
- 🛡️ Fault-tolerant session manager using Redis + resumable Lyria sessions
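One common way to get the kind of fault tolerance listed above is to wrap the connection step in a retry loop with exponential backoff. The sketch below is only an illustration of that idea, not Timbre's actual session manager: `with_reconnect`, its parameters, and the `connect` callable are hypothetical names, and the real project may handle reconnection differently.

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_reconnect(
    connect: Callable[[], Awaitable[T]],  # hypothetical: opens a streaming session
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
) -> T:
    """Call `connect` until it succeeds, backing off exponentially between tries."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delay doubles each attempt, capped at max_delay, with random jitter
            # so that many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))
    # Only reached when max_attempts < 1, i.e. the loop body never ran.
    raise RuntimeError("max_attempts must be at least 1")
```

A caller would pass in whatever coroutine opens the session, for example `await with_reconnect(open_session)`, where `open_session` is a placeholder for the project's own connection code.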
- Video Upload - Client sends video file via multipart upload
- Parallel Processing - Concurrent frame extraction (OpenCV + PySceneDetect) and audio transcription
- LLM Musical Script - AI analyzes visual/audio content to generate tempo, key, mood timeline
- Session Creation - Redis stores analysis results and streaming configuration
- WebSocket Connection - Real-time bidirectional communication established
- Lyria Integration - Google's Lyria RT API receives musical prompts and streams audio
- Dynamic Adaptation - System adjusts musical parameters based on scene changes
- Seamless Delivery - 2-second audio chunks with smooth crossfading
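To make the "musical script" and "dynamic adaptation" steps above more concrete, here is a minimal sketch of how a tempo/key/mood timeline could be represented and queried during playback. Everything in it is an assumption for illustration: the `Cue` fields, the `active_cue` and `to_prompt` helpers, and the sample values are hypothetical and do not reflect Timbre's actual data model or prompt format.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Cue:
    start: float    # scene start time in seconds
    tempo_bpm: int  # suggested tempo for the scene
    key: str        # suggested musical key
    mood: str       # short mood description


def active_cue(timeline: list[Cue], t: float) -> Cue:
    """Return the cue for playback time t; timeline must be sorted by start."""
    starts = [cue.start for cue in timeline]
    # Index of the last cue whose start is <= t; clamp to the first cue for t < 0.
    i = bisect_right(starts, t) - 1
    return timeline[max(i, 0)]


def to_prompt(cue: Cue) -> str:
    """Flatten a cue into a text prompt (the format here is made up)."""
    return f"{cue.mood}, {cue.tempo_bpm} bpm, {cue.key}"


# Example timeline with three scenes (values invented for the example).
timeline = [
    Cue(0.0, 70, "D minor", "tense ambient"),
    Cue(12.5, 120, "E minor", "driving chase"),
    Cue(31.0, 60, "C major", "warm resolution"),
]
```

With a structure like this, a streaming loop can call `active_cue(timeline, playback_time)` on every chunk and only send a new prompt when the returned cue changes.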
- Inference Speed: Switched to Groq and parallelized scene analysis because waiting for LLMs is boring.
- Lyria Stability: Engineered a custom heartbeat and reconnection system to keep the Google Lyria WebSocket alive during long sessions.
- Audio Artifacts: Wrote a crossfading algorithm to smooth out jarring "pops" between generated audio chunks.
- Redis Latency: Implemented pipelining and connection pooling to prevent bottlenecks during high-frequency state updates.
- Error Recovery: Added automatic retries and state migration so a single network blip doesn't crash the whole stream.
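The README does not describe the crossfading algorithm itself, so the following is only a sketch of one simple approach: a linear crossfade over a short overlap between consecutive chunks. The `crossfade` function and its parameters are hypothetical, and samples are plain Python floats to keep the example dependency-free; a real implementation would more likely operate on NumPy arrays or raw PCM buffers.

```python
def crossfade(prev: list[float], nxt: list[float], overlap: int) -> list[float]:
    """Join two mono sample buffers, blending the last `overlap` samples of
    `prev` with the first `overlap` samples of `nxt` using a linear ramp."""
    overlap = min(overlap, len(prev), len(nxt))
    if overlap <= 0:
        return prev + nxt  # nothing to blend, plain concatenation

    head = prev[:-overlap]  # untouched part of the outgoing chunk
    tail = nxt[overlap:]    # untouched part of the incoming chunk
    blended = []
    for i in range(overlap):
        # Weight of the incoming chunk rises linearly across the overlap.
        w = (i + 1) / (overlap + 1)
        outgoing = prev[len(prev) - overlap + i]
        blended.append(outgoing * (1.0 - w) + nxt[i] * w)
    return head + blended + tail
```

Because the two weights always sum to 1, a constant signal passes through unchanged, which is what removes the step discontinuity that is heard as a "pop". An equal-power curve (sine/cosine weights) is a common alternative when the two chunks are uncorrelated.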
- FastAPI - High-performance async API framework
- Python 3.13 - Latest language features and performance
- Redis - Session state and real-time data management
- PySceneDetect - Intelligent video scene analysis
- OpenCV - Computer vision and frame processing
- Google Lyria RT - Real-time music generation
- WebSockets - Low-latency bidirectional communication
- Next.js 14 - React framework with App Router
- React 19 - Latest React features and concurrent rendering
- AWS Amplify - Authentication (Cognito) and deployment
- Tailwind CSS - Utility-first styling
- Framer Motion - Smooth animations and transitions
- TypeScript - Type-safe development
- Docker Compose - Containerized development environment
- Turborepo - Monorepo build system and caching
- pnpm - Fast, disk space efficient package manager
- UV - Ultra-fast Python package installer and resolver
timbre/
├── apps/
│   ├── backend/                 # FastAPI application
│   │   ├── service/             # Core business logic
│   │   │   ├── auth/            # Authentication services
│   │   │   ├── global_eval/     # Video analysis engine
│   │   │   ├── lyria/           # Lyria API integration
│   │   │   └── video/           # Video processing utilities
│   │   ├── utils/               # Shared utilities
│   │   │   ├── audio/           # Audio processing
│   │   │   ├── video/           # Video manipulation
│   │   │   ├── llm/             # LLM integration & prompts
│   │   │   └── helper/          # Common utilities
│   │   ├── models/              # Data models
│   │   └── tests/               # Test suite
│   └── frontend/                # Next.js application
│       ├── src/app/             # App Router pages
│       ├── src/components/      # React components
│       └── src/lib/             # Frontend utilities
├── packages/                    # Shared packages
│   ├── eslint-config/           # Linting configuration
│   └── typescript-config/       # TypeScript settings
└── docker-compose.yml           # Development environment
- Docker & Docker Compose
- Node.js 18+ and pnpm
- Python 3.13+ and uv
- Google Cloud Project with Lyria API access
# Clone the repository
git clone https://github.com/saat-sy/timbre.git
cd timbre
# Install Node.js dependencies
pnpm install
# Set up environment variables
cp apps/backend/.env.example apps/backend/.env
cp apps/frontend/.env.example apps/frontend/.env
# Configure your API keys and database URLs
# Start the development environment
docker-compose up -d
# Run both frontend and backend
pnpm dev
That's it!
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Export mode - Export an MP4 file with background music
- Advanced scene detection - More advanced emotion detection to understand the scene in depth
- Multi-character emotional arcs - Track and score individual character journeys
- CI/CD Pipeline - GitHub Actions for automated testing and deployment
- Production deployment - Live staging and production environments
- Monitoring & observability - Error tracking, performance metrics, and alerting
- Load testing - Performance validation under high concurrent usage
- Rate limiting - DDoS protection and API throttling
- Frontend test suite - React component and integration testing
- Backend test expansion - Increased unit test coverage and API testing
- End-to-end tests - Full user workflow automation
- Security scanning - Automated vulnerability detection
- Performance benchmarking - Latency and throughput optimization
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the project
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Saatwik Yajaman - MSCS Student at USC
Building the future of AI-powered creative tools.
- Email: yajaman@usc.edu
- LinkedIn: linkedin.com/in/saatwik-yajaman
- GitHub: @saat-sy
Always excited to discuss AI, music technology, and creative engineering!
