Timbre - Real-Time AI Music Generation for Videos

Create adaptive cinematic soundtracks powered by Google Lyria.

🎼 What Is Timbre?

Timbre is a multimodal, adaptive scoring engine that creates real-time, context-aware music for your videos. By reading visual cues and spoken dialogue, Timbre identifies the exact vibe of every moment and uses Google Lyria to generate a perfectly synchronized soundtrack that evolves with your story.

✨ Key Features

🎵 Real-time music generation using Google Lyria's streaming API
📽️ Automatic scene segmentation (PySceneDetect + OpenCV)
🧠 Multimodal LLM analysis for mood, emotion, pacing
⚡ Low-latency WebSocket audio streaming with custom buffering layer
🛡️ Fault-tolerant session manager using Redis + resumable Lyria sessions

🏗️ System Overview

⚙️ How It Works

Analysis Phase

Video Upload - Client sends video file via multipart upload
Parallel Processing - Concurrent frame extraction (OpenCV + PySceneDetect) and audio transcription
LLM Musical Script - AI analyzes visual/audio content to generate tempo, key, mood timeline
Session Creation - Redis stores analysis results and streaming configuration

Streaming Phase

WebSocket Connection - Real-time bidirectional communication established
Lyria Integration - Google's RT API receives musical prompts and streams audio
Dynamic Adaptation - System adjusts musical parameters based on scene changes
Seamless Delivery - 2-second audio chunks with smooth crossfading

🧠 Engineering Challenges & Solutions

Inference Speed: Switched to Groq and parallelized scene analysis because waiting for LLMs is boring.
Lyria Stability: Engineered a custom heartbeat and reconnection system to keep the Google Lyria WebSocket alive during long sessions.
Audio Artifacts: Wrote a crossfading algorithm to smooth out jarring "pops" between generated audio chunks.
Redis Latency: Implemented pipelining and connection pooling to prevent bottlenecks during high-frequency state updates.
Error Recovery: Added automatic retries and state migration so a single network blip doesn't crash the whole stream.

🛠️ Tech Stack

Backend

FastAPI - High-performance async API framework
Python 3.13 - Latest language features and performance
Redis - Session state and real-time data management
PySceneDetect - Intelligent video scene analysis
OpenCV - Computer vision and frame processing
Google Lyria RT - Real-time music generation
WebSockets - Low-latency bidirectional communication

Frontend

Next.js 14 - React framework with App Router
React 19 - Latest React features and concurrent rendering
AWS Amplify - Authentication (Cognito) and deployment
Tailwind CSS - Utility-first styling
Framer Motion - Smooth animations and transitions
TypeScript - Type-safe development

Infrastructure

Docker Compose - Containerized development environment
Turborepo - Monorepo build system and caching
pnpm - Fast, disk space efficient package manager
UV - Ultra-fast Python package installer and resolver

📁 Project Structure

timbre/
├── apps/
│   ├── backend/                # FastAPI application
│   │   ├── service/            # Core business logic
│   │   │   ├── auth/           # Authentication services
│   │   │   ├── global_eval/    # Video analysis engine
│   │   │   ├── lyria/          # Lyria API integration
│   │   │   └── video/          # Video processing utilities
│   │   ├── utils/              # Shared utilities
│   │   │   ├── audio/          # Audio processing
│   │   │   ├── video/          # Video manipulation
│   │   │   ├── llm/            # LLM integration & prompts
│   │   │   └── helper/         # Common utilities
│   │   ├── models/             # Data models
│   │   └── tests/              # Test suite
│   └── frontend/               # Next.js application
│       ├── src/app/            # App Router pages
│       ├── src/components/     # React components
│       └── src/lib/            # Frontend utilities
├── packages/                   # Shared packages
│   ├── eslint-config/          # Linting configuration
│   └── typescript-config/      # TypeScript settings
└── docker-compose.yml         # Development environment

🚀 Installation & Running Locally

Prerequisites

Docker & Docker Compose
Node.js 18+ and pnpm
Python 3.13+ and uv
Google Cloud Project with Lyria API access

Quick Start

# Clone the repository
git clone https://github.com/saat-sy/timbre.git
cd timbre

# Install Node.js dependencies
pnpm install

# Set up environment variables
cp apps/backend/.env.example apps/backend/.env
cp apps/frontend/.env.example apps/frontend/.env
# Configure your API keys and database URLs

# Start the development environment
docker-compose up -d

# Run both frontend and backend
pnpm dev

🎉 That's it!

🗺️ Roadmap

Core Features

Export mode - Export an MP4 file with background music
Advanced scene detection - More advanced emotion detection to understand the scene in depth
Multi-character emotional arcs - Track and score individual character journeys

DevOps & Production

CI/CD Pipeline - GitHub Actions for automated testing and deployment
Production deployment - Live staging and production environments
Monitoring & observability - Error tracking, performance metrics, and alerting
Load testing - Performance validation under high concurrent usage
Rate limiting - DDoS protection and API throttling

Testing & Quality

Frontend test suite - React component and integration testing
Backend test expansion - Increased unit test coverage and API testing
End-to-end tests - Full user workflow automation
Security scanning - Automated vulnerability detection
Performance benchmarking - Latency and throughput optimization

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Fork the project
Create your feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📬 Contact & About Me

Saatwik Yajaman - MSCS Student at USC
Building the future of AI-powered creative tools.

📧 Email: yajaman@usc.edu
💼 LinkedIn: linkedin.com/in/saatwik-yajaman
🐙 GitHub: @saat-sy

Always excited to discuss AI, music technology, and creative engineering!

Name		Name	Last commit message	Last commit date
Latest commit History 164 Commits
.github		.github
apps		apps
assets		assets
packages		packages
.env.example		.env.example
.gitignore		.gitignore
.npmrc		.npmrc
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
docker-compose.yml		docker-compose.yml
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
turbo.json		turbo.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Timbre - Real-Time AI Music Generation for Videos

🎼 What Is Timbre?

✨ Key Features

🏗️ System Overview

⚙️ How It Works

Analysis Phase

Streaming Phase

🧠 Engineering Challenges & Solutions

🛠️ Tech Stack

Backend

Frontend

Infrastructure

📁 Project Structure

🚀 Installation & Running Locally

Prerequisites

Quick Start

🗺️ Roadmap

Core Features

DevOps & Production

Testing & Quality

📄 License

🤝 Contributing

📬 Contact & About Me

About

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Timbre - Real-Time AI Music Generation for Videos

🎼 What Is Timbre?

✨ Key Features

🏗️ System Overview

⚙️ How It Works

Analysis Phase

Streaming Phase

🧠 Engineering Challenges & Solutions

🛠️ Tech Stack

Backend

Frontend

Infrastructure

📁 Project Structure

🚀 Installation & Running Locally

Prerequisites

Quick Start

🗺️ Roadmap

Core Features

DevOps & Production

Testing & Quality

📄 License

🤝 Contributing

📬 Contact & About Me

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!

Languages