An intelligent DJ application that reads the room's energy through your camera and microphone, then curates the perfect musical vibe in real time using Google's Gemini AI.
DeepJ bridges the gap between human intuition and AI-powered music curation. Traditional music apps require manual selection, while DeepJ:
- Reads the Room: Uses computer vision and audio analysis to detect mood and energy levels
- Adapts in Real-Time: Continuously adjusts music selection based on the environment
- Two Music Modes: Choose between pre-recorded tracks or AI-generated live music
- Perfect for Any Setting: Parties, study sessions, work environments, or chill hangouts
The project demonstrates the power of multimodal AI by combining video, audio, and generative music APIs to create an autonomous DJ experience.
- Node.js (v18+)
- A modern web browser with camera/microphone permissions
- Google Gemini API key
# Clone the repository
git clone https://github.com/yourusername/DeepJ.git
cd DeepJ
# Install dependencies
npm install
# Start development server
npm run dev
- Create a .env file:
VITE_GEMINI_API_KEY=your_api_key_here
- Update services/geminiService.ts and components/DJInterface.tsx to use:
const apiKey = import.meta.env.VITE_GEMINI_API_KEY;

User Camera/Mic → Gemini Live API → Mood Detection → Music Selection
                                                            ↓
                    Track Queue ← Genre Selection ← Song Database
                                                            ↓
                    LiveMusicHelper → Lyria API → AI-Generated Music
- App.tsx: Main app orchestrator with three screens (intro, DJ interface, end)
- DJInterface.tsx: Primary UI component managing playback, camera feed, and user controls
- EndSession.tsx: Session completion screen with restart option
Two-Stage Mood Detection Pipeline:
- Stage 1 - Live API: Real-time video/audio analysis for mood detection
  - Uses the gemini-2.5-flash-native-audio-preview model
  - Streams camera feed and microphone input
  - Calls the reportMood function when confident (>70%)
  - Detects: chilling, focusing, partying, happy, sad
- Stage 2 - Standard API: Genre selection based on detected mood
  - Uses the gemini-2.5-flash model for text generation
  - Maps mood + energy level to appropriate genres
  - Throttled to once per 30 seconds to prevent spam
- Reconnection Logic: Automatic session recovery with exponential backoff
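The two stages above can be sketched as a confidence gate followed by a throttled call. This is a minimal illustration rather than the project's actual code: the `MoodReport` shape, the 0..1 confidence scale, and the names `acceptMood` and `throttle` are assumptions (the repository lists a `lib/throttle.ts`, but its API is not shown here), and the leading-edge throttling behavior is one possible choice.

```typescript
// Moods listed in the README; everything else here is a hypothetical sketch.
const MOODS = ["chilling", "focusing", "partying", "happy", "sad"] as const;
type Mood = (typeof MOODS)[number];

interface MoodReport {
  mood: string;       // raw value from the model's reportMood call (assumed shape)
  confidence: number; // assumed to be in the range 0..1
}

const CONFIDENCE_THRESHOLD = 0.7; // ">70%" from the README
const GENRE_THROTTLE_MS = 30000;  // "once per 30 seconds"

function isMood(value: string): value is Mood {
  return (MOODS as readonly string[]).includes(value);
}

/** Stage 1 gate: keep only confident reports that name a known mood. */
function acceptMood(report: MoodReport): Mood | null {
  if (report.confidence <= CONFIDENCE_THRESHOLD) return null;
  const { mood } = report;
  return isMood(mood) ? mood : null;
}

/**
 * Leading-edge throttle: runs `fn` at most once per `intervalMs` and drops
 * calls that arrive inside the window. Returns true when `fn` actually ran.
 * The clock is injectable so the behavior can be checked without waiting.
 */
function throttle<A extends unknown[]>(
  fn: (...args: A) => void,
  intervalMs: number,
  now: () => number = Date.now,
): (...args: A) => boolean {
  let lastRun = -Infinity;
  return (...args: A): boolean => {
    const t = now();
    if (t - lastRun < intervalMs) return false;
    lastRun = t;
    fn(...args);
    return true;
  };
}

// Stage 2 entry point: a placeholder for the genre-selection request.
const requestedMoods: Mood[] = [];
const selectGenre = throttle((mood: Mood) => {
  requestedMoods.push(mood);
}, GENRE_THROTTLE_MS);

const accepted = acceptMood({ mood: "partying", confidence: 0.85 });
if (accepted !== null) selectGenre(accepted);
```

Dropping calls inside the window (instead of queueing them) keeps the genre request tied to the most recent accepted mood the next time the window opens.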
Manages Google's Lyria Realtime API for AI-generated music:
- Prompt-Based Generation: Weighted prompts control musical style
- Adaptive Playback: Adjusts in real-time based on mood changes
- Audio Stream Management: Buffers and plays generated audio chunks
- Event System: Emits playback state changes and errors
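As an illustration of the event system described above, the following is a minimal sketch of a playback-state emitter. The class name `PlaybackStateEmitter` and its methods are hypothetical; the real LiveMusicHelper may well be built on `EventTarget` or a different mechanism. Only the four state names come from this README.

```typescript
// State names as listed in the README's PlaybackState type.
type PlaybackState = "stopped" | "playing" | "loading" | "paused";
type StateListener = (state: PlaybackState) => void;

// Hypothetical emitter; not the project's actual LiveMusicHelper API.
class PlaybackStateEmitter {
  private state: PlaybackState = "stopped";
  private listeners = new Set<StateListener>();

  /** Register a listener; the returned function removes it again. */
  subscribe(listener: StateListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  /** Update the state and notify listeners only when it actually changes. */
  setState(next: PlaybackState): void {
    if (next === this.state) return;
    this.state = next;
    this.listeners.forEach((listener) => listener(next));
  }

  get current(): PlaybackState {
    return this.state;
  }
}
```

A UI component could subscribe once on mount, update its status indicator from the callback, and call the returned function on unmount.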
Doubly Linked List Implementation:
- Bidirectional navigation (prev/next)
- Cursor-based current track tracking
- Dynamic enqueuing of AI-suggested tracks
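A doubly linked list with a cursor, as described above, might look like the sketch below. The class and method names (`TrackQueue`, `enqueue`, `next`, `prev`, `toArray`) are assumptions chosen for illustration and may differ from the project's implementation.

```typescript
interface TrackNode {
  filename: string;
  prev: TrackNode | null;
  next: TrackNode | null;
}

// Hypothetical queue; illustrates the technique, not the project's exact code.
class TrackQueue {
  private head: TrackNode | null = null;
  private tail: TrackNode | null = null;
  private cursor: TrackNode | null = null; // the currently selected track

  /** Append a track; the first track enqueued becomes the current one. */
  enqueue(filename: string): void {
    const node: TrackNode = { filename, prev: this.tail, next: null };
    if (this.tail) this.tail.next = node;
    else this.head = node;
    this.tail = node;
    if (!this.cursor) this.cursor = node;
  }

  current(): string | null {
    return this.cursor ? this.cursor.filename : null;
  }

  /** Skip forward; returns null and leaves the cursor alone at the end. */
  next(): string | null {
    const target = this.cursor ? this.cursor.next : null;
    if (!target) return null;
    this.cursor = target;
    return target.filename;
  }

  /** Skip backward; returns null and leaves the cursor alone at the start. */
  prev(): string | null {
    const target = this.cursor ? this.cursor.prev : null;
    if (!target) return null;
    this.cursor = target;
    return target.filename;
  }

  /** Walk from head to tail and list every queued filename. */
  toArray(): string[] {
    const out: string[] = [];
    for (let node = this.head; node; node = node.next) out.push(node.filename);
    return out;
  }
}
```

Because each node keeps both links, skipping in either direction is O(1), and AI-suggested tracks can be appended at the tail without disturbing the cursor.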
MusicSuggestion: mood, energyLevel, trackFilename
Prompt: promptId, text, weight, color, cc (control code)
PlaybackState: stopped | playing | loading | paused
- Uses pre-recorded MP3 files from music database
- AI selects tracks from 10 genres: rock, pop, rap, indie pop, classical, country, jazz, indie rock, metal, electronic
- Managed through queue system with skip forward/backward
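To make the pre-recorded mode concrete, here is a hedged sketch of picking a track once a genre has been chosen. The `MusicSuggestion` fields come from the data types listed earlier; the `SongDatabase` shape (genre mapped to filenames), the numeric `energyLevel`, and the `suggestTrack` helper are assumptions, since the layout of music_data.json is not documented here.

```typescript
// The ten genres named in the README.
const GENRES = [
  "rock", "pop", "rap", "indie pop", "classical",
  "country", "jazz", "indie rock", "metal", "electronic",
] as const;
type Genre = (typeof GENRES)[number];

// Field names from the README; the field types are assumed.
interface MusicSuggestion {
  mood: string;
  energyLevel: number;
  trackFilename: string;
}

// Hypothetical database shape: genre -> list of MP3 filenames.
type SongDatabase = Partial<Record<Genre, string[]>>;

/**
 * Pick a not-yet-played track of the given genre, or null if none is left.
 * The random source is injectable so the choice can be made deterministic.
 */
function suggestTrack(
  db: SongDatabase,
  genre: Genre,
  mood: string,
  energyLevel: number,
  alreadyPlayed: ReadonlySet<string> = new Set<string>(),
  random: () => number = Math.random,
): MusicSuggestion | null {
  const candidates = (db[genre] ?? []).filter((f) => !alreadyPlayed.has(f));
  if (candidates.length === 0) return null;
  const index = Math.floor(random() * candidates.length);
  return { mood, energyLevel, trackFilename: candidates[index] };
}
```

The returned `trackFilename` would then be enqueued so that skip forward and backward keep working as described.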
- Real-time music generation via Lyria API
- Mood-based prompt sets:
  - Chilling: Chillwave, Bossa Nova, Lush Strings, Neo Soul
  - Focusing: Sparkling Arpeggios, Chillwave, Trip Hop
  - Partying: Drum and Bass, Dubstep, K-Pop, Punchy Kick
  - Happy: Funk, K-Pop, Chiptune, Neo Soul
  - Sad: Shoegaze, Post Punk, Trip Hop, Lush Strings
- Dynamic weight adjustment based on energy level
- Seamless transitions between moods
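One way to picture weight adjustment and mood transitions is a crossfade between two prompt sets. The sketch below is illustrative only: the prompt texts come from the list above, but the linear energy-to-weight rule, the 0..1 energy scale, and the names `weightForEnergy` and `blendMoods` are assumptions, and the reduced `WeightedPrompt` shape omits the promptId, color, and cc fields of the project's Prompt type.

```typescript
type Mood = "chilling" | "focusing" | "partying" | "happy" | "sad";

// Prompt texts per mood, copied from the README.
const MOOD_PROMPTS: Record<Mood, string[]> = {
  chilling: ["Chillwave", "Bossa Nova", "Lush Strings", "Neo Soul"],
  focusing: ["Sparkling Arpeggios", "Chillwave", "Trip Hop"],
  partying: ["Drum and Bass", "Dubstep", "K-Pop", "Punchy Kick"],
  happy: ["Funk", "K-Pop", "Chiptune", "Neo Soul"],
  sad: ["Shoegaze", "Post Punk", "Trip Hop", "Lush Strings"],
};

interface WeightedPrompt {
  text: string;
  weight: number;
}

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));

/** Hypothetical rule: map energy in [0, 1] linearly onto a weight in [0.5, 2]. */
function weightForEnergy(energy: number): number {
  return 0.5 + 1.5 * clamp01(energy);
}

/**
 * Crossfade between two moods: t = 0 yields only `from`, t = 1 only `to`.
 * Prompts shared by both moods accumulate weight from each side.
 */
function blendMoods(from: Mood, to: Mood, t: number, energy: number): WeightedPrompt[] {
  const mix = clamp01(t);
  const base = weightForEnergy(energy);
  const weights = new Map<string, number>();
  const add = (texts: string[], share: number): void => {
    texts.forEach((text) => weights.set(text, (weights.get(text) ?? 0) + base * share));
  };
  add(MOOD_PROMPTS[from], 1 - mix);
  add(MOOD_PROMPTS[to], mix);

  const result: WeightedPrompt[] = [];
  weights.forEach((weight, text) => {
    if (weight > 0) result.push({ text, weight }); // drop fully faded prompts
  });
  return result;
}
```

Stepping `t` from 0 to 1 over a few seconds and sending each intermediate set to the music model would fade one mood out while the next fades in, which is one plausible reading of "seamless transitions".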
DeepJ/
├── components/
│   ├── DJInterface.tsx      # Main DJ interface
│   ├── EndSession.tsx       # End screen
│   ├── VolumeControl.tsx    # Volume slider
│   └── ProgressBar.tsx      # Playback progress
├── services/
│   └── geminiService.ts     # Gemini AI integration
├── lib/
│   ├── LiveMusicHelper.ts   # Lyria API wrapper
│   ├── throttle.ts          # Rate limiting utility
│   └── audio.ts             # Audio processing utilities
├── types.ts                 # TypeScript definitions
├── music/
│   └── music_data.json      # Song database
├── App.tsx                  # Main app component
└── index.html               # Entry point
- Gemini Live API (v1alpha)
  - Model: gemini-2.5-flash-native-audio-preview-09-2025
  - Purpose: Real-time mood detection from video/audio
- Gemini Standard API (v1)
  - Model: gemini-2.5-flash
  - Purpose: Genre selection via function calling
- Lyria Realtime API (v1alpha)
  - Model: lyria-realtime-exp
  - Purpose: AI music generation
- Throttling: Song selection limited to once per 30 seconds
- Audio Buffering: 2-second buffer for smooth playback
- Video Sampling: 1 frame per second for mood analysis
- Reconnection Strategy: Max 5 attempts with 2-second delays
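The reconnection strategy can be sketched as follows. This README mentions both exponential backoff and a 2-second delay with a maximum of 5 attempts; the sketch assumes these combine into a 2-second base delay that doubles on each attempt, which may not match the actual code (it could use a fixed delay). The names `reconnectDelay` and `reconnect` are hypothetical.

```typescript
const MAX_ATTEMPTS = 5;     // "Max 5 attempts"
const BASE_DELAY_MS = 2000; // "2-second delays", used here as the base delay

/** Delay before attempt number `attempt` (1-based): 2s, 4s, 8s, 16s, 32s. */
function reconnectDelay(attempt: number): number {
  return BASE_DELAY_MS * 2 ** (attempt - 1);
}

/**
 * Wait, then try to reconnect, up to MAX_ATTEMPTS times.
 * Resolves to true on the first success and false once attempts run out.
 * `sleep` is injectable so the schedule can be exercised without real waiting.
 */
async function reconnect(
  connect: () => Promise<void>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    await sleep(reconnectDelay(attempt));
    try {
      await connect();
      return true;
    } catch {
      // Connection failed; fall through to the next, longer wait.
    }
  }
  return false;
}
```

With these constants the total wait before giving up is 62 seconds, so a caller would typically surface a "disconnected" status once `reconnect` resolves to false.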
- Real-time camera feed as background
- Mood visualization with energy level indicator
- Active prompt display showing AI music parameters
- Smooth transitions between tracks and modes
- Responsive controls for play/pause, skip, volume
- Status indicators for AI connection and playback state
- Music database requires manual curation
- Live music mode requires stable internet connection
- Camera permissions required for mood detection
- Browser compatibility: Chrome/Edge recommended
- Secure API key management
- User-configurable prompt sets
- Music taste learning over time
- Multi-room support
- Spotify/Apple Music integration
- Guest mood voting system
- Analytics dashboard
MIT License - feel free to use and modify for your own projects!
- Google Gemini AI for multimodal analysis
- Lyria API for music generation
- Tailwind CSS for styling
- Framer Motion for animations
