Claude Yap turns PDFs and research topics into multi-voice podcasts using Claude and the Cartesia text-to-speech API. The app generates natural-sounding conversations in several formats, with a distinct voice for each speaker and automatically generated cover art.
- Intelligent PDF Analysis: Upload PDFs and Claude will analyze, summarize, and transform them into conversational podcasts
- Multiple Podcast Formats:
  - Podcast Style: Traditional host/guest conversational format
  - Debate Style: Two perspectives debating topics from the source material
  - Duck Mode: Educational teacher/student dialogue format
- Multi-Speaker Natural Voices: Uses Cartesia TTS API to generate distinctly different voices for each speaker
- Automated Cover Art: Creates custom podcast covers using Claude's artifact generation or Stable Diffusion
- Advanced Audio Player: Professional-grade player with keyboard shortcuts, playback speed control, and progress tracking
- Video Conference Integration: Discuss podcasts via video call through Tavus API integration
- Real-Time Progress Tracking: Monitor podcast generation progress with detailed status updates
- Node.js (v16+)
- Python 3.9+
- Claude API key (Anthropic)
- Cartesia API key (for text-to-speech)
- Hugging Face API token (for cover art generation)
- Tavus API credentials (for video conference)
- Clone the repository
    git clone https://github.com/yourusername/claude-yap.git
    cd claude-yap
- Set up the backend
    cd backend
    pip install -r requirements.txt
    # Create .env file using example template
    cp .env.example .env
    # Fill in your API keys in the .env file
- Set up the frontend
    cd ../frontend
    npm install
- Start the backend server (from the repository root)
    cd backend
    python app.py
- Start the frontend development server (in a second terminal, from the repository root)
    cd frontend
    npm run dev
- Open your browser and navigate to http://localhost:5173
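
Before starting the backend, it can help to confirm that every API key from the prerequisites is actually set. The following is a minimal sketch, not part of the project: the variable names in `REQUIRED_KEYS` are assumptions for illustration, so check `.env.example` for the real ones. It inspects the process environment (or any mapping passed in) and does not read the `.env` file itself.

```python
import os

# Hypothetical variable names; the real ones are listed in .env.example.
REQUIRED_KEYS = {
    "ANTHROPIC_API_KEY": "Claude API key",
    "CARTESIA_API_KEY": "Cartesia text-to-speech",
    "HUGGINGFACE_API_TOKEN": "cover art generation",
    "TAVUS_API_KEY": "video conference",
}


def missing_keys(env=None):
    """Return the names of required keys that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

Calling `missing_keys()` with no argument checks the current environment; an empty list means every assumed key has a non-blank value.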
- React with React Router
- Framer Motion for animations
- Advanced HTML5 audio player
- Python with Flask
- Claude 3.7 Sonnet and Claude 3 Opus for content generation
- Cartesia API for high-quality multi-speaker text-to-speech
- Claude artifacts tool and Hugging Face Stable Diffusion for cover art
- Tavus API for video conference integration
- FFmpeg for audio processing
- Upload Content: Submit PDFs for analysis or provide a research topic
- Select Format: Choose between podcast, debate, or educational "duck" mode
- Script Generation: Claude analyzes content and creates a conversational script
- Voice Assignment: The system automatically assigns distinct voices to each speaker
- Audio Generation: Cartesia TTS converts the script into natural-sounding multi-voice audio
- Cover Art Creation: Automatically generates custom cover art for each podcast
- Playback & Sharing: Listen to your podcast with advanced playback controls
- AI Video Discussion: Have 1-on-1 video calls with AI personalities to discuss your podcasts
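
The format selection and voice assignment steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the role names are inferred from the format descriptions (host/guest, two debating perspectives, teacher/student), and `FORMAT_ROLES`, `assign_voices`, and the voice IDs are hypothetical placeholders rather than real Cartesia identifiers.

```python
# Speaker roles per format, inferred from the format descriptions (assumed names).
FORMAT_ROLES = {
    "podcast": ["Host", "Guest"],
    "debate": ["Perspective A", "Perspective B"],
    "duck": ["Teacher", "Student"],
}


def assign_voices(speakers, voice_pool):
    """Give each unique speaker its own voice, in order of first appearance."""
    unique = list(dict.fromkeys(speakers))  # de-duplicate, keep order
    if len(unique) > len(voice_pool):
        raise ValueError("not enough voices for the number of speakers")
    return dict(zip(unique, voice_pool))


# Placeholder voice IDs for illustration only.
voices = assign_voices(FORMAT_ROLES["duck"], ["voice-a", "voice-b", "voice-c"])
```

Raising an error when the pool is too small, rather than reusing a voice, keeps the guarantee that no two speakers sound the same.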
- Keyboard shortcuts for playback control (space, arrows, number keys)
- Variable playback speed (0.5x to 2x)
- Jump forward/backward functionality
- Discuss podcasts with AI personalities through the Tavus API video conference integration
- Track listened/unlistened podcasts
- Rename and organize your podcast collection
- Frontend: React SPA with responsive UI and advanced audio player
- Backend: Flask API with Claude integration, audio processing, and data management
- Claude Integration: Uses both Claude 3.7 Sonnet and Claude 3 Opus for different tasks
- Script Processing: Intelligently parses scripts to identify speakers and dialogue
- TTS Engine: Uses Cartesia's advanced TTS API for natural multi-speaker audio
- Cover Art Generation: Uses either Claude artifacts (SVG) or Stable Diffusion (PNG)
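
As an illustration of the script processing component, here is a minimal sketch of splitting a script into speaker turns. It assumes each turn starts with a `Speaker: text` line, which may not match the format Claude Yap actually uses; `parse_script` and the `TURN` pattern are hypothetical names.

```python
import re

# A turn starts with a short speaker label followed by a colon (assumed format).
TURN = re.compile(r"^\s*([A-Za-z][\w .'-]{0,40}?)\s*:\s*(.+)$")


def parse_script(script):
    """Split a script into (speaker, text) turns, joining wrapped lines."""
    turns = []
    for line in script.splitlines():
        match = TURN.match(line)
        if match:
            turns.append((match.group(1), match.group(2).strip()))
        elif line.strip() and turns:
            # A line without a label continues the previous speaker's turn.
            speaker, text = turns[-1]
            turns[-1] = (speaker, f"{text} {line.strip()}")
    return turns
```

Two limitations of this sketch: text that appears before the first speaker label is dropped, and a continuation line that itself contains a colon near its start would be misread as a new speaker. A real parser would likely validate labels against the known speakers for the chosen format.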