Inspiration
In a world where communication shapes careers, relationships, and opportunities, millions struggle to articulate their ideas confidently—whether in interviews, presentations, or everyday conversations. Our own experiences with public speaking anxiety and awkward interviews inspired us to create GeeseTalk: an AI-powered coach that makes practicing communication skills accessible, personalized, and fun. We envisioned a platform that combines the gamified approach of Duolingo with the supportive environment of Toastmasters, democratizing high-quality skill-building for everyone.
What It Does
GeeseTalk is an interactive platform that helps users practice and master essential communication skills:
- Skill Paths: Curated roadmaps like “Nail Your Next Job Interview” or “Master Persuasive Debating,” broken into bite-sized, scenario-based lessons.
- AI Conversation Partners: Mock conversations or speeches with AI personas (e.g., hiring manager, debate opponent, or casual friend).
- Real-Time Feedback: Multimodal analysis of clarity, filler words, tone, eye contact, and posture—complete with actionable tips.
- Progress Dashboard: Users track their improvement over time, seeing metrics like confidence score, pacing, and vocabulary diversity.
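To make the dashboard metrics more concrete, here is a minimal sketch of how pacing, filler-word usage, and vocabulary diversity could be derived from a transcript. It is an illustration under assumptions, not our production code: the function name `speech_metrics`, the `FILLER_WORDS` list, and the use of a type-token ratio for vocabulary diversity are all hypothetical choices made for this example.

```python
import re

# Hypothetical filler list for illustration; a real list would be longer and
# context-aware ("like" is not always a filler).
FILLER_WORDS = {"um", "uh", "like", "basically", "actually"}


def speech_metrics(transcript: str, duration_seconds: float) -> dict:
    """Compute simple pacing and vocabulary metrics from a transcript."""
    # Lowercase and keep only word-like tokens (letters and apostrophes).
    words = re.findall(r"[a-z']+", transcript.lower())
    total = len(words)
    if total == 0 or duration_seconds <= 0:
        return {
            "words_per_minute": 0.0,
            "filler_ratio": 0.0,
            "vocabulary_diversity": 0.0,
        }

    fillers = sum(1 for word in words if word in FILLER_WORDS)
    return {
        # Pacing: words spoken per minute of audio.
        "words_per_minute": total / (duration_seconds / 60),
        # Share of all words that are fillers.
        "filler_ratio": fillers / total,
        # Type-token ratio: unique words divided by total words.
        "vocabulary_diversity": len(set(words)) / total,
    }


metrics = speech_metrics("Um I think um this is like a good plan", 30)
```

A type-token ratio is sensitive to transcript length, so comparing it across sessions works best when the sessions are of similar duration.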
How We Built It
Frontend (Next.js + React)
- Next.js for server-side rendering, static site generation, and API routes.
- Tailwind CSS for utility-first styling.
- React Query for data fetching and caching of server state.
- Turbopack for fast local development builds and hot module replacement.
Backend (Flask + Prisma + Neon)
- Flask in Python to serve API endpoints and handle real-time communication via WebSocket.
- PostgreSQL (through Neon serverless Postgres) for user data and lesson management, using Prisma as an ORM for type-safe queries.
AI/ML Components
- Google Vertex AI (Gemini models) for video/audio analysis—detecting speech clarity, body language, and emotional tone.
- ElevenLabs for realistic AI-generated speech, simulating lifelike conversation partners.
DevOps & Infrastructure
- Vercel for automated deployments, global edge caching, and serverless functions.
- AWS S3 for secure and scalable storage of user-uploaded video and audio files.
Challenges We Ran Into
- Real-Time Multimodal Analysis: Balancing quick feedback with high-fidelity video/audio analysis proved tricky. We had to optimize buffering, encoding, and model inference to minimize latency.
- Personalized Feedback: Getting the AI models to give specific, actionable insights (e.g., “Speak 20% slower”) rather than generic tips required careful prompt engineering.
- Scalable Architecture: Handling video files, real-time feedback, and simultaneous AI inferences is resource-intensive. Finding the right balance between performance and cost was a constant challenge.
- User Experience: Ensuring a friendly interface that doesn’t overwhelm users with too much information at once was a key design hurdle.
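One way to keep feedback specific rather than generic is to compute the number deterministically and only then phrase it for the user. The sketch below shows that idea for pacing. It is an assumption-laden illustration, not a description of our prompts: `pacing_tip`, the 150 words-per-minute target, and the 10% tolerance are hypothetical values chosen for the example.

```python
def pacing_tip(
    measured_wpm: float,
    target_wpm: float = 150.0,  # assumed target rate, not a value from the project
    tolerance: float = 0.10,    # assumed: within 10% of target counts as on pace
) -> str:
    """Turn a measured speaking rate into a specific, quantified tip."""
    if measured_wpm <= 0:
        return "No speech detected; try recording again."

    # Within the tolerance band around the target, no change is suggested.
    if abs(measured_wpm - target_wpm) / target_wpm <= tolerance:
        return "Your pace is on target."

    # Required change expressed relative to the speaker's current rate.
    change = (target_wpm - measured_wpm) / measured_wpm
    percent = round(abs(change) * 100)
    direction = "slower" if change < 0 else "faster"
    return f"Speak about {percent}% {direction}."


tip = pacing_tip(187.5)
```

The resulting string could be shown directly or passed to the language model as a fact to build its coaching message around, which keeps the model from inventing its own numbers.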
Accomplishments That We're Proud Of
- Immersive Roleplay: Users can practice tough interviews or debates in a safe environment, replaying scenarios until they feel confident.
- Multimodal Feedback Loops: Combining speech, body language, and emotional tone analysis into a single, easy-to-digest report.
- Scalable and Fast: Next.js server-side rendering kept page loads quick, and Turbopack reduced local build times, which let us iterate quickly.
- Democratizing Communication Coaching: We’re proud to offer a free tier so more people can access high-quality coaching without prohibitive costs.
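As a rough illustration of how separate speech, body language, and tone scores could be merged into one report, the sketch below uses a weighted average and flags the weakest area. The `ModalityScore` type, the `build_report` function, the 0-to-1 score scale, and the weights are all hypothetical; the actual scoring behind our reports is not described here.

```python
from dataclasses import dataclass


@dataclass
class ModalityScore:
    name: str      # e.g. "speech", "body_language", "tone"
    score: float   # assumed scale: 0.0 (poor) to 1.0 (excellent)
    weight: float  # relative importance in the overall score


def build_report(scores: list[ModalityScore]) -> dict:
    """Combine per-modality scores into a single summary report."""
    total_weight = sum(s.weight for s in scores)
    if total_weight <= 0:
        raise ValueError("At least one modality must have a positive weight.")

    # Weighted average across all modalities.
    overall = sum(s.score * s.weight for s in scores) / total_weight
    # The lowest-scoring modality becomes the suggested focus area.
    weakest = min(scores, key=lambda s: s.score)

    return {
        "overall": round(overall, 2),
        "focus_area": weakest.name,
        "breakdown": {s.name: s.score for s in scores},
    }


report = build_report([
    ModalityScore("speech", 0.8, 0.5),
    ModalityScore("body_language", 0.6, 0.3),
    ModalityScore("tone", 0.9, 0.2),
])
```

Surfacing a single focus area alongside the overall score is one way to keep the report easy to digest while still pointing to the next thing to practice.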
What We Learned
- The Power of Immediate Feedback: When learners receive suggestions the moment they make a mistake, they’re more likely to improve quickly.
- Iterative UX Design: Conducting small user tests allowed us to refine the interface and help users focus on what really matters—practical skill-building.
- Importance of Edge Cases: Not all conversations follow a predictable pattern. We had to train and prompt our AI models to handle unexpected user responses gracefully.
- Optimization vs. Accessibility: Balancing AI complexity with fast load times and user-friendly experiences is an ongoing trade-off.
What’s Next for GeeseTalk
- Expanded Lesson Types: Adding niche scenarios like investor pitches, crisis communication, and podcast hosting.
- Community Challenges: Hosting live debates, collaborative storytelling, and weekly themed practice sessions.
- HR & Corporate Integrations: Offering team upskilling programs and analytics for organizations to track professional growth.
- Personalized AI Coach: Developing adaptive lessons that evolve with each user’s progress, focusing on areas that need the most improvement.
- Mobile AR Features: Exploring augmented reality overlays for posture correction and visual cues during live practice.
GeeseTalk aims to redefine how people approach communication—whether you’re prepping for a TED Talk, trying to impress a hiring manager, or simply want to speak more confidently in everyday life. By blending cutting-edge AI, engaging lesson formats, and user-centric design, GeeseTalk bridges the gap between anxiety and confidence, ensuring everyone has the voice they need to be heard.
Built With
- 11labs
- flask
- gemini
- google-gemini-vertex-ai-api
- javascript
- next.js
- prisma
- python
- react
- tailwindcss