BetterThanLeet - Your AI Technical Interviewer, Powered by Gemini
"LeetCode can't ask follow-up questions. Alexis can."
AI-powered mock interview platform with voice-based FAANG-style interviewers for coding, system design, and behavioral prep - featuring real-time code execution, live architecture diagrams, speech analytics, and progress tracking.
Inspiration
Technical interviews are broken. Candidates grind hundreds of LeetCode problems in isolation, but real interviews require communication - explaining your thought process, handling follow-up questions, whiteboarding system designs under pressure, and telling compelling stories about past experience.
We asked ourselves: What if you could practice with an AI interviewer that actually listens, responds, challenges you, and even draws your architecture diagrams - just like sitting across from a real FAANG interviewer?
LeetCode tests if you can solve algorithms. BetterThanLeet tests if you can communicate your solutions under pressure, handle curveballs, and think on your feet. The gap between "can solve it alone at home" and "can nail it in a live interview" is enormous - and that's the gap we close.
Gemini's Live API with native audio made this possible for the first time. Real-time voice with natural interruption handling, autonomous tool calling (the AI decides when to run your code), and seamless turn-taking - this is the technology the interview prep space was waiting for.
What It Does
BetterThanLeet provides three complete interview experiences through a single platform, all powered by Gemini:
1. Coding Interviews
- Real-time voice conversation with Alexis while you write code in a Monaco Editor
- Autonomous tool calling - Gemini decides when to run your tests, read your code, and give feedback without you asking
- Sandboxed code execution via Daytona SDK (Python, JavaScript, TypeScript)
- 500+ problems, including the full NeetCode 150, plus custom LeetCode problem import
- Integrity tracking (tab switches, paste detection) with a 0-100 integrity score
- AutoFix powered by Gemini 3 Flash - one-click error resolution with automatic dependency installation
- Solution editorials with time/space complexity analysis after each session
- AI Interviewer Personas: Friendly, Tough, Fast-Paced, or Detail-Oriented styles
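The integrity score in the list above could be computed as a simple deduction model. This is an illustrative sketch: the 0-100 range comes from the write-up, but the event types' penalty weights are assumptions.

```typescript
// Hypothetical integrity scoring: start at 100, deduct per event, clamp to 0-100.
interface IntegrityEvents {
  tabSwitches: number;
  pasteEvents: number;
}

// Assumed penalty weights -- the app's actual weights are not published.
const TAB_SWITCH_PENALTY = 5;
const PASTE_PENALTY = 15;

function integrityScore({ tabSwitches, pasteEvents }: IntegrityEvents): number {
  const deductions =
    tabSwitches * TAB_SWITCH_PENALTY + pasteEvents * PASTE_PENALTY;
  return Math.max(0, Math.min(100, 100 - deductions));
}
```

For example, two tab switches and one paste would score 75 under these assumed weights.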
2. System Design Interviews
- FAANG-style evaluative interviewer - Alexis challenges your decisions ("Why Kafka over RabbitMQ?"), doesn't help you build
- Live Mermaid diagram generation - describe components by voice, Alexis draws them on an interactive canvas as your whiteboard
- Interactive diagram toolbar: undo/redo, zoom, export PNG/SVG, copy source, live source editor
- 18 curated topics (URL Shortener, Chat App, Payment System, Video Streaming, Ride-Sharing, etc.)
- Component checklist tracking expected architectural elements
- Phase management: Requirements -> High-Level Design -> Deep Dive -> Scaling -> Closing
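The phase progression above is a linear sequence, so it can be sketched as a tiny state machine. The phase names come from the write-up; the advance logic is an assumption.

```typescript
// Linear phase progression for the system design interview (names from
// the write-up; the transition logic itself is an illustrative sketch).
const PHASES = [
  "Requirements",
  "High-Level Design",
  "Deep Dive",
  "Scaling",
  "Closing",
] as const;

type Phase = (typeof PHASES)[number];

function nextPhase(current: Phase): Phase {
  const i = PHASES.indexOf(current);
  // Stay on the final phase once reached.
  return i < PHASES.length - 1 ? PHASES[i + 1] : current;
}
```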
3. Behavioral Interviews
- STAR-method guided evaluation across 20 topic categories
- Voice-only focused interface for authentic conversational practice
- Categories include Leadership, Conflict Resolution, Customer Focus, Decision-Making, Handling Pressure, and more
- Structured scoring on Situation, Task, Action, and Result dimensions
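The four STAR dimensions could be folded into one behavioral score like this. The equal weighting and the 0-10 per-dimension scale are assumptions for illustration, not the app's actual rubric.

```typescript
// Hypothetical STAR composite: average four 0-10 dimension scores
// (assumed scale) and rescale to 0-100.
interface StarScores {
  situation: number;
  task: number;
  action: number;
  result: number;
}

function starComposite(s: StarScores): number {
  const values = [s.situation, s.task, s.action, s.result];
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  return Math.round(mean * 10); // scale 0-10 average up to 0-100
}
```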
Cross-Cutting Features
- Speech Analytics Dashboard: Filler word detection (um, uh, basically...), words-per-minute tracking, talk ratio analysis, pause detection, vocabulary diversity, and composite confidence score (0-100)
- Progress Dashboard: Interview readiness score, category performance across all modes, XP/streak system, trend analysis, and personalized recommendations for what to practice next
- Company Interview Guides: Curated prep guides for Google, Meta, Amazon, Apple, Microsoft, Netflix, Uber, and Stripe with round-by-round breakdowns, focus areas, tips, and common topics
- Countdown Timer: Configurable interview duration with audio warnings at 5 minutes
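A composite confidence score like the one described above might blend the individual speech metrics into a single 0-100 number. The weights and "ideal" targets below are illustrative assumptions, not the app's actual formula.

```typescript
// Illustrative composite confidence score from speech metrics.
// Weights and ideal targets are assumptions for the sketch.
interface SpeechMetrics {
  fillerPerMinute: number; // filler words per minute
  wordsPerMinute: number;  // speaking pace
  talkRatio: number;       // candidate talk time / total talk time (0..1)
}

function confidenceScore(m: SpeechMetrics): number {
  // Fewer fillers is better: lose 5 points per filler per minute.
  const fillerScore = Math.max(0, 100 - m.fillerPerMinute * 5);
  // Pace: 100 at an assumed ideal of 150 wpm, falling off linearly.
  const paceScore = Math.max(0, 100 - Math.abs(m.wordsPerMinute - 150));
  // Talk ratio: assumed ideal around 0.6 in a candidate-led interview.
  const ratioScore = Math.max(0, 100 - Math.abs(m.talkRatio - 0.6) * 200);
  return Math.round(0.4 * fillerScore + 0.3 * paceScore + 0.3 * ratioScore);
}
```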
How We Built It
Gemini Integration - The Core Engine
BetterThanLeet is built entirely around two Gemini models:
Gemini 2.5 Flash Native Audio (Live API) powers the real-time voice interview experience. We connect via raw WebSocket to wss://generativelanguage.googleapis.com and send/receive PCM audio chunks. The setup includes:
- Voice Activity Detection with natural turn-taking (700ms silence threshold)
- Interruption handling (`START_OF_ACTIVITY_INTERRUPTS`) - candidates can interrupt mid-sentence
- Input/output audio transcription for a live transcript
- Proactive function calling - Gemini autonomously decides when to run code, read files, or end the interview
- Audio processing: 16kHz input with biquad noise filtering (80Hz high-pass, 8kHz low-pass), 24kHz output playback
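The 80 Hz high-pass stage in the chain above is a standard biquad. Here is a generic DSP sketch of one (RBJ Audio EQ Cookbook coefficients, Direct Form I); the app itself more likely wires up the Web Audio API's `BiquadFilterNode` rather than filtering by hand.

```typescript
// Generic RBJ high-pass biquad (Direct Form I) -- one stage of the
// input chain described above, written out for illustration.
function makeHighPass(sampleRate: number, cutoffHz: number, q = Math.SQRT1_2) {
  const w = (2 * Math.PI * cutoffHz) / sampleRate;
  const alpha = Math.sin(w) / (2 * q);
  const cosw = Math.cos(w);
  const a0 = 1 + alpha;
  // Normalized coefficients from the RBJ Audio EQ Cookbook high-pass.
  const b0 = (1 + cosw) / 2 / a0;
  const b1 = -(1 + cosw) / a0;
  const b2 = (1 + cosw) / 2 / a0;
  const a1 = (-2 * cosw) / a0;
  const a2 = (1 - alpha) / a0;

  let x1 = 0, x2 = 0, y1 = 0, y2 = 0;
  return (x: number): number => {
    const y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2;
    x2 = x1; x1 = x;
    y2 = y1; y1 = y;
    return y;
  };
}

// An 80 Hz high-pass at the 16 kHz input rate rejects DC and rumble:
const hp = makeHighPass(16000, 80);
let out = 0;
for (let i = 0; i < 2000; i++) out = hp(1.0); // constant (DC) input
// After the transient settles, |out| is near zero.
```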
Gemini 3 Flash (REST API) powers the analysis layer:
- Comprehensive interview report generation with hiring recommendations
- AutoFix code suggestions with automatic dependency detection
- Code quality, security, and complexity analysis
- Practice mode coaching feedback
Three distinct tool sets enable mode-specific AI behavior:
- Coding tools: `run_code`, `read_candidate_code`, `get_current_problem`, `end_interview`
- System design tools: `get_interview_mode`, `read_transcript`, `end_interview`
- Behavioral tools: `evaluate_response`, `transition_topic`, `end_interview`
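The coding tool set might be declared to the Live API roughly like this. The tool names come from the write-up; the descriptions and (empty) parameter schemas are illustrative, not the app's actual declarations.

```typescript
// Hypothetical shape of the coding-mode tool declarations sent in the
// Live API setup message. Names are from the write-up; descriptions and
// parameter schemas are illustrative placeholders.
const codingTools = [
  {
    functionDeclarations: [
      {
        name: "run_code",
        description: "Execute the candidate's code against the test cases.",
        parameters: { type: "object", properties: {} },
      },
      {
        name: "read_candidate_code",
        description: "Read the current contents of the candidate's editor.",
        parameters: { type: "object", properties: {} },
      },
      {
        name: "get_current_problem",
        description: "Fetch the active problem statement and constraints.",
        parameters: { type: "object", properties: {} },
      },
      {
        name: "end_interview",
        description: "End the session and trigger report generation.",
        parameters: { type: "object", properties: {} },
      },
    ],
  },
];
```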
(Figures omitted from this export: Architecture Diagram, Gemini WebSocket Connection Flow, State Management.)
Tech Stack
| Layer | Technology | Purpose |
|---|---|---|
| AI Voice | Gemini 2.5 Flash Native Audio (Live API) | Real-time voice interviews via WebSocket |
| AI Analysis | Gemini 3 Flash (REST API) | Reports, AutoFix, code quality analysis |
| Frontend | Next.js 16, React 19, TypeScript 5 | App framework with App Router |
| Styling | Tailwind CSS 4, Radix UI | Responsive design with accessible components |
| Code Editor | Monaco Editor | VS Code-quality editing in browser |
| Diagrams | Mermaid 11.12 | Live architecture diagram rendering |
| State | Zustand 5 with persist middleware | Client state with capped localStorage |
| Sandbox | Daytona SDK | Isolated Docker containers for code execution |
| Code Review | CodeRabbit | Automated code quality analysis |
| Monitoring | Sentry | Production error tracking |
| Validation | Zod 4 | Runtime type safety |
| Testing | Vitest, Testing Library | 93+ unit and integration tests |
Challenges We Ran Into
Non-ASCII Character Rejection
The Gemini Native Audio model silently rejects system instructions containing non-ASCII characters (em-dashes, arrows, emojis) with a cryptic WebSocket close code 1007: "Request contains an invalid argument." No error message, no hint. We spent hours debugging before discovering that a single em-dash character (U+2014) in our 14KB system prompt could crash the entire connection. We had to scan every prompt file character-by-character and replace all non-ASCII with ASCII equivalents.
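A defensive scan like the one we ended up doing by hand can be automated. This sketch maps the common offenders to ASCII equivalents and fails loudly on anything unmapped; the replacement table is illustrative, not exhaustive.

```typescript
// Guard against the silent 1007 close: replace common non-ASCII
// characters with ASCII equivalents, and throw on anything unmapped
// so the problem surfaces at build time instead of at connect time.
const ASCII_REPLACEMENTS: Record<string, string> = {
  "\u2014": "-",   // em-dash
  "\u2013": "-",   // en-dash
  "\u2018": "'", "\u2019": "'", // curly single quotes
  "\u201C": '"', "\u201D": '"', // curly double quotes
  "\u2192": "->",  // right arrow
  "\u2026": "...", // ellipsis
};

function asciiSafe(prompt: string): string {
  return [...prompt]
    .map((ch) => {
      if (ch.charCodeAt(0) < 128) return ch;
      const mapped = ASCII_REPLACEMENTS[ch];
      if (mapped === undefined) {
        throw new Error(
          `Unmapped non-ASCII character U+${ch.codePointAt(0)!.toString(16)}`,
        );
      }
      return mapped;
    })
    .join("");
}
```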
Natural Turn-Taking
Building a conversation that feels real required careful tuning of Voice Activity Detection. Too sensitive and the AI stops mid-sentence from keyboard clicks. Too insensitive and interruptions don't register. We landed on 700ms silence duration with low start/end sensitivity, plus biquad noise filters (80Hz high-pass, 8kHz low-pass) to strip out ambient noise before it hits the VAD.
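The settings above correspond to the `realtimeInputConfig` portion of the Live API setup message. Field names below follow the Live API docs; the values mirror the tuning described in this section (treat the exact shape as a sketch, not a drop-in config).

```typescript
// VAD tuning as a Live API realtimeInputConfig fragment (sketch).
const realtimeInputConfig = {
  automaticActivityDetection: {
    startOfSpeechSensitivity: "START_SENSITIVITY_LOW",
    endOfSpeechSensitivity: "END_SENSITIVITY_LOW",
    silenceDurationMs: 700, // the tuned silence threshold
  },
  activityHandling: "START_OF_ACTIVITY_INTERRUPTS", // allow barge-in
};
```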
Real-Time Diagram Generation from Voice
Extracting valid Mermaid syntax from Gemini's streaming text responses was harder than expected. The AI sometimes outputs partial diagrams, invalid syntax, or mixes diagram code with conversational text mid-stream. We built a custom parser that validates Mermaid syntax before rendering, handles partial updates gracefully, and maintains an undo/redo history so candidates can recover from bad renders.
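The salvage step can be sketched as a small extractor: pull only complete fenced `mermaid` blocks out of the streamed text, and keep just the ones that open with a known diagram type. This is a cheap prefilter for illustration; real validation would go through `mermaid.parse()`.

```typescript
// Extract complete ```mermaid blocks from streamed text and keep only
// those starting with a recognized diagram type. Unterminated blocks
// (still streaming) are simply skipped. Illustrative prefilter only.
const DIAGRAM_TYPES = ["graph", "flowchart", "sequenceDiagram", "classDiagram", "erDiagram"];
const FENCE_OPEN = "`".repeat(3) + "mermaid\n";
const FENCE_CLOSE = "`".repeat(3);

function extractMermaid(stream: string): string[] {
  const blocks: string[] = [];
  let from = 0;
  for (;;) {
    const start = stream.indexOf(FENCE_OPEN, from);
    if (start === -1) break;
    const bodyStart = start + FENCE_OPEN.length;
    const end = stream.indexOf(FENCE_CLOSE, bodyStart);
    if (end === -1) break; // block not yet terminated -- wait for more stream
    const body = stream.slice(bodyStart, end).trim();
    const firstWord = body.split(/\s/)[0];
    if (DIAGRAM_TYPES.some((t) => firstWord.startsWith(t))) blocks.push(body);
    from = end + FENCE_CLOSE.length;
  }
  return blocks;
}
```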
localStorage Overflow
Unbounded transcript arrays caused localStorage to silently fail after 30+ minute interview sessions with zero error messages. We implemented caps across all Zustand stores (transcripts: 500 entries, console output: 200, test results: 100) and carefully excluded large arrays from persistence via partialize(). The behavioral store doesn't persist transcripts at all since they can grow very large in voice-heavy sessions.
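The capping strategy boils down to a generic "keep the last N" helper applied inside a `partialize`-style function that decides what subset of state is safe to write to localStorage. The cap values match the write-up; the state shape here is illustrative.

```typescript
// Generic tail cap plus a partialize-style projection of persisted state.
// Cap values (500/200/100) are from the write-up; the field names are
// an illustrative sketch of the store shape.
function capTail<T>(items: T[], max: number): T[] {
  return items.length <= max ? items : items.slice(items.length - max);
}

interface CodingState {
  transcript: string[];
  consoleOutput: string[];
  testResults: { name: string; passed: boolean }[];
  audioBuffers: ArrayBuffer[]; // large binary data: never persisted
}

// What the persist middleware actually writes to localStorage:
function partialize(state: CodingState) {
  return {
    transcript: capTail(state.transcript, 500),
    consoleOutput: capTail(state.consoleOutput, 200),
    testResults: capTail(state.testResults, 100),
    // audioBuffers intentionally excluded
  };
}
```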
Speech Analytics False Positives
Our initial filler word detection flagged common legitimate words ("so," "right," "like") as filler, producing inflated and inaccurate metrics that frustrated users. We narrowed detection to 9 high-confidence patterns and fixed pause detection to measure actual candidate thinking time (gap between AI finishing and candidate starting) rather than AI response latency, which was being incorrectly attributed to the candidate.
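The narrowed detection can be sketched as word-boundary matching over a small pattern list, with pauses measured as the gap between the AI's turn ending and the candidate's turn starting. The pattern list below is an illustrative subset of the nine, not the app's exact set.

```typescript
// Word-boundary filler matching (illustrative subset of the patterns)
// plus pause measurement attributed to the candidate, not the AI.
const FILLER_PATTERNS = [/\bum+\b/gi, /\buh+\b/gi, /\byou know\b/gi, /\bbasically\b/gi];

function countFillers(utterance: string): number {
  return FILLER_PATTERNS.reduce(
    (n, re) => n + (utterance.match(re)?.length ?? 0),
    0,
  );
}

interface Turn {
  speaker: "ai" | "candidate";
  startMs: number;
  endMs: number;
}

// Thinking-time pauses: candidate turn start minus the preceding AI
// turn end -- AI response latency never counts against the candidate.
function thinkingPausesMs(turns: Turn[]): number[] {
  const pauses: number[] = [];
  for (let i = 1; i < turns.length; i++) {
    if (turns[i - 1].speaker === "ai" && turns[i].speaker === "candidate") {
      pauses.push(turns[i].startMs - turns[i - 1].endMs);
    }
  }
  return pauses;
}
```

Note that "so," "right," and "like" are deliberately absent from the pattern list, which is what eliminated the false positives.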
Making the AI an Interviewer, Not a Helper
The hardest design challenge was behavioral. Our system design AI initially acted as a collaborative helper - presenting starting diagrams and guiding candidates through architecture. Real FAANG interviewers present the problem and wait. We rewrote the entire persona to be evaluative: Alexis challenges decisions ("Why Kafka over RabbitMQ?"), probes failure modes ("What happens if this node goes down?"), and only draws diagrams reflecting what the candidate has described - never proposing architecture on its own.
Accomplishments That We're Proud Of
- Three complete interview modes (coding, system design, behavioral) in a single platform powered by Gemini - no competitor offers all three with voice AI
- Real-time Mermaid diagram generation from voice - describe "add a load balancer in front of the API servers" and watch it appear on the canvas. Industry first for interview prep
- Natural interruption handling - candidates interrupt mid-sentence and the AI seamlessly pivots, just like a real conversation
- Autonomous tool calling - Gemini proactively decides when to run tests, read code, and give feedback. The AI drives the interview, not the UI
- Comprehensive speech analytics - filler words, pace, confidence, vocabulary diversity, talk ratio. Metrics that actually help candidates improve their communication skills, not just their algorithms
- FAANG-style evaluative AI - Alexis challenges, probes, and pushes back. It's not a friendly tutor - it's the interviewer you'll face at Google
- 500+ problems with solution editorials, company-specific guides for 8 companies, 18 system design topics, 20 behavioral categories
- Sub-second voice latency - the conversation genuinely feels natural, not like talking to a chatbot with awkward pauses
What We Learned
Gemini's Native Audio changes the paradigm - the voice synthesis quality and natural turn-taking make voice-first AI applications genuinely viable for complex, extended interactions. This isn't a voice assistant answering quick questions - it's a 45-minute technical conversation that feels real.
Tool calling in voice mode is the killer feature - having Gemini autonomously decide when to run code or read files creates moments that feel magical. The AI doesn't just talk about your code - it runs it, sees the output, and gives contextual feedback. Candidates regularly forget they're talking to an AI.
Prompt engineering for voice is a different discipline - prompts that work perfectly for text chat fail in voice mode. We learned to write conversational, short-response prompts (1-2 sentences max), handle interruption recovery, and discovered the hard way that non-ASCII characters silently break everything.
State management requires discipline at scale - three interview modes each with their own Zustand store, transcript, persistence strategy, and array caps. One unbounded array can silently corrupt an entire localStorage partition.
Interview prep is about communication, not algorithms - through competitive analysis, we discovered the real gap isn't "more LeetCode problems." It's that nobody practices the communication layer. Speech analytics and behavioral interviews turned out to be the features users found most valuable.
What's Next for BetterThanLeet
- Multi-Language Expansion - Add Java, C++, Go, and Rust support for coding interviews
- Video Recording and Playback - Record full interview sessions with synchronized audio, code changes, and transcript for later review and sharing
- Collaborative Mock Interviews - Two candidates interview each other with AI moderation, scoring, and comparative analytics
- Company-Specific AI Personas - Personas trained to mimic the interview style of specific companies (Google's "Googleyness" emphasis, Amazon's Leadership Principles framework)
- Mobile App - Voice-first interviews on the go for commute practice
- Enterprise API - Let companies use BetterThanLeet for actual candidate screening with customizable rubrics and ATS integration
- Interview Marketplace - Connect candidates with human interviewers, using AI analytics to evaluate both sides
Testing Instructions
No test login required. The app runs locally; you start an interview directly from the UI.
Quick run (with API keys)
- Prerequisites: Node.js 18+, a Gemini API key, and (optionally, for real code execution) Daytona credentials.
- Setup:
```bash
git clone https://github.com/nihalnihalani/BetterThanLeet.git
cd BetterThanLeet
npm install
```
- Environment: Create `.env.local` in the project root with at least:
  - `GEMINI_API_KEY` (required for voice and analysis)
  - `DAYTONA_API_KEY` and `DAYTONA_API_URL` (for real code execution; see mock option below)
- Run: `npm run dev`, then open http://localhost:3000.
- Test flow: Click Start Interview on the landing page → go to the interview page → click Start Interview in the agent panel → allow microphone access. Choose a problem, talk to Alexis, and write code; use Run Code to execute in the sandbox and End Interview & Report to generate the report.
Testing without API keys (mock mode)
To try the UI without Daytona or external APIs, add to `.env.local`:
- `NEXT_PUBLIC_USE_MOCK_DAYTONA=true`
- `NEXT_PUBLIC_USE_MOCK_CODERABBIT=true` (if used)
Restart the dev server. The interview UI and flow work with mocked sandbox and analysis.
Demo / fallback (Wizard Mode)
If the live AI is unreliable during a demo: enable Wizard Mode in the footer, then press Ctrl+Shift+X to trigger scripted voice lines so the demo can continue.
Gemini Integration Summary
BetterThanLeet is built entirely around Gemini's multimodal capabilities. The core experience is a real-time voice conversation with "Alexis," an AI interviewer powered by Gemini 2.5 Flash Native Audio via the Live API over raw WebSocket connections. Candidates speak naturally and Gemini responds with synthesized speech - handling interruptions, follow-up questions, and dynamic difficulty adjustment in real time.
Gemini Live API drives three distinct interview modes: coding interviews with proactive tool calling (the AI autonomously runs tests and reads code via function calling), system design interviews where Gemini generates live Mermaid architecture diagrams as candidates describe components by voice, and behavioral interviews using STAR-method evaluation.
Gemini 3 Flash powers the analysis layer: generating comprehensive interview reports with hiring recommendations, producing AutoFix suggestions for broken code, and delivering code quality/security/complexity analysis.
Key Gemini features used: Native Audio synthesis, Voice Activity Detection with interruption handling, real-time transcription (input + output), function/tool calling for autonomous code execution, and structured output for reports and analysis. Gemini is not a feature bolted on - it IS the product. Every interaction, from the first greeting to the final report, flows through Gemini.
Built With
- coderabbit
- daytona
- daytona-sdk
- docker
- elevenlabs
- elevenlabs-api
- gemini3
- google-gemini
- google-gemini-api
- monaco
- next.js-16
- node.js
- react
- react-19
- sentry
- tailwind-css
- typescript
- zustand
