IanCou/Dialex

Dialex

A real-time AI debate training platform with live transcription, Gemini-powered coaching, emotional analysis, ELO ranking, and global analytics.


Features

Debate Engine

  • Structured rounds — Opening (3 min) → AI Opening → Rebuttal (2 min) → AI Rebuttal → Closing (1 min) → AI Closing
  • Topic + side selection — Choose from preset topics or enter your own; argue For or Against
  • Difficulty levels — Easy / Medium / Hard / Expert (each maps to an ELO bracket for matchmaking)
  • AI opponent — Gemini 2.5 Flash generates contextual speeches, played via ElevenLabs TTS
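
The round order above can be written down as an ordered schedule. This is only a sketch: the `Round` type, `SCHEDULE`, and `user_speaking_seconds` are illustrative names, not taken from the Dialex codebase, and the AI turns carry no time limit because the README does not state one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Round:
    name: str
    speaker: str                  # "user" or "ai"
    limit_seconds: Optional[int]  # None: no limit is documented for AI turns


# Order and durations as listed under "Structured rounds".
SCHEDULE: List[Round] = [
    Round("Opening", "user", 180),
    Round("AI Opening", "ai", None),
    Round("Rebuttal", "user", 120),
    Round("AI Rebuttal", "ai", None),
    Round("Closing", "user", 60),
    Round("AI Closing", "ai", None),
]


def user_speaking_seconds(schedule: List[Round]) -> int:
    """Total time the user may speak across all of their rounds."""
    return sum(r.limit_seconds or 0 for r in schedule if r.speaker == "user")
```

With these durations the user speaks for at most six minutes per debate.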

Live Transcription (ElevenLabs)

  • Browser captures mic audio via MediaRecorder and streams binary chunks over WebSocket to the backend
  • Backend accumulates the full audio buffer and sends it to ElevenLabs REST STT every ~3s for interim results
  • Final transcription triggered on round end
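
The accumulate-and-resend pattern described above might look roughly like the following. It is a sketch under assumptions: `AudioAccumulator` and its methods are hypothetical names, the 3-second interval comes from the description, and the call to the ElevenLabs STT endpoint is deliberately left out.

```python
import time
from typing import Callable, Optional


class AudioAccumulator:
    """Collects binary audio chunks and signals when an interim pass is due."""

    def __init__(self, interval_s: float = 3.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._buffer = bytearray()
        self._interval_s = interval_s
        self._clock = clock
        self._last_flush = clock()

    def add_chunk(self, chunk: bytes) -> Optional[bytes]:
        """Append a chunk; return the whole buffer if an interim pass is due."""
        self._buffer.extend(chunk)
        now = self._clock()
        if now - self._last_flush >= self._interval_s:
            self._last_flush = now
            # The full recording so far is returned, not just the new bytes.
            return bytes(self._buffer)
        return None

    def finalize(self) -> bytes:
        """Return everything recorded, for the final pass at round end."""
        return bytes(self._buffer)
```

A WebSocket handler would call `add_chunk` for every binary message and forward any non-`None` result to the STT request; `finalize` covers the round-end transcription. Resending the full buffer keeps interim text consistent at the cost of growing request sizes over a round.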

AI Coaching (Gemini 2.5 Flash)

  • After each user speech turn, Gemini analyzes the full transcript and returns 3 coaching tips (good / suggestion / warning)
  • Per-entry feedback annotations shown inline in the live transcript
  • Intervention monitor — polls every 10 seconds for Point of Order (POO) or Point of Information (POI) opportunities; shows an animated alert banner (POO = red, POI = amber)

Voice & Emotion Analysis (Hume AI)

  • Records 4-second audio clips every 5 seconds during a debate
  • Sends to Hume Expression Measurement streaming API (prosody model)
  • Maps Hume's 48 raw emotions to 4 display categories: Confidence, Conviction, Calmness, Engagement
  • Real-time WPM calculation using delta word count between ticks
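
A delta-based WPM calculation can be sketched as follows. `WpmTracker` and `in_target_zone` are illustrative names, and counting words by whitespace splitting is an assumption; the 130–160 WPM target zone is the one shown on the results page.

```python
from typing import Optional


class WpmTracker:
    """Estimates words per minute from the word-count change between ticks."""

    def __init__(self) -> None:
        self._last_words = 0
        self._last_time: Optional[float] = None

    def tick(self, transcript: str, now_s: float) -> Optional[float]:
        """Return WPM since the previous tick, or None if it cannot be computed."""
        words = len(transcript.split())
        if self._last_time is None:
            # First tick only establishes the baseline.
            self._last_words, self._last_time = words, now_s
            return None
        elapsed = now_s - self._last_time
        if elapsed <= 0:
            return None
        # Interim transcripts can shrink when revised; clamp the delta at zero.
        delta = max(words - self._last_words, 0)
        self._last_words, self._last_time = words, now_s
        return delta * 60.0 / elapsed


def in_target_zone(wpm: float) -> bool:
    """True when the pace falls inside the 130-160 WPM target zone."""
    return 130 <= wpm <= 160
```

Using the delta rather than the cumulative count makes the figure reflect current pace instead of the average over the whole speech.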

Knowledge Retrieval (RAG — ChromaDB)

  • Semantic evidence search against a ChromaDB vector store
  • Pre-seeded with 20 debate evidence entries covering AI regulation, UBI, nuclear energy, social media, crypto, space, parliamentary procedure, and more
  • Upload custom evidence files or ingest text snippets in-session
  • Shown in the right panel during debates for quick evidence lookup

Video Feed & Visual Analysis

  • Webcam captured at 5fps, mirrored via canvas, streamed over WebSocket to backend /stream
  • Visual analytics (gesture, eye contact, posture scores) accumulated during debate and shown on the results page

Post-Debate Results Page

  • Dual AI summaries — Gemini 2.5 Flash and K2 Think V2 (MBZUAI IFM) analyze the full transcript independently and show side-by-side assessments
  • K2-specific features — logical fallacy detection, chain-of-thought reasoning breakdown
  • Argument flow — visual map of how arguments developed and were rebutted across the debate
  • Segment-by-segment review — each user speech turn scored across dimensions (Argumentation, Evidence, Delivery, Rebuttal, Structure) with individual strengths and improvements
  • Detailed review summary — radar chart + score rings across all dimensions, plus a final overall score
  • ELO update — displayed as a badge (+/- ELO) with new rating shown immediately after the debate
  • Voice analytics — WPM pacing bar with target zone (130–160 WPM) and Hume emotional profile
  • Full transcript — collapsible, with inline Gemini feedback per entry

ELO & Ranking System

  • Standard ELO formula with K=32
  • Win threshold: score ≥ 60
  • AI opponent ELO by difficulty: Easy=1000, Medium=1400, Hard=1800, Expert=2200
  • Rank tiers: Beginner → Intermediate → Advanced → Expert → Master → Grandmaster
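
The rating update can be illustrated with the standard ELO formula and the constants listed above. This is a sketch, not the project's code: the function names are hypothetical, a score of 60 or more is treated as a win (1.0) and anything else as a loss (0.0), and no rounding is applied because the README does not specify any.

```python
AI_ELO = {"Easy": 1000, "Medium": 1400, "Hard": 1800, "Expert": 2200}
K_FACTOR = 32
WIN_THRESHOLD = 60


def expected_score(player_elo: float, opponent_elo: float) -> float:
    """Standard ELO expected score of the player against the opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_elo - player_elo) / 400.0))


def updated_elo(player_elo: float, difficulty: str, debate_score: float) -> float:
    """New rating after one debate against the AI at the given difficulty."""
    outcome = 1.0 if debate_score >= WIN_THRESHOLD else 0.0
    expected = expected_score(player_elo, AI_ELO[difficulty])
    return player_elo + K_FACTOR * (outcome - expected)
```

For example, a 1000-rated player who beats the Easy opponent (also 1000) has an expected score of 0.5 and gains 16 points; losing the same debate costs 16.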

Leaderboard

  • All users ranked by ELO descending
  • Shows name, ELO, wins, losses, and tier badge

Dashboard

  • Personal stats: ELO, wins, losses, win rate, rank progress bar
  • Recent debate history with scores and ELO changes

Debate History

  • Full history list per user
  • Individual debate detail pages

Global Analytics (Hex)

  • Embedded Hex.tech dashboard at /global-stats
  • Shows win rate vs difficulty, logical fallacy frequency, Hume emotional scores, and ELO progression across all users
  • Data exported via GET /api/history/export

Tech Stack

Layer             Technology
Frontend          Next.js 14, React, TypeScript, Tailwind CSS, shadcn/ui
Backend           FastAPI, Python 3.9
AI Coaching       Google Gemini 2.5 Flash
AI Analysis       MBZUAI K2 Think V2
Transcription     ElevenLabs REST STT
Text-to-Speech    ElevenLabs TTS (Rachel voice, eleven_turbo_v2_5)
Emotion Analysis  Hume AI Expression Measurement (prosody)
Vector Store      ChromaDB (persistent, local)
Database          JSON flat-file (users_db.json)
Analytics         Hex.tech (embedded iframe)

Setup

Prerequisites

  • Python 3.9
  • Node.js 18+
  • ffmpeg (required for Hume audio conversion); for example, with Homebrew:
brew install ffmpeg

Backend

cd backend
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Create backend/.env:

ELEVENLABS_API_KEY=...
GEMINI_API_KEY=...
HUME_API_KEY=...
DATABASE_URL=sqlite:///./dialex.db
CHROMA_PERSIST_DIR=./chroma_db
ALLOWED_ORIGINS=http://localhost:3000

Start the server:

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

API docs: http://localhost:8000/docs

Frontend

cd frontend
npm install

Create frontend/.env.local:

NEXT_PUBLIC_BACKEND_URL=http://localhost:8000
NEXT_PUBLIC_API_URL=http://localhost:8000
NEXT_PUBLIC_HEX_APP_URL=https://app.hex.tech/019d3775-c0a1-700d-8635-53b17531662a/app/032qCqqyoiPDwMpBeipGqh/latest

Start the dev server:

npm run dev

App: http://localhost:3000


API Endpoints

Method  Path                          Description
GET     /                             Welcome
GET     /health                       Health check
WS      /api/transcribe/ws            Audio transcription WebSocket
POST    /api/debate/analyze           Gemini coaching tips
POST    /api/debate/intervention      POO/POI detection
POST    /api/debate/generate-speech   AI opponent speech generation
POST    /api/debate/summarize         Post-debate summary (Gemini or K2)
POST    /api/debate/argument-flow     Argument flow map
POST    /api/tts/speak                ElevenLabs TTS
POST    /api/hume/analyze             Hume prosody analysis
POST    /api/rag/query                Evidence search
POST    /api/rag/ingest               Add evidence
POST    /api/rag/upload               Upload evidence file
POST    /api/rag/seed                 Seed default evidence
GET     /api/rag/stats                Collection stats
POST    /api/history/save             Save debate + update ELO
GET     /api/history/list/{email}     User debate history
GET     /api/history/leaderboard      All users by ELO
GET     /api/history/user/{email}     User stats
GET     /api/history/export           Export all data (for Hex)
POST    /api/review/segment           Score a single speech segment
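
The GET endpoints can be exercised from a small script once the backend is running. The helper names below are hypothetical, the base URL is the local one from the setup section, and the sketch assumes the endpoints return JSON; request bodies for the POST endpoints are not documented here, so they are not shown.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # local backend address from the Setup section


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def user_history_url(email: str) -> str:
    """URL for GET /api/history/list/{email}, with the email percent-encoded."""
    return endpoint_url("/api/history/list/" + quote(email, safe=""))


def get_json(path: str, timeout_s: float = 5.0) -> Any:
    """GET an endpoint and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(endpoint_url(path), timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `get_json("/health")` or `get_json("/api/history/leaderboard")` against a running server is a quick way to confirm the backend is reachable before starting the frontend.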

Environment Notes

  • Gemini quota: the free tier runs out quickly; once it does, requests return 429 until the daily quota reset
  • ElevenLabs: uses the REST API (not WebSocket streaming); the WebSocket endpoint returns 403 on the Creator plan
  • Hume: requires HUME_API_KEY; ffmpeg must be installed for audio conversion
  • RAG: auto-seeds on first debate start; ChromaDB persists to backend/chroma_db/ (gitignored)
  • Python 3.9: avoid X | Y union syntax in type hints; use Optional[X] / List[X] from typing
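
For the Python 3.9 note, a function written with `typing` generics looks like this. The `find_tip` helper and the `"type"` / `"text"` field names are made up for illustration; only the three tip kinds (good, suggestion, warning) come from the coaching feature described earlier.

```python
from typing import Dict, List, Optional


# On Python 3.9, writing the return type as `Dict[str, str] | None` raises a
# TypeError when the function is defined; Optional[...] works on 3.9 and later.
def find_tip(tips: List[Dict[str, str]], kind: str) -> Optional[Dict[str, str]]:
    """Return the first coaching tip of the given kind, or None if absent."""
    for tip in tips:
        if tip.get("type") == kind:
            return tip
    return None
```

The `X | Y` form only became valid in evaluated annotations with Python 3.10, so the `typing` spelling keeps the backend importable on 3.9.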
