MediScribe

Real-time AI medical interpreter. Patient speaks their language, doctor sees English + medical flags. Doctor responds, patient hears their language — simplified, translated, and spoken aloud.

Built at HooHacks 2026.

The Problem

37 million Americans have limited English proficiency. In medical settings, miscommunication leads to misdiagnosis, medication errors, and worse outcomes. Professional interpreters cost $150+/hour and are frequently unavailable.

The Solution

MediScribe sits between doctor and patient during a live visit. It does more than translate — it simplifies medical jargon to a 5th-grade reading level, translates to the patient's language, speaks it aloud, and flags clinical signals for the doctor.

Architecture

DOCTOR → PATIENT (Mode 1: Simplify + Translate)
  Doctor speaks/types English
  → ElevenLabs STT (if audio)
  → Gemini: simplify jargon to 5th-grade level + translate to patient language
  → ElevenLabs TTS: speak translated text to patient
  → Frontend: shows original → simplified → translated + audio

PATIENT → DOCTOR (Mode 2: Translate + Grammar Recovery)
  Patient speaks their language
  → ElevenLabs STT (transcribe in source language)
  → Gemini: translate to English + grammar recovery → professional medical English
  → Frontend: shows original → fixed English + medical flags + urgency

Tech Stack

Layer	Technology
Frontend	React 19, Vite 8, Tailwind CSS 4, React Router 7
Desktop	Electron 35 (overlay, system audio capture, tray)
Backend	Django 4, Django REST Framework, Django Channels, Daphne (ASGI)
AI — Translation	Google Gemini 2.5 Flash (jargon simplification, translation, grammar recovery, session summaries)
AI — Speech	ElevenLabs Scribe v1 (STT), ElevenLabs Multilingual v2 (TTS)
AI — Validation	Snowflake Cortex (RAG-powered medical term validation, ICD-10 lookup)
Real-time	WebSockets via Django Channels, Voice Activity Detection (Silero VAD)
Database	SQLite (dev) / PostgreSQL (prod), Redis for channel layers

Quick Start

You need two terminal windows.

Terminal 1 — Backend (Django)

cd HOOHACKS
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env          # fill in your API keys
python manage.py migrate
python manage.py runserver 8000

Backend runs at http://localhost:8000

Terminal 2 — Frontend (React)

cd HOOHACKS
npm install
npm run dev

Frontend runs at http://localhost:5173 (use the exact URL Vite prints if the port changes).

That's it.

Open http://localhost:5173 in Chrome.

Before hacking: run pip install -r requirements.txt whenever requirements.txt changes (otherwise Django may crash on import). Vite proxies /api and /ws to http://127.0.0.1:8000, so the UI talks to Django on the same origin as the dev server — no manual CORS dance. If port 5173 is busy, kill old vite processes or you’ll get a random port without noticing.

Environment Variables

Copy .env.example to .env and fill in:

Key	Required	Source
`GEMINI_API_KEY`	Yes	Google AI Studio
`ELEVENLABS_API_KEY`	Yes	ElevenLabs Dashboard
`SNOWFLAKE_ACCOUNT`	No	Snowflake console
`SNOWFLAKE_USER`	No	Snowflake console
`SNOWFLAKE_PASSWORD`	No	Snowflake console

Everything works without Snowflake — it gracefully skips RAG validation.

Features

Two-Mode Gemini Pipeline

Mode 1 (Doctor → Patient): Simplifies medical jargon to a 5th-grade reading level, translates to the patient's language, generates follow-up suggestions the patient could ask
Mode 2 (Patient → Doctor): Translates patient speech to English, applies grammar recovery to produce professional medical English, extracts symptoms/urgency/follow-up flags

Voice Activity Detection (VAD)

Silero VAD runs in the browser via ONNX Runtime. Auto-detects when someone starts and stops talking (800ms silence threshold). No button required — just speak.

Electron Desktop Overlay

Floating always-on-top overlay for use during any video call (Zoom, Meet, FaceTime)
Captures system audio via getDisplayMedia loopback (auto-granted in Electron)
Compact mode, adjustable opacity, content-protected from screen sharing
Global shortcut: Cmd+Shift+M
System tray with badge notifications

Live Consultation (Two-Client Mode)

Both doctor and patient open MediScribe on separate devices. One creates a session, shares the code, the other joins. Both mics are captured and tagged with the correct direction.

Session History & AI Summaries

When a session ends, Gemini generates a structured clinical summary: chief complaint, reported symptoms, relevant history, urgency assessment, and recommended follow-up actions.

Ask a Question (RAG)

In the Electron overlay, the patient can type questions like "What does troponin mean?" and get plain-language answers powered by Snowflake Cortex (with mock fallback).

Backend API

Method	Endpoint	Description
`GET`	`/api/health/`	Health check
`POST`	`/api/sessions/`	Create a new session
`GET`	`/api/sessions/?provider_id=xxx`	List sessions for a provider
`GET`	`/api/sessions/<id>/`	Get session detail + full transcript
`POST`	`/api/sessions/<id>/end/`	End session, generate AI summary
`GET`	`/api/sessions/<id>/summary/`	Get the AI-generated clinical summary
`GET`	`/api/sessions/<id>/messages/`	Get all messages for a session
`DELETE`	`/api/sessions/<id>/`	Delete a session

WebSocket

ws://localhost:8000/ws/session/<session_id>/

Client → Server:

{ "type": "provider_text", "text": "...", "patient_language": "es" } — doctor types a message
{ "type": "audio_metadata", "direction": "patient_to_provider", "patient_language": "es" } followed by binary audio frame — patient speaks
{ "type": "audio_metadata", "direction": "provider_to_patient", "patient_language": "es" } followed by binary audio frame — doctor speaks

Server → Client:

{ "type": "provider_message", "original", "simplified", "translated", "audio_base64", "follow_up_suggestions" }
{ "type": "patient_message", "original", "raw_english", "fixed_english", "medical_flags" }
{ "type": "processing", "step": "transcribing|gemini|tts", "message": "..." }

Project Structure

HOOHACKS/
├── core/                        # Django project config
│   ├── settings.py              # DB, channels, CORS, logging
│   ├── asgi.py                  # ASGI + WebSocket routing
│   └── urls.py                  # /admin + /api/
├── interpreter/                 # Backend app — all AI logic
│   ├── consumers.py             # WebSocket consumer (Mode 1 + Mode 2 pipelines)
│   ├── views.py                 # REST endpoints (sessions, health, summary)
│   ├── gemini_client.py         # Gemini translation, simplification, summaries
│   ├── elevenlabs_client.py     # ElevenLabs TTS (multilingual_v2)
│   ├── snowflake_client.py      # Snowflake Cortex RAG validation
│   ├── models.py                # Session + Message models (UUID PKs)
│   ├── serializers.py           # DRF serializers
│   ├── auth.py                  # Auth0 JWT middleware (skip in dev)
│   └── routing.py               # WebSocket URL routing
├── src/                         # React frontend
│   ├── pages/
│   │   ├── LandingPage.jsx      # Marketing landing page
│   │   ├── LoginPage.jsx        # Auth page (demo mode)
│   │   └── dashboard/
│   │       ├── LiveConsultation.jsx  # Two-client mode (create/join)
│   │       ├── RealTimeTalk.jsx      # Single-computer mode (VAD + PTT)
│   │       ├── HistoryPage.jsx       # Past sessions + AI summaries
│   │       ├── PatientsPage.jsx      # Patient profile
│   │       └── SettingsPage.jsx      # Preferences (language, mic, display)
│   ├── services/
│   │   ├── api.js               # REST + WebSocket client (with mock fallback)
│   │   ├── audioStream.js       # Mic → WebSocket streaming (250ms chunks)
│   │   └── callCapture.js       # Dual-stream capture (system audio + mic)
│   ├── hooks/useVAD.js          # Silero Voice Activity Detection hook
│   ├── overlay/
│   │   ├── OverlayApp.jsx       # Electron overlay UI
│   │   └── overlay-entry.jsx    # Overlay mount point
│   ├── components/              # Navbar, Footer, Sidebar, BrandMark, Modal
│   ├── context/                 # DashboardSession + Toast contexts
│   └── layouts/DashboardLayout.jsx
├── electron/
│   ├── main.js                  # Electron main process (tray, overlay, IPC)
│   └── preload.js               # Context bridge API
├── manage.py
├── package.json
├── requirements.txt
├── vite.config.js
├── docker-compose.yml           # Postgres + Redis for production
├── demo-guide.md                # Full demo walkthrough
└── .env.example

Supported Languages

Code	Language
`es`	Spanish
`zh`	Mandarin Chinese
`vi`	Vietnamese
`fr`	French
`pt`	Portuguese
`ar`	Arabic
`hi`	Hindi
`ne`	Nepali

Running the Electron App

npm run electron:dev

This starts Vite and Electron simultaneously. The main window loads the dashboard. Toggle the overlay with Cmd+Shift+M or the system tray icon.

To build a distributable:

npm run electron:build

Outputs to /release — DMG/ZIP for macOS, NSIS for Windows, AppImage for Linux.

Running Tests

# Backend end-to-end test (requires running backend)
source .venv/bin/activate
python test_ws.py

# Quick smoke test
curl -s http://localhost:8000/api/health/ | python3 -m json.tool

See demo-guide.md for the full demo walkthrough and troubleshooting.

Team

Sabal — Architecture, Backend, Electron, AI Pipeline
Hung — Co-Architect, Snowflake RAG Integration
Teammate 3 — Frontend Pages (Landing, Login, Dashboard)
Teammate 4 — Live Consultation UI, Overlay

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
core		core
electron		electron
interpreter		interpreter
public		public
src		src
.env.example		.env.example
.gitignore		.gitignore
MIC_TEST_INSTRUCTIONS.md		MIC_TEST_INSTRUCTIONS.md
README.md		README.md
TEST.md		TEST.md
TEST_GUIDE.md		TEST_GUIDE.md
demo-guide.md		demo-guide.md
eslint.config.js		eslint.config.js
index.html		index.html
manage.py		manage.py
overlay.html		overlay.html
package-lock.json		package-lock.json
package.json		package.json
railway.json		railway.json
render.yaml		render.yaml
requirements.txt		requirements.txt
start.sh		start.sh
test_mic.html		test_mic.html
test_ws.py		test_ws.py
vite.config.js		vite.config.js

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

MediScribe

The Problem

The Solution

Architecture

Tech Stack

Quick Start

Terminal 1 — Backend (Django)

Terminal 2 — Frontend (React)

That's it.

Environment Variables

Features

Two-Mode Gemini Pipeline

Voice Activity Detection (VAD)

Electron Desktop Overlay

Live Consultation (Two-Client Mode)

Session History & AI Summaries

Ask a Question (RAG)

Backend API

WebSocket

Project Structure

Supported Languages

Running the Electron App

Running Tests

Team

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

MediScribe

The Problem

The Solution

Architecture

Tech Stack

Quick Start

Terminal 1 — Backend (Django)

Terminal 2 — Frontend (React)

That's it.

Environment Variables

Features

Two-Mode Gemini Pipeline

Voice Activity Detection (VAD)

Electron Desktop Overlay

Live Consultation (Two-Client Mode)

Session History & AI Summaries

Ask a Question (RAG)

Backend API

WebSocket

Project Structure

Supported Languages

Running the Electron App

Running Tests

Team

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages