Talk2Doc

A containerized Flask application that provides medical voice assistant functionality: text-to-speech, speech-to-text, OAuth authentication, and a medical AI assistant.

Architecture

The application can run in two modes:

Monolithic: Single service handling all endpoints
Microservices: Separate services for TTS, STT, OAuth, Medical Assistant, and API Gateway

Quick Start

Prerequisites

Docker and Docker Compose
ElevenLabs API key

Setup

Set your ElevenLabs API key (choose one method):

# Option 1: Create .env file (recommended)
echo "ELEVENLABS_API_KEY=your_api_key_here" > .env

# Option 2: Export as environment variable
export ELEVENLABS_API_KEY=your_api_key_here

# Option 3: Pass inline when running
ELEVENLABS_API_KEY=your_api_key_here docker compose up -d

Note: Docker Compose automatically reads a .env file in the project root if one exists. The file is optional; exported or inline environment variables work as well.

Run the service:

# Using Make (recommended)
make run

# Or using docker compose directly
docker compose up -d

The API will be available at http://localhost:8000

API Endpoints

Text-to-Speech

POST /api/tts
Request body:

{
  "text": "Your text here",
  "voice_id": "TX3LPaxmHKxFdv7VOQHJ",  // Optional
  "model_id": "eleven_v3"  // Optional
}
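
As a rough illustration, the request above could be built from Python using only the standard library. The helper names build_tts_request and send_tts, and the default base URL, are assumptions made for this sketch. It also assumes the endpoint replies with raw audio bytes, which the README does not state explicitly.

```python
import json
import urllib.request


def build_tts_request(text, voice_id=None, model_id=None,
                      base_url="http://localhost:8000"):
    """Build (but do not send) a POST /api/tts request."""
    payload = {"text": text}
    # Optional fields are included only when set, so that the
    # server-side defaults apply otherwise.
    if voice_id is not None:
        payload["voice_id"] = voice_id
    if model_id is not None:
        payload["model_id"] = model_id
    return urllib.request.Request(
        f"{base_url}/api/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_tts(request):
    """Send the request and return the response body (needs a running service)."""
    with urllib.request.urlopen(request) as resp:
        return resp.read()
```

Calling send_tts(build_tts_request("Your text here")) would then return the response body for saving or playback.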

Speech-to-Text

POST /api/stt
Send the audio file as multipart/form-data under the key audio.
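
For clients without a multipart helper, the upload body can be assembled by hand. The sketch below shows one way to do it with the Python standard library. The function name encode_multipart_audio, the default filename, and the audio/wav content type are assumptions; the README does not say which audio formats the service accepts.

```python
import uuid


def encode_multipart_audio(audio_bytes, filename="recording.wav",
                           content_type="audio/wav", boundary=None):
    """Return (body, content_type_header) for a single-file multipart upload."""
    boundary = boundary or uuid.uuid4().hex
    # The form field name must be "audio", as the endpoint expects.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    # The closing delimiter carries two trailing hyphens.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + audio_bytes + tail, f"multipart/form-data; boundary={boundary}"
```

The returned header value goes into the Content-Type header of the POST /api/stt request, and the body is sent as the request data.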

Health Check

GET /health

OAuth Authentication

GET /api/auth/providers - Get available OAuth providers
GET /api/auth/login/{provider} - Initiate OAuth login (google, github)
GET /api/auth/callback/{provider} - OAuth callback endpoint
GET /api/auth/user - Get current user (requires Bearer token)
POST /api/auth/verify - Verify JWT token
POST /api/auth/logout - Logout endpoint

See OAUTH_SETUP.md for detailed OAuth setup and usage.
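
The /api/auth/user endpoint requires a Bearer token. A minimal sketch of such a request in Python follows. The helper name build_user_request is hypothetical, and the token is assumed to be the JWT obtained at the end of the OAuth login flow; see OAUTH_SETUP.md for how the token is actually delivered.

```python
import urllib.request


def build_user_request(token, base_url="http://localhost:8000"):
    """Build (but do not send) a GET /api/auth/user request."""
    return urllib.request.Request(
        f"{base_url}/api/auth/user",
        # Standard Bearer scheme: "Authorization: Bearer <token>"
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```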

Medical Assistant

POST /api/medical/chat - Chat with medical AI assistant

Request body:

{
  "text": "I have a headache",
  "history": "",  // Optional: previous conversation history
  "max_tokens": 512,  // Optional: default 220
  "temperature": 0.2  // Optional: default 0.1
}

Response:

{
  "response": "Medical guidance here...",
  "history": "User: I have a headache\nAssistant: Medical guidance here...\n",
  "success": true
}

POST /api/medical/reset - Reset conversation history
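
The history field in the response can be passed back in the next request to continue a conversation. The sketch below illustrates that pattern from the client side. The MedicalChat class and post_json helper are hypothetical names, and treating a falsy success field as an error is an assumption, since the README does not describe the error response.

```python
import json
import urllib.request


def post_json(url, payload):
    """Send a JSON POST and decode the JSON reply (needs a running service)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


class MedicalChat:
    """Keeps the history string between calls to /api/medical/chat."""

    def __init__(self, base_url="http://localhost:8000", transport=post_json):
        self.url = f"{base_url}/api/medical/chat"
        self.transport = transport  # injectable, so the class can be tested offline
        self.history = ""

    def ask(self, text, **options):
        # options may carry max_tokens or temperature
        payload = {"text": text, "history": self.history, **options}
        reply = self.transport(self.url, payload)
        if not reply.get("success"):
            raise RuntimeError("medical chat request was not successful")
        self.history = reply["history"]
        return reply["response"]
```

Each call to ask sends the history returned by the previous call, so the caller only needs to supply the new user text.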

Microservices Mode

To run as separate microservices:

# Start microservices
make microservices-up

# Rebuild and restart with code changes
make microservices-rebuild

# Or step by step:
make microservices-build    # Build images
make microservices-down     # Stop services
make microservices-up       # Start services

This will start:

TTS service on port 8001
STT service on port 8002
OAuth service on port 8003
Medical Assistant service on port 8004
API Gateway on port 8000
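
A client that talks to the individual services directly may find it convenient to keep the port assignments above in one place. The following is a small sketch; the names SERVICE_PORTS and service_url are hypothetical and not part of the project.

```python
# Port assignments as listed above for microservices mode.
SERVICE_PORTS = {
    "tts": 8001,
    "stt": 8002,
    "oauth": 8003,
    "medical": 8004,
    "gateway": 8000,
}


def service_url(service, path, host="localhost"):
    """Return the full URL for a path on the named service."""
    return f"http://{host}:{SERVICE_PORTS[service]}{path}"
```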

Rebuilding Microservices with Code Changes

When you make code changes, rebuild and restart:

# Rebuild images and restart all services
make microservices-rebuild

# Or manually:
make microservices-down
make microservices-build
make microservices-up

Development

Development Mode (Hot Reload)

For development with automatic code reloading when you make changes:

# Start in development mode (with hot reload)
make dev

# Or run in background
make dev-up

# View logs
make dev-logs

# Stop development containers
make dev-down

Note: In development mode, your code changes are automatically reflected without rebuilding the container. Flask's debug mode is enabled, so the server will restart when you modify Python files.

Local Development (Without Docker)

pip install -r requirements.txt
python run.py

Production Docker Commands

# Build image
make build

# Start services
make up

# View logs
make logs

# Stop services
make down

# Clean up
make clean

Project Structure

.
├── app/
│   ├── __init__.py          # Flask app factory
│   ├── config.py            # Configuration
│   ├── routes/              # API routes
│   │   ├── tts.py
│   │   ├── stt.py
│   │   ├── oauth.py
│   │   ├── medical.py
│   │   └── health.py
│   └── services/            # Business logic
│       ├── elevenlabs_service.py
│       └── oauth_service.py
├── Dockerfile
├── docker-compose.yml       # Monolithic setup
├── docker-compose.microservices.yml  # Microservices setup
├── requirements.txt
└── run.py                   # Entry point

Environment Variables

ElevenLabs Configuration

ELEVENLABS_API_KEY: Your ElevenLabs API key (required)
DEFAULT_VOICE_ID: Default voice ID (optional)
DEFAULT_MODEL_ID: Default model ID (optional)

OAuth Configuration

GOOGLE_CLIENT_ID: Google OAuth client ID (optional)
GOOGLE_CLIENT_SECRET: Google OAuth client secret (optional)
GITHUB_CLIENT_ID: GitHub OAuth client ID (optional)
GITHUB_CLIENT_SECRET: GitHub OAuth client secret (optional)
JWT_SECRET_KEY: Secret key for JWT token signing (required for OAuth)
SECRET_KEY: Secret key for Flask sessions (required for OAuth)
OAUTH_REDIRECT_URI: OAuth redirect URI (optional)

Medical Assistant Configuration

HF_TOKEN: Hugging Face API token (required for medical service)
MEDICAL_BASE_URL: Hugging Face endpoint base URL (optional, defaults to provided endpoint)
MEDICAL_MODEL: Medical model name (optional, defaults to "medQA.Q8_0.gguf")

Server Configuration

HOST: Server host (default: 0.0.0.0)
PORT: Server port (default: 8000)
FLASK_ENV: Flask environment (default: production)
FLASK_DEBUG: Enable debug mode (default: False)
SERVICE_TYPE: Service type for microservices mode (tts, stt, oauth, medical, or empty for monolithic)
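
To illustrate how the server variables and their documented defaults fit together, here is a minimal sketch of reading them in Python. It is not the project's app/config.py. The function names are hypothetical, and the set of strings accepted as true for FLASK_DEBUG is an assumption.

```python
import os


def load_server_config(env=None):
    """Read the server settings, applying the defaults documented above."""
    env = os.environ if env is None else env
    debug_raw = env.get("FLASK_DEBUG", "False")
    return {
        "host": env.get("HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "8000")),
        "flask_env": env.get("FLASK_ENV", "production"),
        # Assumption: these spellings count as "enabled".
        "debug": debug_raw.strip().lower() in ("1", "true", "yes"),
        # An empty or missing SERVICE_TYPE means monolithic mode.
        "service_type": env.get("SERVICE_TYPE", "") or None,
    }


def missing_required(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    required = ["ELEVENLABS_API_KEY"]
    return [name for name in required if not env.get(name)]
```

Passing a plain dictionary instead of os.environ makes the functions easy to check without touching the real environment.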

Complete Voice Assistant Flow

The microservices work together to provide a complete voice-based medical assistant:

User speaks → STT service converts audio to text
Text input → Medical Assistant service generates AI response
Response text → TTS service converts to speech
Audio output → User hears the medical guidance

Example integration flow:

// 1. Convert speech to text
// audioFile is assumed to be a File or Blob holding the recording
const formData = new FormData();
formData.append('audio', audioFile);
const sttResponse = await fetch('http://localhost:8002/api/stt', {
  method: 'POST',
  body: formData
});
const { text } = await sttResponse.json();

// 2. Get medical response
// previousHistory is the history string from the last response ('' on the first turn)
const medicalResponse = await fetch('http://localhost:8004/api/medical/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ text, history: previousHistory })
});
const { response, history } = await medicalResponse.json();

// 3. Convert response to speech
const ttsResponse = await fetch('http://localhost:8001/api/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ text: response })
});
const audioBlob = await ttsResponse.blob();
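
The same three-step flow can be expressed as a single function with the service calls passed in, which keeps the orchestration separate from the HTTP details. This is a sketch only; voice_turn and its parameters are hypothetical names, and the three callables are assumed to wrap POST /api/stt, POST /api/medical/chat, and POST /api/tts respectively.

```python
def voice_turn(audio_bytes, history, stt, chat, tts):
    """Run one spoken turn: audio in, audio out, plus the updated history.

    stt:  callable(audio_bytes) -> text
    chat: callable(text, history) -> (response_text, new_history)
    tts:  callable(text) -> audio_bytes
    """
    text = stt(audio_bytes)                   # 1. speech to text
    reply, new_history = chat(text, history)  # 2. text to medical response
    return tts(reply), new_history            # 3. response to speech
```

Feeding the returned history into the next call carries the conversation across turns.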
