OLD YOUTUBE LINK DOES NOT WORK. Please use this one! https://youtu.be/-tGvg3GNYaU
Inspiration
Talk2Doc was inspired by a simple but powerful idea: what if patients could speak naturally about their problems and symptoms and immediately receive care, guidance, and a referral to a medical facility? Many people delay medical attention because booking appointments is slow, confusing, or intimidating.
We simply wanted to create a system that eliminates the barriers between patients, their candid thoughts, and access to medical guidance.
What it does
Talk2Doc lets patients speak naturally about their symptoms and get instant guidance. It listens, understands, and talks back, and if needed, it can even schedule a doctor’s appointment automatically.
How we built it
Model: turquise/MedQA_q8
Frameworks: Python, Flask
Tools and Technology: ElevenLabs, Flutter, HuggingFace, Google Cloud, Google OAuth, GitHub OAuth, Model Context Protocol, Docker, NexHealth API
A user speaks into the app. Their voice is transcribed in real time using ElevenLabs Speech-to-Text, then routed to a fine-tuned medical LLM, turquise/MedQA_q8.
The model's response is converted back into natural speech using ElevenLabs Text-to-Speech. The entire experience is streamed back through a Flutter mobile application, with the backend running in Docker on Google Cloud Platform.
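The pipeline above can be sketched as a single function. This is a simplified illustration, not our production code: the function and parameter names are hypothetical, and each stage is an injected callable standing in for the real ElevenLabs STT/TTS calls and the turquise/MedQA_q8 endpoint.

```python
# Hedged sketch of the Talk2Doc voice pipeline. The stage callables
# stand in for ElevenLabs Speech-to-Text, the turquise/MedQA_q8
# endpoint, and ElevenLabs Text-to-Speech; names are illustrative.

def run_pipeline(audio_bytes, stt, llm, tts):
    """Route patient audio through STT -> medical LLM -> TTS.

    stt, llm, tts are injected callables so each stage can be
    swapped for the real service client or stubbed in tests.
    """
    transcript = stt(audio_bytes)            # speech -> text
    answer = llm(transcript)                 # text -> medical guidance
    audio_reply = tts(answer)                # guidance -> spoken audio
    return transcript, answer, audio_reply
```

Injecting the stages keeps the orchestration testable without network access; the Flask service would wire in the real clients at startup.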
What makes Talk2Doc unique is that it doesn't stop at conversation. Using Model Context Protocol (MCP) integrated with the NexHealth Synchronizer API, the system can automatically schedule a doctor's appointment when required.
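The scheduling step can be sketched as follows. This is a hypothetical simplification of the MCP tool-call flow: the `[SCHEDULE]` marker, the field names, and the `/appointments` path are assumptions for illustration, not the actual NexHealth Synchronizer API schema.

```python
# Hypothetical sketch: when the model's reply signals that a visit is
# needed, an MCP tool builds a NexHealth-style booking request.
# The marker, route, and field names below are assumptions.

def build_appointment_request(patient_id, provider_id, start_time):
    # Assumed NexHealth-style appointment payload.
    return {
        "method": "POST",
        "path": "/appointments",
        "body": {
            "patient_id": patient_id,
            "provider_id": provider_id,
            "start_time": start_time,
        },
    }

def maybe_schedule(llm_reply, patient_id, provider_id, start_time):
    # Simplification: the backend watches the reply for a scheduling
    # marker instead of parsing a structured MCP tool call.
    if "[SCHEDULE]" in llm_reply:
        return build_appointment_request(patient_id, provider_id, start_time)
    return None
```

In the real system the decision comes from the model's MCP tool invocation rather than a text marker, but the shape of the action is the same: model output drives a concrete API call.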
Challenges we ran into
The primary challenge was hosting the ML model. Because turquise/MedQA_q8 is fine-tuned from the 8B-parameter Meta Llama model, deploying it and running inference across various platforms proved difficult. We solved this by using Hugging Face Inference Endpoints.
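Querying the deployed model from the Flask backend looks roughly like this. The endpoint URL is a placeholder and the helper names are ours; the request body follows the standard Hugging Face text-generation format.

```python
# Minimal sketch of calling the model on a Hugging Face Inference
# Endpoint. ENDPOINT_URL is a placeholder, not our real endpoint.
import json
import urllib.request

ENDPOINT_URL = "https://YOUR-ENDPOINT.endpoints.huggingface.cloud"

def build_payload(prompt, max_new_tokens=256):
    # Standard Hugging Face text-generation request body.
    return {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}

def query(prompt, token):
    # POST the prompt with a bearer token; returns the decoded JSON reply.
    req = urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(build_payload(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())
```

Offloading hosting to a managed endpoint meant the Flask services only needed an HTTP client and a token, instead of GPU provisioning on every platform we targeted.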
Accomplishments that we're proud of
- Built a fully voice-native healthcare assistant with real-time STT, LLM inference, and TTS streaming.
- Successfully fine-tuned and deployed a medical-grade 8B LLM (MedQA_q8) on a Hugging Face Inference Endpoint.
- Designed a scalable Flask microservices architecture orchestrated with Docker and Google Cloud.
- Integrated MCP with the NexHealth Synchronizer API to automatically schedule doctor appointments from model decisions.
- Achieved an end-to-end pipeline from patient speech to clinical action in a single system.
What we learned
Building Talk2Doc taught us how to design real-time AI systems that combine streaming audio, LLM inference, microservices orchestration, and external healthcare APIs. We learned how to manage latency across multiple services, handle secure authentication flows, and design agentic workflows where models trigger real-world actions.
We also gained hands-on experience integrating voice AI, medical LLMs, and clinical scheduling systems into a single cohesive pipeline.
What's next for Talk2Doc
- Medical professionals will be able to access the recorded conversation between the patient and the model, with the patient's consent.
- Automatically alerting emergency services when a severe health event is detected.
- Creation of Electronic Health Records (EHR).
Built With
- api
- docker
- elevenlabs
- flask
- flutter
- github-oauth
- google-cloud
- google-oauth
- huggingface
- llama
- model-context-protocol
- nexhealth
- python