Inspiration

Reliable medical information is often hard to reach, especially for people who struggle to navigate complex interfaces or to type long descriptions of their symptoms. We wanted to build something that felt natural, fast, and supportive: a tool that lets anyone simply speak and quickly receive safe, structured, and helpful medical guidance. With Grok’s fast reasoning and multimodal language capabilities, we realized we could create a voice-first assistant that feels like talking to a medical professional.

What it does

Groktor is a medical voice assistant that listens to users’ symptoms or questions, processes them using Grok’s real-time understanding, and provides:

Clear summaries of what the user described

Possible non-diagnostic explanations or concerns

Steps they can take next (self-care, when to seek professional help, etc.)

Follow-up questions to narrow down the situation

Voice playback of responses for accessibility

Users can speak normally, interrupt at any time, and receive structured medical guidance in seconds.
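The outputs listed above map naturally onto a small typed object. The following is a minimal sketch of what such a structured reply might look like, not Groktor's actual schema: the `GuidanceReply` interface, its field names, and the `toSpokenText` helper are all hypothetical and introduced only for illustration.

```typescript
// Hypothetical shape for one assistant reply; field names are assumptions.
interface GuidanceReply {
  summary: string;             // what the user described
  possibleConcerns: string[];  // non-diagnostic explanations or concerns
  nextSteps: string[];         // self-care, when to seek professional help
  followUpQuestions: string[]; // questions that narrow down the situation
}

// Flatten a reply into plain text that a text-to-speech engine could read.
function toSpokenText(reply: GuidanceReply): string {
  const parts: string[] = [reply.summary];
  if (reply.possibleConcerns.length > 0) {
    parts.push("This could be related to: " + reply.possibleConcerns.join("; ") + ".");
  }
  if (reply.nextSteps.length > 0) {
    parts.push("Next steps: " + reply.nextSteps.join("; ") + ".");
  }
  // Ask only the first follow-up question so the user is not overwhelmed.
  const firstQuestion = reply.followUpQuestions[0];
  if (firstQuestion !== undefined) {
    parts.push(firstQuestion);
  }
  return parts.join(" ");
}

// Made-up example data, used only to exercise the helper.
const exampleReply: GuidanceReply = {
  summary: "You mentioned a headache that started this morning.",
  possibleConcerns: ["dehydration", "tension"],
  nextSteps: ["drink water", "rest in a quiet room"],
  followUpQuestions: ["Have you had any changes in vision?", "Any fever?"],
};

console.log(toSpokenText(exampleReply));
```

Keeping the reply structured until the last moment means the same object could feed both the spoken response and an on-screen summary.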

How we built it

Frontend: A lightweight web app that captures microphone audio, streams it to the backend, and plays back the assistant’s spoken response.

Backend: A Node/Express service that handles voice input, converts it to text (using Grok’s speech-to-text), sends text to Grok for reasoning, and returns structured medical responses.

LLM: Grok models for real-time medical Q&A, summarization, and safe-response generation with guardrails.

Text-to-Speech: Grok’s TTS pipeline to convert replies back into natural, friendly audio.

Safety Layer: Custom prompt engineering plus a post-processing check to ensure responses remain non-diagnostic and appropriate.
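The components above form a single turn: audio in, transcript, reasoning, safety check, audio out. The sketch below shows one way that flow could be wired together. It makes no calls to a real Grok SDK: the `Stages` functions are hypothetical stand-ins that a backend would supply, and the phrase list in `DIAGNOSTIC_PATTERNS` is a deliberately tiny illustration, not the actual safety check.

```typescript
// Hypothetical stage signatures; real speech-to-text, LLM, and TTS clients
// would be injected here. None of these names come from a real SDK.
type Stages = {
  transcribe: (audio: Uint8Array) => Promise<string>;
  reason: (transcript: string) => Promise<string>;
  synthesize: (text: string) => Promise<Uint8Array>;
};

// Illustrative patterns only; a production check would need far more care.
const DIAGNOSTIC_PATTERNS: RegExp[] = [
  /\byou (definitely|certainly|clearly) have\b/i,
  /\byour diagnosis\b/i,
  /\bI (can )?diagnose\b/i,
];

const FALLBACK_REPLY =
  "I cannot tell you what this is, but I can describe general possibilities " +
  "and when it makes sense to see a professional.";

// Post-processing check: true when no diagnostic phrasing is detected.
function isNonDiagnostic(text: string): boolean {
  return !DIAGNOSTIC_PATTERNS.some((pattern) => pattern.test(text));
}

// One conversational turn: audio -> text -> reasoning -> safety -> audio.
async function handleTurn(
  audio: Uint8Array,
  stages: Stages,
): Promise<{ text: string; audio: Uint8Array }> {
  const transcript = await stages.transcribe(audio);
  const draft = await stages.reason(transcript);
  // Replace the draft entirely if it reads like a diagnosis.
  const text = isNonDiagnostic(draft) ? draft : FALLBACK_REPLY;
  return { text, audio: await stages.synthesize(text) };
}

// Usage with stub stages, so the flow can be tried without any network calls.
const stubStages: Stages = {
  transcribe: async () => "I have had a sore throat for two days",
  reason: async () => "You definitely have strep throat.",
  synthesize: async (text) => new Uint8Array(text.length),
};

handleTurn(new Uint8Array(0), stubStages).then((result) => {
  console.log(result.text); // the stub draft is diagnostic, so the fallback is used
});
```

Injecting the stages as plain functions keeps the safety check testable on its own, independent of whichever speech or model provider sits behind it.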

Challenges we ran into

Latency: Streaming audio while keeping responses quick required tuning buffering and switching to partial-response streaming.

Medical safety: Ensuring the assistant gives helpful but non-diagnostic guidance took multiple prompt iterations and post-processing.

Voice quality: Getting the TTS to sound natural across different symptoms (especially medical terminology) took additional tuning.

Interruptibility: Making Groktor stop and shift mid-sentence when the user starts talking again was surprisingly tricky.
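Two of the challenges above, partial-response streaming and interruptibility, can be illustrated together. The sketch below assumes the model's reply arrives as a stream of text fragments and that a voice-activity detector aborts an `AbortController` when the user starts talking. The `takeSentences` and `speakStream` helpers and the `speak` callback are hypothetical names for illustration, not Groktor's implementation.

```typescript
// Split a growing buffer into complete sentences plus an unfinished tail.
// A sentence is cut only when its punctuation is followed by whitespace,
// so a decimal such as "3.5" is never split.
function takeSentences(buffer: string): { sentences: string[]; rest: string } {
  const sentences: string[] = [];
  let start = 0;
  for (let i = 0; i < buffer.length - 1; i++) {
    const ch = buffer.charAt(i);
    const isEndMark = ch === "." || ch === "!" || ch === "?";
    if (isEndMark && /\s/.test(buffer.charAt(i + 1))) {
      sentences.push(buffer.slice(start, i + 1).trim());
      start = i + 1;
    }
  }
  return { sentences, rest: buffer.slice(start) };
}

// Speak each sentence as soon as it is complete; stop when the signal aborts.
// Returns the sentences that were fully spoken before any interruption.
async function speakStream(
  fragments: AsyncIterable<string>,
  speak: (sentence: string, signal: AbortSignal) => Promise<void>,
  signal: AbortSignal,
): Promise<string[]> {
  const spoken: string[] = [];
  let buffer = "";
  for await (const fragment of fragments) {
    if (signal.aborted) return spoken;
    buffer += fragment;
    const { sentences, rest } = takeSentences(buffer);
    buffer = rest;
    for (const sentence of sentences) {
      await speak(sentence, signal);
      if (signal.aborted) return spoken; // user barged in during playback
      spoken.push(sentence);
    }
  }
  // Flush whatever is left once the stream has ended.
  const tail = buffer.trim();
  if (tail.length > 0 && !signal.aborted) {
    await speak(tail, signal);
    if (!signal.aborted) spoken.push(tail);
  }
  return spoken;
}
```

In this arrangement the frontend would call `controller.abort()` when it detects new speech, and the `speak` callback would be expected to stop audio playback when its signal fires. Starting playback at the first complete sentence, rather than waiting for the full reply, is what keeps perceived latency low.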

Accomplishments that we're proud of

Built a fully voice-controlled medical assistant in a short time.

Achieved consistent sub-second response streaming using Grok’s reasoning APIs.

Designed a safety-aware medical prompting system that returns structured, actionable advice without pretending to be a doctor.

Made the assistant feel human: the voice, pacing, and follow-up questions feel genuinely conversational.

What we learned

How to work with streaming LLM APIs for real-time interactions.

Best practices for medical AI safety and guardrails.

The importance of user experience in voice systems: small changes in timing dramatically change how “natural” the conversation feels.

How to structure medical conversations so users don’t feel overwhelmed or judged.

What’s next for Groktor

Symptom timeline tracking to help users monitor changes over days or weeks.

Multilingual voice support for global accessibility.

Wearable integration (heart rate, steps, sleep) to give more personalized guidance.

Emergency escalation behaviors that detect severe symptom patterns.

On-device mode for low-connectivity areas.

Clinician dashboard so users can safely share summaries with their doctor.
