Inspiration
Healthcare conversations often happen at the most vulnerable moments in a patient's life. Post-surgery recovery calls, medication follow-ups, and mental-health check-ins are charged with anxiety. In those moments, critical signals get lost: a patient might casually mention chest discomfort that is actually ischemic pain, a caregiver might describe confusion that could indicate neurological decline, or a provider may overlook a potential adverse event that legally must be reported within 24 hours. We asked ourselves a simple question: "What if AI could act as a real-time clinical co-listener, surfacing high-risk signals while the conversation is happening?" MedCall was born from the idea that AI should not replace clinicians but augment their vigilance when cognitive load is highest.
What it does
MedCall is a real-time, AI-powered patient-call monitoring system. It listens to live clinical conversations, transcribes them, and runs three specialized agents in parallel:
- Emergency Detection Agent: Identifies high-acuity phrases such as chest pain, sudden weakness, slurred speech, or respiratory distress. If any are detected, the system recommends immediate escalation, such as calling emergency services.
- Adverse Event Detection Agent: Extracts and flags potential adverse drug events in line with pharmacovigilance requirements, ensuring time-sensitive AE reporting within the regulatory 24-hour window.
- Appointment and Adherence Agent: Detects missed medications, scheduling conflicts, or follow-up non-compliance and suggests actionable next steps.
The result is structured, actionable intelligence generated from unstructured conversation.
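As an illustration, that structured output might take a shape like the following minimal sketch. The field names and `build_alert` helper are our assumptions for this example, not MedCall's actual schema:

```python
import json

def build_alert(segment_id: int, agent: str, finding: str, action: str) -> str:
    """Serialize one agent finding as a JSON alert for the clinician-facing UI.
    (Hypothetical schema; field names are assumptions.)"""
    alert = {
        "segment": segment_id,
        "agent": agent,
        "finding": finding,
        "recommended_action": action,
    }
    return json.dumps(alert)

# Example: an emergency-agent hit on transcript segment 7.
alert_json = build_alert(
    7, "emergency",
    "patient reports chest discomfort",
    "escalate to on-call clinician",
)
```

A payload like this is what lets the frontend render a targeted alert rather than a raw transcript dump.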
How we built it
MedCall is a full-stack system built with a React frontend and a Flask backend.
- Audio Pipeline: The browser captures microphone input using the MediaRecorder API. Audio is chunked into 3-second windows and streamed to the backend over WebSockets using Socket.IO.
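On the server side, chunk handling like this could be sketched as a small buffer that accumulates incoming bytes into fixed 3-second windows before transcription. This `ChunkBuffer` class is our illustration, not MedCall's actual code:

```python
class ChunkBuffer:
    """Accumulate streamed audio bytes into fixed-size windows.
    (Hypothetical sketch; MedCall's real buffering may differ.)"""

    def __init__(self, bytes_per_second: int, window_seconds: int = 3):
        self.window_size = bytes_per_second * window_seconds
        self._buf = bytearray()

    def feed(self, data: bytes) -> list[bytes]:
        """Append incoming bytes; return any complete windows ready to transcribe."""
        self._buf.extend(data)
        windows = []
        while len(self._buf) >= self.window_size:
            windows.append(bytes(self._buf[:self.window_size]))
            del self._buf[:self.window_size]
        return windows

# Tiny rates for demonstration: 4 bytes/sec -> 12-byte windows.
buf = ChunkBuffer(bytes_per_second=4, window_seconds=3)
```

Each complete window can then be handed to the transcription layer while the remainder stays buffered for the next Socket.IO message.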
- Transcription Layer: Each audio chunk is transcribed using the OpenAI Whisper API, producing near-real-time text transcripts.
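A transcription wrapper for this layer might look like the sketch below. The `transcribe_chunk` call assumes the OpenAI Python SDK's `audio.transcriptions.create` endpoint with the `whisper-1` model; `stitch_transcript` is a pure helper that joins chunk-level results into one running transcript:

```python
import io

def transcribe_chunk(client, chunk_bytes: bytes) -> str:
    """Send one audio chunk to Whisper and return its text.
    (Assumes an OpenAI SDK client; network call, not exercised here.)"""
    buf = io.BytesIO(chunk_bytes)
    buf.name = "chunk.webm"  # the SDK infers the audio format from the name
    result = client.audio.transcriptions.create(model="whisper-1", file=buf)
    return result.text

def stitch_transcript(segments: list[str]) -> str:
    """Join per-chunk transcripts, dropping empty results."""
    return " ".join(s.strip() for s in segments if s.strip())
```

Because each 3-second chunk is transcribed independently, the stitching step is what gives downstream agents a coherent rolling transcript.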
- Parallel Agent Architecture: The transcript is dispatched to three concurrent AI agents powered by OpenAI GPT models. Each agent operates independently on the same transcript segment, allowing specialized reasoning per task. Conceptually, for a transcript segment \(T\), we compute:

\[ A_i = f_i(T), \quad i \in \{\text{emergency}, \text{AE}, \text{scheduling}\} \]

where each \(f_i\) is a domain-specific reasoning function implemented via structured prompting.
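The dispatch above can be sketched with stdlib threads. The keyword-matching agent bodies here are hypothetical stand-ins for the structured-prompt GPT calls; only the fan-out pattern mirrors the architecture described:

```python
from concurrent.futures import ThreadPoolExecutor

def emergency_agent(transcript: str) -> dict:
    # Stand-in for the GPT-backed emergency prompt.
    flagged = any(k in transcript.lower()
                  for k in ("chest pain", "slurred speech", "can't breathe"))
    return {"agent": "emergency", "escalate": flagged}

def ae_agent(transcript: str) -> dict:
    # Stand-in for the adverse-event extraction prompt.
    flagged = any(k in transcript.lower() for k in ("rash", "nausea", "dizzy"))
    return {"agent": "AE", "report_within_24h": flagged}

def scheduling_agent(transcript: str) -> dict:
    # Stand-in for the appointment/adherence prompt.
    return {"agent": "scheduling", "follow_up": "missed" in transcript.lower()}

AGENTS = (emergency_agent, ae_agent, scheduling_agent)

def run_agents(transcript: str) -> list[dict]:
    """Compute A_i = f_i(T) for all agents concurrently on one segment."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        return list(pool.map(lambda f: f(transcript), AGENTS))
```

Threads suit this fan-out because each real agent call is I/O-bound (an API request), so the agents' latencies overlap rather than add.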
- Real-Time Feedback: Results are streamed back to the frontend over WebSockets, enabling immediate clinician-facing alerts and structured summaries.
The backend uses Flask for REST endpoints, Flask-SocketIO for bidirectional streaming, Eventlet for concurrency, and Python threading for parallel agent execution. All configuration is managed securely via environment variables.
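The environment-variable handling might look like this minimal sketch; the variable names (`OPENAI_API_KEY`, `CHUNK_SECONDS`) are our assumptions for illustration:

```python
import os

def load_config() -> dict:
    """Read secrets and tunables from the environment rather than source code.
    (Variable names are assumptions, not necessarily the project's own.)"""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {
        "openai_api_key": key,
        "chunk_seconds": int(os.environ.get("CHUNK_SECONDS", "3")),
    }
```

Keeping the key out of the repository is what allows the same code to run against dev and production credentials.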
Challenges we ran into
- Speech Emotion Recognition Limitations: We explored SER models to detect vocal stress markers such as tremor or stutter. However, production-ready APIs with reliable clinical accuracy were limited, and integrating emotion recognition in a medically meaningful way remains a challenge.
- Latency vs. Safety Tradeoff: We optimized chunk size to balance responsiveness and transcription accuracy. Smaller chunks reduce latency but increase context fragmentation.
- Regulatory Sensitivity: Adverse event detection must prioritize recall without overwhelming clinicians with false positives. Designing prompts that balance sensitivity and precision required iterative refinement.
- Resource Constraints: Advanced features such as automatic emergency dialing or direct AE submission to pharmaceutical safety portals require deeper integration and compliance review.
Accomplishments that we're proud of
- Built a real-time, multi-agent clinical monitoring system within hackathon constraints
- Designed a parallel AI architecture instead of a single monolithic model
- Integrated live transcription with concurrent reasoning pipelines
- Addressed a real pharmacovigilance compliance use case rather than a generic chatbot scenario
- Created a system that feels clinically aware, not just conversational
What we learned
- How to architect real-time AI pipelines using streaming audio and WebSockets
- Practical integration of Whisper for incremental transcription
- Prompt engineering for domain-constrained reasoning
- The operational realities of adverse event reporting in healthcare
- The limitations and promise of speech emotion recognition systems
We also learned that building AI for healthcare requires thinking about safety, latency, compliance, and human trust simultaneously.
What's next for MedCall
- Integrate a clinically validated speech emotion recognition model
- Add structured adverse event auto-reporting workflows
- Develop clinician-configurable alert thresholds
- Deploy to a secure, HIPAA-compliant cloud environment
- Explore reinforcement learning for adaptive agent sensitivity

Long term, MedCall aims to become an AI clinical co-pilot that ensures no critical signal in patient communication is ever missed.
Built With
- Axios (REST API calls)
- CSS
- Eventlet
- Flask
- GPT-3.5 Turbo (OpenAI API, powering the parallel AI agents)
- HTML
- JavaScript
- Python
- React (with socket.io-client for real-time communication)
- Socket.IO
- Whisper API