Inspiration
Healthcare conversations often happen at the most vulnerable moments in a patient's life. Post-surgery recovery calls, medication follow-ups, and mental-health check-ins are charged with anxiety. In those moments, critical signals get lost: a patient might casually mention chest discomfort that is actually ischemic pain, a caregiver might describe confusion that could indicate neurological decline, or a provider may overlook a potential adverse event that legally must be reported within 24 hours. We asked ourselves a simple question: "What if AI could act as a real-time clinical co-listener, surfacing high-risk signals while the conversation is happening?" MedCall was born from the idea that AI should not replace clinicians but augment their vigilance when cognitive load is highest.
What it does
MedCall is a real-time, AI-powered patient-call monitoring system. It listens to live clinical conversations, transcribes them, and runs three specialized agents in parallel:
- Emergency Detection Agent: Identifies high-acuity phrases such as chest pain, sudden weakness, slurred speech, or respiratory distress. If any are detected, the system recommends immediate escalation, such as calling emergency services.
- Adverse Event Detection Agent: Extracts and flags potential adverse drug events in line with pharmacovigilance requirements, ensuring time-sensitive AE reporting within the regulatory 24-hour window.
- Appointment and Adherence Agent: Detects missed medications, scheduling conflicts, or follow-up non-compliance and suggests actionable next steps.
The result is structured, actionable intelligence generated from unstructured conversation.
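As an illustration, that structured output might take a shape like the following minimal sketch. The field names and `build_alert` helper are our assumptions for this example, not MedCall's actual schema:

```python
import json

def build_alert(segment_id: int, agent: str, finding: str, action: str) -> str:
    """Serialize one agent finding as a JSON alert for the clinician-facing UI.
    (Hypothetical schema; field names are assumptions.)"""
    alert = {
        "segment": segment_id,
        "agent": agent,
        "finding": finding,
        "recommended_action": action,
    }
    return json.dumps(alert)

# Example: an emergency-agent hit on transcript segment 7.
alert_json = build_alert(
    7, "emergency",
    "patient reports chest discomfort",
    "escalate to on-call clinician",
)
```

A payload like this is what lets the frontend render a targeted alert rather than a raw transcript dump.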
How we built it
MedCall is a full-stack system built with a React frontend and a Flask backend.
- Audio Pipeline: The browser captures microphone input using the MediaRecorder API. Audio is chunked into 3-second windows and streamed to the backend over WebSockets using Socket.IO.
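On the server side, chunk handling like this could be sketched as a small buffer that accumulates incoming bytes into fixed 3-second windows before transcription. This `ChunkBuffer` class is our illustration, not MedCall's actual code:

```python
class ChunkBuffer:
    """Accumulate streamed audio bytes into fixed-size windows.
    (Hypothetical sketch; MedCall's real buffering may differ.)"""

    def __init__(self, bytes_per_second: int, window_seconds: int = 3):
        self.window_size = bytes_per_second * window_seconds
        self._buf = bytearray()

    def feed(self, data: bytes) -> list[bytes]:
        """Append incoming bytes; return any complete windows ready to transcribe."""
        self._buf.extend(data)
        windows = []
        while len(self._buf) >= self.window_size:
            windows.append(bytes(self._buf[:self.window_size]))
            del self._buf[:self.window_size]
        return windows

# Tiny rates for demonstration: 4 bytes/sec -> 12-byte windows.
buf = ChunkBuffer(bytes_per_second=4, window_seconds=3)
```

Each complete window can then be handed to the transcription layer while the remainder stays buffered for the next Socket.IO message.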
- Transcription Layer: Each audio chunk is transcribed using the OpenAI Whisper API, producing near-real-time text transcripts.
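A transcription wrapper for this layer might look like the sketch below. The `transcribe_chunk` call assumes the OpenAI Python SDK's `audio.transcriptions.create` endpoint with the `whisper-1` model; `stitch_transcript` is a pure helper that joins chunk-level results into one running transcript:

```python
import io

def transcribe_chunk(client, chunk_bytes: bytes) -> str:
    """Send one audio chunk to Whisper and return its text.
    (Assumes an OpenAI SDK client; network call, not exercised here.)"""
    buf = io.BytesIO(chunk_bytes)
    buf.name = "chunk.webm"  # the SDK infers the audio format from the name
    result = client.audio.transcriptions.create(model="whisper-1", file=buf)
    return result.text

def stitch_transcript(segments: list[str]) -> str:
    """Join per-chunk transcripts, dropping empty results."""
    return " ".join(s.strip() for s in segments if s.strip())
```

Because each 3-second chunk is transcribed independently, the stitching step is what gives downstream agents a coherent rolling transcript.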
- Parallel Agent Architecture: The transcript is dispatched to three concurrent AI agents powered by OpenAI GPT models. Each agent operates independently on the same transcript segment, allowing specialized reasoning per task. Conceptually, for a transcript segment \(T\), we compute:

\[ A_i = f_i(T), \quad i \in \{\text{emergency}, \text{AE}, \text{scheduling}\} \]

where each \(f_i\) is a domain-specific reasoning function implemented via structured prompting.
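The dispatch above can be sketched with stdlib threads. The keyword-matching agent bodies here are hypothetical stand-ins for the structured-prompt GPT calls; only the fan-out pattern mirrors the architecture described:

```python
from concurrent.futures import ThreadPoolExecutor

def emergency_agent(transcript: str) -> dict:
    # Stand-in for the GPT-backed emergency prompt.
    flagged = any(k in transcript.lower()
                  for k in ("chest pain", "slurred speech", "can't breathe"))
    return {"agent": "emergency", "escalate": flagged}

def ae_agent(transcript: str) -> dict:
    # Stand-in for the adverse-event extraction prompt.
    flagged = any(k in transcript.lower() for k in ("rash", "nausea", "dizzy"))
    return {"agent": "AE", "report_within_24h": flagged}

def scheduling_agent(transcript: str) -> dict:
    # Stand-in for the appointment/adherence prompt.
    return {"agent": "scheduling", "follow_up": "missed" in transcript.lower()}

AGENTS = (emergency_agent, ae_agent, scheduling_agent)

def run_agents(transcript: str) -> list[dict]:
    """Compute A_i = f_i(T) for all agents concurrently on one segment."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        return list(pool.map(lambda f: f(transcript), AGENTS))
```

Threads suit this fan-out because each real agent call is I/O-bound (an API request), so the agents' latencies overlap rather than add.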
- Real-Time Feedback: Results are streamed back to the frontend over WebSockets, enabling immediate clinician-facing alerts and structured summaries.
The backend uses Flask for REST endpoints, Flask-SocketIO for bidirectional streaming, Eventlet for concurrency, and Python threading for parallel agent execution. All configuration is managed securely via environment variables.
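The environment-variable handling might look like this minimal sketch; the variable names (`OPENAI_API_KEY`, `CHUNK_SECONDS`) are our assumptions for illustration:

```python
import os

def load_config() -> dict:
    """Read secrets and tunables from the environment rather than source code.
    (Variable names are assumptions, not necessarily the project's own.)"""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {
        "openai_api_key": key,
        "chunk_seconds": int(os.environ.get("CHUNK_SECONDS", "3")),
    }
```

Keeping the key out of the repository is what allows the same code to run against dev and production credentials.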
Challenges we ran into
- Speech Emotion Recognition Limitations: We explored SER models to detect vocal stress markers such as tremor or stutter. However, production-ready APIs with reliable clinical accuracy were limited, and integrating emotion recognition in a medically meaningful way remains a challenge.
- Latency vs. Safety Tradeoff: We optimized chunk size to balance responsiveness and transcription accuracy. Smaller chunks reduce latency but increase context fragmentation.
- Regulatory Sensitivity: Adverse event detection must prioritize recall without overwhelming clinicians with false positives. Designing prompts that balance sensitivity and precision required iterative refinement.
- Resource Constraints: Advanced features such as automatic emergency dialing or direct AE submission to pharmaceutical safety portals require deeper integration and compliance review.
Accomplishments that we're proud of
- Built a real-time, multi-agent clinical monitoring system within hackathon constraints
- Designed a parallel AI architecture instead of a single monolithic model
- Integrated live transcription with concurrent reasoning pipelines
- Addressed a real pharmacovigilance compliance use case rather than a generic chatbot scenario
- Created a system that feels clinically aware, not just conversational
What we learned
- How to architect real-time AI pipelines using streaming audio and WebSockets
- Practical integration of Whisper for incremental transcription
- Prompt engineering for domain-constrained reasoning
- The operational realities of adverse event reporting in healthcare
- The limitations and promise of speech emotion recognition systems
We also learned that building AI for healthcare requires thinking about safety, latency, compliance, and human trust simultaneously.
What's next for MedCall
- Integrate a clinically validated speech emotion recognition model
- Add structured adverse event auto-reporting workflows
- Develop clinician-configurable alert thresholds
- Deploy to a secure, HIPAA-compliant cloud environment
- Explore reinforcement learning for adaptive agent sensitivity

Long term, MedCall aims to become an AI clinical co-pilot that ensures no critical signal in patient communication is ever missed.
Built With
- Axios (REST API calls)
- CSS
- Eventlet
- Flask
- GPT-3.5 Turbo (OpenAI API, powering the parallel AI agents)
- HTML
- JavaScript
- Python
- React (with socket.io-client for real-time communication)
- Socket.IO
- Whisper API