# Half the world is one symptom away from a bad decision
4.5 billion people are not fully covered by essential health services, according to the WHO and World Bank's 2023 Universal Health Coverage report. That is not people in remote villages without roads. That is people in cities who cannot afford a visit, cannot get an appointment, or do not know if what they are feeling is serious enough to bother.
They pull up Google on their phones and search their symptoms, and they're greeted with Reddit posts claiming the slightest cough means cancer or a headache implies early-onset Alzheimer's. They close the tab and hope for the best.
We built Vytal because that is not good enough, and because the hardware to do better is already in everyone's pocket.
## What Vytal does
Vytal turns a phone's front camera into a vitals monitor, walks you through a symptom conversation powered by Claude, and gives you a plain-language triage summary you can show any clinician, anywhere. No wearable. No oximeter. Just a phone.
**30-second face scan.** Your skin changes color slightly with every heartbeat because of blood moving through it. Vytal reads those changes from your camera. It is called remote photoplethysmography (rPPG), and our CV pipeline uses it to estimate:
| Vital | Method |
|---|---|
| Heart Rate | FFT peak detection on BVP signal |
| HRV SDNN | Standard deviation of RR intervals |
| HRV RMSSD | Root mean square of successive RR differences |
| Respiratory Rate | Hilbert envelope of the BVP signal |
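The methods in the table can be sketched in a few lines of numpy, assuming a BVP signal has already been extracted at the camera frame rate and that heartbeat peaks (hence RR intervals) come from a separate detection step not shown here:

```python
import numpy as np

def heart_rate_fft(bvp: np.ndarray, fs: float) -> float:
    """Dominant frequency of the BVP signal, reported in bpm."""
    spectrum = np.abs(np.fft.rfft(bvp - bvp.mean()))
    freqs = np.fft.rfftfreq(len(bvp), d=1.0 / fs)
    band = (freqs >= 0.7) & (freqs <= 4.0)      # 42-240 bpm
    return float(freqs[band][np.argmax(spectrum[band])] * 60.0)

def hrv_metrics(rr_ms: np.ndarray) -> tuple[float, float]:
    """SDNN and RMSSD from RR intervals given in milliseconds."""
    sdnn = float(np.std(rr_ms, ddof=1))
    rmssd = float(np.sqrt(np.mean(np.diff(rr_ms) ** 2)))
    return sdnn, rmssd

# Synthetic check: a pure 1.25 Hz "pulse" should read as 75 bpm.
fs = 30.0
t = np.arange(0, 20, 1 / fs)
bvp = np.sin(2 * np.pi * 1.25 * t)
print(round(heart_rate_fft(bvp, fs)))   # prints 75
```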
The algorithm behind it, CHROM, was validated across 117 subjects in de Haan and Jeanne (2013) and showed 92% agreement with contact PPG. An independent study in npj Digital Medicine measured a mean error of 0.94 bpm against an FDA-approved Masimo pulse oximeter, which puts it within the range of clinical wearables. It also holds up across lighting conditions, which matters when your users are not in a lab.
**Guided symptom intake.** After the scan, Claude asks you 5 to 8 questions based on what you are actually feeling. Not a form. A conversation that adapts based on your answers and ends with a structured symptom summary.
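The control flow of that conversation can be sketched as a loop that runs until the model emits a completion sentinel followed by structured JSON. `ask_claude` here is a hypothetical stand-in for a real Anthropic API call, stubbed so the loop itself can be exercised:

```python
import json

INTAKE_DONE = "[INTAKE_COMPLETE]"

def run_intake(ask_claude, answers, first_complaint: str, max_turns: int = 8) -> dict:
    """Alternate model questions and user answers until the model
    signals completion, then parse the structured summary it emits."""
    history = [{"role": "user", "content": first_complaint}]
    answers = iter(answers)
    for _ in range(max_turns):
        reply = ask_claude(history)
        history.append({"role": "assistant", "content": reply})
        if INTAKE_DONE in reply:
            # Everything after the sentinel is the structured summary.
            return json.loads(reply.split(INTAKE_DONE, 1)[1])
        history.append({"role": "user", "content": next(answers)})
    raise RuntimeError("intake did not converge")

# Stub model: asks one follow-up, then completes with a JSON summary.
def fake_model(history):
    if len(history) == 1:
        return "How long have you had the headache?"
    return INTAKE_DONE + ' {"symptom": "headache", "duration_days": 2}'

summary = run_intake(fake_model, ["two days"], "I have a headache")
print(summary)   # {'symptom': 'headache', 'duration_days': 2}
```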
**4-agent triage pipeline.** Vitals and symptoms go through four steps:
- Vitals Interpreter — checks HR, HRV, and RR against clinical baselines
- Symptom Assessor — looks for patterns and flags concerns
- Triage Agent — combines both and assigns an urgency level: emergency / urgent / routine / selfCare
- Explainer Agent — rewrites the result in plain language
You get a timestamped health summary instead of "I don't know how to describe it."
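The four steps above can be sketched as a chain in which each structured hand-off is validated before the next agent sees it. The agent functions here are illustrative lambdas standing in for the real Claude calls; only the four urgency levels come from Vytal itself:

```python
URGENCY_LEVELS = {"emergency", "urgent", "routine", "selfCare"}

def run_triage(vitals: dict, symptoms: dict, agents: dict) -> dict:
    """Chain the four agents, validating each structured hand-off."""
    vitals_read = agents["vitals_interpreter"](vitals)
    if "flags" not in vitals_read:
        raise ValueError("vitals interpreter output missing 'flags'")
    assessment = agents["symptom_assessor"](symptoms)
    if "concerns" not in assessment:
        raise ValueError("symptom assessor output missing 'concerns'")
    triage = agents["triage"](vitals_read, assessment)
    if triage.get("urgency") not in URGENCY_LEVELS:
        raise ValueError(f"bad urgency: {triage.get('urgency')}")
    triage["summary"] = agents["explainer"](triage)
    return triage

# Toy agents so the chain can be exercised end to end.
agents = {
    "vitals_interpreter": lambda v: {"flags": ["hr_elevated"] if v["hr"] > 100 else []},
    "symptom_assessor":   lambda s: {"concerns": s["symptoms"]},
    "triage":             lambda v, a: {"urgency": "urgent" if v["flags"] else "routine"},
    "explainer":          lambda t: f"Your result is {t['urgency']}; share it with a clinician.",
}
result = run_triage({"hr": 112}, {"symptoms": ["chest tightness"]}, agents)
print(result["urgency"])   # urgent
```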
## How we built it
**CV Pipeline** — A Python FastAPI server runs on the host machine and receives JPEG frames streamed from the phone. MediaPipe FaceMesh locks onto your face using 468 landmarks and isolates a forehead ROI where the blood flow signal is cleanest.
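The ROI step amounts to cropping a box around a handful of upper-face landmarks. The landmark indices below are illustrative, not the exact set Vytal uses; FaceMesh returns coordinates normalized to the frame size:

```python
import numpy as np

# Upper-face FaceMesh landmark indices (illustrative selection).
FOREHEAD_IDX = [10, 67, 103, 109, 297, 332, 338]

def forehead_roi(frame: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
    """frame: (H, W, 3); landmarks: (468, 2) normalized (x, y) coords.
    Returns the bounding-box crop around the forehead points."""
    h, w = frame.shape[:2]
    pts = landmarks[FOREHEAD_IDX] * np.array([w, h])
    x0, y0 = pts.min(axis=0).astype(int)
    x1, y1 = pts.max(axis=0).astype(int)
    return frame[y0:y1 + 1, x0:x1 + 1]
```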
The CHROM algorithm separates the raw RGB values into two chrominance channels that isolate the pulse signal from noise caused by skin tone and lighting:
$$X_s = 3R - 2G, \quad Y_s = 1.5R + G - 1.5B$$
$$S = X_s - \frac{\sigma(X_s)}{\sigma(Y_s)} \cdot Y_s$$
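A direct transcription of the two equations above, applied to the per-frame mean RGB of the ROI. Normalizing each channel by its temporal mean before projecting, as in the CHROM paper, is assumed:

```python
import numpy as np

def chrom_bvp(rgb: np.ndarray) -> np.ndarray:
    """rgb: (n_frames, 3) mean ROI color per frame -> 1-D pulse signal."""
    norm = rgb / rgb.mean(axis=0)              # remove the DC skin-tone level
    r, g, b = norm[:, 0], norm[:, 1], norm[:, 2]
    xs = 3 * r - 2 * g                         # X_s = 3R - 2G
    ys = 1.5 * r + g - 1.5 * b                 # Y_s = 1.5R + G - 1.5B
    alpha = np.std(xs) / np.std(ys)            # sigma(X_s) / sigma(Y_s)
    return xs - alpha * ys                     # S = X_s - alpha * Y_s
```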
The signal is band-pass filtered to the physiologically meaningful range (0.7 to 4.0 Hz, which covers 42 to 240 bpm), and an FFT pulls out the dominant frequency as your heart rate. Respiratory rate comes from the same signal via the Hilbert envelope of the BVP waveform.
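The band-limiting step can be illustrated with a crude FFT-domain mask that zeroes every bin outside 0.7 to 4.0 Hz; a real pipeline would more likely use a proper FIR/IIR bandpass:

```python
import numpy as np

def bandpass_fft(signal: np.ndarray, fs: float, lo: float = 0.7, hi: float = 4.0) -> np.ndarray:
    """Zero all frequency bins outside [lo, hi] Hz and invert the FFT."""
    spectrum = np.fft.rfft(signal - signal.mean())
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

# A 1.0 Hz "pulse" plus a slow 0.1 Hz lighting drift: the drift is removed.
fs = 30.0
t = np.arange(0, 20, 1 / fs)
noisy = np.sin(2 * np.pi * 1.0 * t) + 2 * np.sin(2 * np.pi * 0.1 * t)
clean = bandpass_fft(noisy, fs)
```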
**Flutter app** — Handles camera streaming, Claude API calls for both the symptom intake and triage pipeline, and session state across all five screens. Supports English, French, Spanish, and Arabic with full right-to-left layout for Arabic.
```
Flutter (Dart)
│
├── scan_screen.dart ─── JPEG frames ──► FastAPI CV Pipeline (:8000)
│                                            │
│                                            MediaPipe FaceMesh
│                                            CHROM + bandpass filter + FFT
│                                            │
├── symptom_chat_screen.dart ◄────────────── VitalScanResult
│       Claude API — branching intake
│       [INTAKE_COMPLETE] → structured JSON
│
└── processing_screen.dart
        Claude API — 4-step triage chain
        │
        └── result_screen.dart
                urgency badge + plain-language summary
```
## The hard parts
**Signal quality in real lighting.** rPPG works well in controlled environments. Getting a clean signal from an Android phone in a dorm room or a poorly lit kitchen took a lot of iteration on the ROI extraction and noise rejection before readings got reliable.
**Streaming frames over WiFi.** Flutter's camera plugin outputs YUV420 frames. Converting those to JPEG and hitting a FastAPI endpoint fast enough to collect enough signal, without killing latency, meant tuning frame rate and buffer size carefully. On physical devices we also had to sort out firewall rules and network bridging between the phone and the server.
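The YUV-to-RGB leg of that conversion can be sketched with BT.601 coefficients on a planar I420-style frame. Flutter's actual plane layout and row strides vary by device, so treat the unpacking here as illustrative:

```python
import numpy as np

def yuv420_to_rgb(y: np.ndarray, u: np.ndarray, v: np.ndarray) -> np.ndarray:
    """y: (H, W); u, v: (H//2, W//2) chroma planes -> (H, W, 3) uint8 RGB.
    Uses BT.601 full-range coefficients."""
    # Upsample the half-resolution chroma planes to full size.
    u_full = u.repeat(2, axis=0).repeat(2, axis=1).astype(np.float32) - 128.0
    v_full = v.repeat(2, axis=0).repeat(2, axis=1).astype(np.float32) - 128.0
    yf = y.astype(np.float32)
    r = yf + 1.402 * v_full
    g = yf - 0.344136 * u_full - 0.714136 * v_full
    b = yf + 1.772 * u_full
    return np.clip(np.stack([r, g, b], axis=-1), 0, 255).astype(np.uint8)
```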
**Claude pipelines have to be treated like typed interfaces.** Each of the four triage agents receives structured output from the one before it. Small prompt changes in the Vitals Interpreter changed what the Triage Agent decided two steps later. We learned to validate each agent's output format before passing it downstream.
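That validation can be done with the standard library alone. A minimal sketch for checking one agent's raw JSON output before it moves downstream; the field names besides the urgency levels are illustrative:

```python
import json

TRIAGE_SCHEMA = {"urgency": str, "reasons": list}   # illustrative fields
VALID_URGENCY = {"emergency", "urgent", "routine", "selfCare"}

def validate_triage(raw: str) -> dict:
    """Parse and type-check the Triage Agent's output, or raise."""
    data = json.loads(raw)
    for field, typ in TRIAGE_SCHEMA.items():
        if not isinstance(data.get(field), typ):
            raise ValueError(f"missing or mistyped field: {field}")
    if data["urgency"] not in VALID_URGENCY:
        raise ValueError(f"unknown urgency: {data['urgency']}")
    return data
```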
**Demo reliability.** We added a demo mode (tap the logo three times on the language screen) that skips the CV Pipeline and uses mock vitals. It saved us more than once.
## What we learned
**rPPG is a signal processing problem, not a computer vision problem.** Finding the face takes milliseconds. Pulling a clean pulse signal out of a noisy RGB channel while someone is slightly moving, blinking, and breathing is where the actual work is.
**Claude works well as a conversation partner and as a pipeline component, but for different reasons.** The symptom intake is good because it adapts naturally. The triage pipeline is good because the output format is strict. Both matter equally.
**Triage is not diagnosis.** We made that the constraint, not the disclaimer. Every output Vytal produces is framed as something to share with a clinician, not something to act on alone.
## What's next
- On-device rPPG inference so the app does not need a separate server running
- iOS support and laptop webcam fallback
- Vital sign tracking across sessions over time
- Formal accuracy testing against clinical hardware across different skin tones
- More languages beyond the current four

