(Project gallery)
- SleepPilot platform landing page
- RL-optimized sleep factor simulations, feedback & recommendations
- RL ECE (Expected Calibration Error) reward model agent results
- Model reward vs. time fluctuations in the sleep factor RL simulation
- PPO + Stable-Baselines3 RL-powered predicted apnea risk scoring
- General sleep quality feedback form used as input to the RL sleep factor simulation
- Mel-spectrogram of a 5-second window of an apnea event
Inspiration
We spend nearly 25 years of our lives asleep, yet over 1 billion people suffer from sleep apnea, and 4 out of 5 cases go undiagnosed. A key reason is that diagnosing apnea typically requires polysomnography in a specialized sleep lab, forcing patients to spend hours hooked up to sensors in a clinical setting just to learn whether they can breathe while they sleep. Sleep disorders like apnea are linked to heart disease, diabetes, and cognitive decline, but even beyond medical conditions, millions of people endure poor sleep quality without realizing that environmental factors like noise, light, humidity, or temperature are disrupting their rest.
Our goal was to build an accessible, fully auditory platform for apnea detection and personalized sleep optimization, and we deliberately chose reinforcement learning as our core approach. Unlike traditional supervised ML or pre-trained audio networks, RL let us treat breathing sounds as a sequential decision process, where each moment in the audio influences the next, and model the progression of sleep over time.
With this foundation in mind, we built SleepPilot around an RL-driven pipeline, supported by timestamping, classification layers, and personalized environmental factor (light, humidity) optimization through simulation. With just a tap, our mobile app uses the phone’s microphone to record snoring and breathing patterns during sleep. By morning, users receive clear insights into their apnea risk — and over time, the system continuously learns which conditions and settings lead to better sleep quality.
How We Built It
We designed SleepPilot with an RL-heavy 4-layer infrastructure:
Layer 1. General Sleep Quality Evaluation
- Audio (breathing, snoring, ambient noise) is recorded during sleep; training data was sourced from the Hugging Face Hub
- Mel-spectrograms are generated from 5-second windows using torchaudio for soundwave visualization
- An RNN-LSTM hybrid multi-label model detects snoring, apnea, micro-awakenings, and coughing
- Event counts, sleep duration, and self-reported environment ratings are combined into a Sleep Health Score (0–100)
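A minimal sketch of how those three signals could be blended into a 0–100 score. The weights, per-event penalties, and the 8-hour duration target below are illustrative assumptions, not the project's actual values:

```python
def sleep_health_score(event_counts, sleep_hours, env_rating):
    """Combine detected events, sleep duration, and a self-reported
    environment rating (1-5) into a 0-100 Sleep Health Score."""
    # Per-event penalties (illustrative weights, not the project's actual ones)
    penalties = {"snoring": 0.5, "apnea": 2.0, "micro_awakening": 1.0, "coughing": 0.3}
    event_penalty = sum(penalties.get(kind, 0.0) * n for kind, n in event_counts.items())
    # Up to 40 points for sleep duration near ~8 hours
    duration_score = max(0.0, 40.0 - 5.0 * abs(sleep_hours - 8.0))
    # Up to 30 points from the self-reported environment rating
    env_score = (env_rating - 1) / 4 * 30.0
    # Remaining 30 points, eroded by detected events
    score = 30.0 + duration_score + env_score - event_penalty
    return max(0.0, min(100.0, score))
```

For example, a night with 3 apnea events and 10 snoring events over 7.5 hours in a decent environment lands in the high-70s under these assumed weights, while a clean 8-hour night scores 100.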
Layer 2. Sleep Disorder Diagnosis
- An RL agent (PPO, Stable-Baselines3) trained in a simulated Gymnasium environment detects apnea-like events, based on labelled polysomnographic snore/apnea audio data
- A Mel-Frequency Cepstral Coefficient (MFCC) transform maps a 1D array of float values (soundwave amplitudes) to a *2D array* of timestamps × target features
- We then reward the agent with a calibration-aware reward function based on ECE (Expected Calibration Error):
$$ r = \text{correctness} - (\text{confidence} - \text{correctness})^2 $$
This forces the agent to balance confidence calibration with diagnostic accuracy
(i.e., how correct it is vs. how correct it thinks it is)
- Output: probability of apnea, timestamps of apnea events, severity scale, and confidence scores
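The calibrated reward above is a one-liner in code. This is a direct transcription of the formula; encoding correctness as a binary 0/1 match against the label is our assumption about how the term is computed:

```python
def calibrated_reward(prediction, label, confidence):
    """r = correctness - (confidence - correctness)^2.
    Rewards being right, and penalizes the squared gap between how
    confident the agent is and how correct it actually was."""
    # correctness: 1.0 if the apnea prediction matches the ground truth, else 0.0
    correctness = 1.0 if prediction == label else 0.0
    return correctness - (confidence - correctness) ** 2

calibrated_reward(1, 1, 0.9)  # right and confident: ~0.99
calibrated_reward(1, 1, 0.5)  # right but unsure:    0.75
calibrated_reward(1, 0, 0.9)  # wrong and confident: ~-0.81
```

Note how a confident wrong answer is punished far harder than an uncertain one, which is exactly the calibration pressure described above.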
Layer 3. Personalized Sleep Environment Optimization
- A second RL agent simulates and optimizes environmental conditions — temperature, humidity, light, and noise
- By maximizing predicted improvements in the Sleep Health Score, the agent generates personalized, proactive recommendations (e.g., “lower room temperature by 2°C”, “enable pink noise at 30 dB”, or “change room smart LEDs to mahogany”)
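A minimal sketch of what this second agent's environment could look like. This is not the project's actual code: the score predictor, action set, starting state, and target values are all illustrative assumptions, and a greedy one-step lookahead stands in for the trained PPO policy (the real version would subclass gymnasium.Env and train with Stable-Baselines3):

```python
class SleepEnvSketch:
    """Gymnasium-style environment sketch (step/reset interface only).
    Each action nudges one environmental factor up or down; reward is the
    predicted change in Sleep Health Score."""

    ACTIONS = [("temp_c", -1), ("temp_c", +1),
               ("humidity", -5), ("humidity", +5),
               ("light_lux", -10), ("light_lux", +10),
               ("noise_db", -5), ("noise_db", +5)]

    def reset(self):
        # Plausible but suboptimal starting bedroom state (assumed)
        self.state = {"temp_c": 24.0, "humidity": 60.0,
                      "light_lux": 30.0, "noise_db": 45.0}
        return dict(self.state)

    def predicted_score(self, s):
        # Hypothetical stand-in for the learned Sleep Health Score predictor:
        # peak score at 19 degC, 50% humidity, 0 lux, 30 dB
        return 100.0 - (abs(s["temp_c"] - 19.0) * 4.0
                        + abs(s["humidity"] - 50.0) * 0.5
                        + abs(s["light_lux"]) * 0.2
                        + abs(s["noise_db"] - 30.0) * 0.5)

    def step(self, action_idx):
        key, delta = self.ACTIONS[action_idx]
        before = self.predicted_score(self.state)
        self.state[key] += delta
        reward = self.predicted_score(self.state) - before
        return dict(self.state), reward, False, {}

def greedy_recommendations(env, max_steps=50):
    """Greedy one-step lookahead as a stand-in for the PPO policy."""
    env.reset()
    for _ in range(max_steps):
        gains = []
        for a, (key, delta) in enumerate(env.ACTIONS):
            trial = dict(env.state)
            trial[key] += delta
            gains.append((env.predicted_score(trial) - env.predicted_score(env.state), a))
        best_gain, best_a = max(gains)
        if best_gain <= 1e-9:  # no action improves the predicted score
            break
        env.step(best_a)
    return dict(env.state)
```

Under these assumed dynamics, the loop walks the room from a warm, bright, noisy start to the predictor's optimum, and the sequence of chosen actions is exactly the kind of recommendation list ("lower temperature", "dim the lights") surfaced to the user.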
Layer 4. Mobile Dashboard
- Built in React Native for mobile deployment, visualizing:
- Sleep quality trends
- Snoring intensity
- Apnea risk and severity
- Personalized environment recommendations
What We Learned
- Confidence calibration in RL matters. Simply flagging “apnea” wasn’t enough: we needed confidence-adjusted rewards so the model knows how right it feels, not just whether it’s right. This was a more unconventional approach than traditional binary-reward methods, and as stats-driven students we found the research phase for it fascinating.
- Audio alone is powerful. We were surprised by how much can be inferred from sound, such as snoring intensity, micro-awakenings, and even sleep fragmentation — all without wearables or intrusive sensors.
- Scalability vs. privacy. Processing only anonymized audio features, rather than other modalities like CV (image/video), was a privacy-smart decision, as image and video data are generally regarded as more invasive.
Ultimately, SleepPilot helps people sleep better, wake up healthier, and lower long-term health risks, all by turning sound into actionable sleep intelligence. 🌙
Built With
- fast-api
- gymnasium
- huggingface
- javascript
- kaggle
- ppo
- python
- pytorch
- react-native
- reinforcement-learning
- stable-baselines3
- torchaudio
