CareSight is a real-time AI caregiver system that monitors user behavior through vision, detects unsafe or incorrect actions (e.g., medication errors), and intervenes instantly with voice guidance, while giving caregivers live insights and alerts.
Patients with dementia or cognitive impairment often make small mistakes that can lead to serious consequences:
- Taking the wrong medication
- Forgetting to take medication
- Leaving unsafe situations unattended
CareSight continuously observes user actions through a camera and:
- Detects incorrect behavior in real time
- Intervenes immediately with voice instructions
- Logs events for caregiver visibility
- Escalates critical situations when necessary
This allows one caregiver to safely monitor multiple patients while preserving independence for the user.
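As a rough illustration of the "log, intervene, escalate" behavior listed above, the sketch below maps an event's severity to a list of actions. The severity scale, the action names, and `plan_response` itself are assumptions made for this example; the only severity value this README actually shows is `"medium"`.

```python
from typing import List

# Assumed severity scale (hypothetical); only "medium" appears in this README.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def plan_response(event: dict, escalate_at: str = "high") -> List[str]:
    """Return the actions for an event: always log and speak, escalate at or above the threshold."""
    actions = ["log_event", "voice_instruction"]
    severity = event.get("severity", "medium")
    # Unknown severities are ranked above every known level (a fail-safe assumption).
    if severity in SEVERITY_ORDER:
        rank = SEVERITY_ORDER.index(severity)
    else:
        rank = len(SEVERITY_ORDER)
    if rank >= SEVERITY_ORDER.index(escalate_at):
        actions.append("alert_caregiver")
    return actions
```

Keeping the decision as a pure function that returns action names makes it easy to test without a camera, speaker, or SMS provider attached.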
caresight/
│
├── vision/ # Camera + detection
│ ├── camera_stream.py # Webcam feed
│ ├── detection.py # Viam + OpenCV detection logic
│ └── zones.py # Interaction zones
│
├── logic/ # Decision engine
│ ├── rules.py # Safety + correctness rules
│ ├── state.py # Tracks user state
│ └── events.py # Event generation
│
├── intervention/ # Response system
│ ├── llm.py # Gemini API
│ ├── speech.py # ElevenLabs TTS
│ └── alerts.py # Buzzer / hardware
│
├── backend/ # API + event handling
│ ├── main.py
│ ├── event_handler.py
│ └── database.py
│
├── dashboard/ # Caregiver dashboard
│ ├── app.py
│ └── components/
│
├── hardware/ # Optional Pi + sensors
│ ├── buzzer.py
│ ├── ultrasonic.py
│ └── controller.py
│
├── integrations/
│ ├── viam_client.py
│ └── sms.py
│
├── .env.example
├── requirements.txt
└── README.md
| Language | Usage |
|---|---|
| Python | Vision, backend, logic |
| JavaScript | Dashboard |
| SQL | Event logging |
| Tool | Usage |
|---|---|
| Viam | Camera + hardware orchestration |
| OpenCV | Image processing |
| Webcam | Video input |
| Tool | Usage |
|---|---|
| Gemini API | Generates instructions |
| CV models | Object + color detection |
| Tool | Usage |
|---|---|
| ElevenLabs | Text-to-speech |
| Buzzer | Physical alerts |
| Tool | Usage |
|---|---|
| FastAPI | API |
| Hex API | Dashboard |
| PostgreSQL | Event storage |
Camera (Viam) → Detection → Logic Engine → Event → Intervention + Dashboard
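The pipeline above could be wired together roughly as follows. This is a minimal sketch, not the project's actual code: `run_pipeline` and its `detect`, `decide`, and `sinks` parameters are hypothetical names standing in for the detection, logic, and intervention/dashboard stages.

```python
from typing import Callable, Iterable, List, Optional


def run_pipeline(
    frames: Iterable,
    detect: Callable[[object], Optional[dict]],
    decide: Callable[[dict], Optional[dict]],
    sinks: List[Callable[[dict], None]],
) -> List[dict]:
    """Push each frame through detection and logic, then fan each event out to every sink."""
    emitted = []
    for frame in frames:
        detection = detect(frame)   # vision stage: None means nothing of interest in the frame
        if detection is None:
            continue
        event = decide(detection)   # logic stage: None means no rule was violated
        if event is None:
            continue
        for sink in sinks:          # intervention and dashboard both receive the same event
            sink(event)
        emitted.append(event)
    return emitted
```

Passing the stages in as callables keeps the loop independent of Viam, Gemini, or ElevenLabs, so each stage can be swapped for a stub during testing.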
User reaches for medication
↓
Camera detects hand + object color (blue)
↓
System compares with expected color (red)
↓
Mismatch detected
↓
Event generated: wrong_med_attempt
↓
Gemini generates instruction
↓
ElevenLabs speaks:
"That is not the correct medication. Please take the red one."
↓
Dashboard logs event
↓
User corrects action
↓
System logs correction
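The comparison and correction steps in this flow can be sketched as two small functions. The field names follow the example event payload in this README; the function names, the ID counter, and the UTC ISO timestamp format are assumptions for illustration.

```python
from datetime import datetime, timezone
from itertools import count
from typing import Optional

_event_ids = count(1)  # hypothetical in-memory ID source; a real system would likely use the database


def check_medication(expected: str, observed: str, severity: str = "medium") -> Optional[dict]:
    """Return a wrong_med_attempt event when the observed color differs from the expected one."""
    if observed == expected:
        return None
    return {
        "event_id": f"evt_{next(_event_ids):03d}",
        "timestamp": datetime.now(timezone.utc).isoformat(),  # assumed timestamp format
        "type": "wrong_med_attempt",
        "expected": expected,
        "observed": observed,
        "corrected": False,
        "severity": severity,
    }


def mark_corrected(event: dict, observed: str) -> dict:
    """Return a copy of the event, flagged as corrected if the user now holds the expected color."""
    return {**event, "corrected": observed == event["expected"]}
```

For the walkthrough above, `check_medication("red", "blue")` would produce the event, and `mark_corrected(event, "red")` would record the correction once the user picks the red item.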
{
"event_id": "evt_001",
"timestamp": "...",
"type": "wrong_med_attempt",
"expected": "red",
"observed": "blue",
"corrected": false,
"severity": "medium"
}

- Real-time intervention (not just monitoring)
- Multimodal system (vision + logic + voice)
- Low-cost hardware (< $200)
- Scalable caregiver-to-patient ratio
- Event-driven architecture
- Caregiver inputs expected medication color
- User interacts with pills (M&Ms)
- System detects incorrect choice
- Voice correction is issued
- Dashboard updates
- Correction is tracked
cp .env.example .env
Then open .env and fill in your actual API keys.
IMPORTANT: Never commit `.env`; it contains secrets. It's already in `.gitignore`.
pip install -r requirements.txt
python vision/camera_stream.py
uvicorn backend.main:app --reload
python dashboard/app.py
Built a real-time AI caregiver system using computer vision, event-driven logic, and voice synthesis that detects unsafe patient behavior and intervenes instantly, integrating Viam, Gemini, ElevenLabs, and a caregiver dashboard.
- Not a medical diagnostic tool
- Focused on intervention, not prediction
- Uses color-based detection for MVP reliability
- Real medication recognition
- Activity monitoring (cooking, falls)
- Wearable camera integration
- Behavioral analytics
- Advanced caregiver alerts
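To illustrate the color-based detection used for the MVP, here is a minimal sketch that labels an average RGB value as red or blue by its hue. The project itself lists OpenCV for image processing; this example uses the standard-library `colorsys` module instead so that it runs without extra dependencies. The hue ranges and thresholds are assumed values, not the project's calibration.

```python
import colorsys
from typing import Optional


def classify_color(
    r: int, g: int, b: int, min_saturation: float = 0.4, min_value: float = 0.3
) -> Optional[str]:
    """Label an 8-bit RGB color as "red" or "blue", or None if it is washed out or another hue."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # Reject grayish or very dark pixels, where hue is unreliable.
    if s < min_saturation or v < min_value:
        return None
    hue = h * 360  # convert from the 0..1 range to degrees
    if hue < 20 or hue >= 340:   # red wraps around 0 degrees (assumed range)
        return "red"
    if 200 <= hue < 260:         # assumed range for blue
        return "blue"
    return None
```

In practice the input would be the mean color of the region around the detected hand or object, and the ranges would need tuning for the actual camera and lighting.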
YHack 2026 Team
This system is a prototype and does not replace professional medical care. It is intended for demonstration purposes only.