FacePlay 🎮

Inspiration

Imagine waking up after shoulder surgery. You can't lift your arms. You're stuck in a hospital bed for weeks, bored, anxious, and isolated. You reach for your controller — and can't use it.

This is the reality for millions of people. According to the Christopher & Dana Reeve Foundation, approximately 5.4 million Americans live with some form of paralysis — nearly 1 in every 50 people. An estimated 2.5 million of those live with moderate to severe mobility impairment specifically of the arm and hand. On top of that, over 50,000 new amputations occur in the United States every year, and the number of Americans living with limb loss is projected to double to 3.6 million by 2050.

Research consistently shows that video games reduce depression, anxiety, and stress during recovery — studies published in Social Science & Medicine found that games provide distraction from pain, build confidence, and promote a sense of control that accelerates healing. But if you can't use your hands, games are out of reach.

Existing assistive tech like Tobii eye trackers cost $150–$3,000+. We wanted to build something that any hospital could hand to any patient — no special hardware, no setup, no cost barrier.


What It Does

FacePlay turns your face into a hands-free game controller using nothing but a standard webcam. Players control games entirely through facial gestures and voice commands:

  • Head movement → directional inputs (left, right, up, down)
  • Eyebrow raise → jump (hold for variable height)
  • Right wink → attack
  • Left wink → dash
  • Full blink → interact
  • Voice commands → pause, resume, and switch game modes
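To make the mapping above concrete, here is a minimal sketch of how gestures could be routed to per-game keys. The profile contents and function names here are illustrative assumptions, not FacePlay's actual internals:

```python
# Hypothetical per-game gesture-to-key profiles (keys and names invented
# for illustration; FacePlay's real mappings may differ).
PROFILES = {
    "flappy_bird": {
        "brow_raise": "space",   # jump / flap
    },
    "hollow_knight": {
        "head_left": "left",
        "head_right": "right",
        "brow_raise": "z",       # jump (held for variable height)
        "wink_right": "x",       # attack
        "wink_left": "c",        # dash
        "blink": "up",           # interact
    },
}

def key_for(game: str, gesture: str):
    """Resolve a detected gesture to the key bound in the active game profile."""
    return PROFILES.get(game, {}).get(gesture)
```

Keeping the mapping in data rather than code is what makes per-game control schemes (and community-contributed mappings) cheap to add.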

FacePlay includes control profiles for multiple games, including Hollow Knight, Stardew Valley, Super Mario Bros, Flappy Bird, Space Invaders, Snake, and Pac-Man, each with controls optimized for that specific game. A dark, neon-lit launcher lets players navigate and select games using only their face.


How We Built It

FacePlay is built entirely in Python using:

  • MediaPipe — two simultaneous detectors running every frame: FaceMesh for 3D head pose estimation and FaceLandmarker for facial expression detection
  • OpenCV — camera pipeline running at 60fps, real-time frame processing and HUD overlay
  • solvePnP + Rodrigues rotation — 3D head pose estimation from 6 facial landmarks to calculate precise pitch and yaw angles
  • Eye Aspect Ratio (EAR) — landmark geometry math to detect winks and blinks
  • Raw landmark geometry — custom brow raise detection using normalized eyebrow-to-nose-bridge distance ratios, replacing unreliable blendshape scores
  • PyAutoGUI — low-latency keyboard simulation with debounce and hysteresis systems
  • SpeechRecognition — Google Speech API running in a daemon thread for voice commands
  • Tkinter — dark-themed game launcher with live webcam preview

Challenges We Ran Into

The hardest problem was gesture reliability. Facial landmarks are noisy — they shift with lighting, head rotation, and natural expression drift. Early versions had eyebrow raises triggering on every blink, winks firing multiple times from a single gesture, and movement keys staying held too long after returning to neutral.

We solved this through several layers of engineering: asymmetric trigger/release thresholds so inputs release faster than they trigger, a hysteresis system for head movement so the neutral zone is easier to return to, a sustain counter requiring consistent detection across multiple frames before firing, and a custom landmark geometry formula for eyebrow detection that uses the nose bridge as a stable reference point instead of eye landmarks that move during blinks.
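The trigger/release asymmetry and sustain counter can be captured in a small state machine. A minimal sketch, with invented threshold values, of the layering described above:

```python
class GestureGate:
    """Debounce one gesture signal with asymmetric thresholds and a sustain counter.

    The signal must stay above the higher `on_thresh` for `sustain` consecutive
    frames before the input fires, but releases as soon as it drops below the
    lower `off_thresh` -- so inputs release faster than they trigger, and a
    single noisy frame near the boundary never flickers the key.
    """

    def __init__(self, on_thresh=0.35, off_thresh=0.25, sustain=3):
        self.on_thresh = on_thresh
        self.off_thresh = off_thresh
        self.sustain = sustain
        self.count = 0
        self.active = False

    def update(self, value: float) -> bool:
        if self.active:
            if value < self.off_thresh:   # release quickly
                self.active = False
                self.count = 0
        else:
            if value > self.on_thresh:
                self.count += 1           # require consistent detection
                if self.count >= self.sustain:
                    self.active = True
            else:
                self.count = 0            # any dropout resets the streak
        return self.active
```

The same gate works for head movement, where the gap between the two thresholds is exactly the "easier to return to" neutral zone.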

Running two MediaPipe detectors simultaneously on a single webcam feed without dropping below 30fps was another challenge that required careful optimization — disabling unused landmark refinement, reducing camera resolution, and eliminating PyAutoGUI's default delays.
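The optimizations mentioned above roughly correspond to a setup like this sketch (a config fragment under assumed defaults; exact values are illustrative):

```python
import cv2
import pyautogui
import mediapipe as mp

# Remove PyAutoGUI's default 0.1 s pause between calls -- fatal for games.
pyautogui.PAUSE = 0

# Lower the capture resolution so both detectors keep up in real time.
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# Skip iris/lip landmark refinement that the gesture math never reads.
face_mesh = mp.solutions.face_mesh.FaceMesh(
    max_num_faces=1,
    refine_landmarks=False,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5,
)
```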


Accomplishments That We're Proud Of

We're proud that FacePlay works on a $30 webcam with no special hardware. We built a real-time computer vision pipeline that performs 3D head pose estimation using PnP solving and Rodrigues rotation matrices, combined with custom landmark ratio calculations, all running at 60fps on a standard laptop.

We're proud of the personalized calibration system that measures each user's neutral face at startup and sets thresholds relative to their baseline. Someone with naturally expressive eyebrows and someone with limited facial mobility can both use FacePlay without manual tuning.
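The calibration idea can be sketched in a few lines. This is a hypothetical version (function name and margin invented) of setting thresholds relative to a measured baseline instead of an absolute value:

```python
import statistics

def calibrate(samples, margin=0.25):
    """Derive a per-user trigger threshold from neutral-face samples.

    `samples` are readings of a gesture signal (e.g. a brow ratio) taken
    while the user holds a neutral face at startup. The threshold is set a
    fixed margin above that personal baseline, so naturally expressive faces
    and limited-mobility faces both get a threshold they can actually reach.
    """
    baseline = statistics.median(samples)  # median resists blinks and outliers
    return baseline * (1.0 + margin)
```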

Most of all, we're proud that someone recovering from surgery could actually pick this up and play.


What We Learned

We learned that the gap between "it detects the gesture" and "it feels good to use" is enormous. Every threshold, every debounce frame, every timing constant required iteration. Accessibility technology lives or dies on reliability — a false trigger in a game is annoying; a false trigger for someone with limited mobility who has no other input method is a failure.

We also learned that the right math matters more than the right model. Replacing MediaPipe's blendshape scores with our own raw landmark geometry calculation was a breakthrough — sometimes the best AI is a well-designed formula.
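For illustration, a formula in the spirit of that breakthrough might look like this (a hypothetical sketch using y-coordinates of three landmarks; the exact landmarks and normalization in FacePlay may differ):

```python
def brow_ratio(brow_y, nose_bridge_y, nose_tip_y):
    """Brow-raise signal from raw landmark geometry.

    Eyebrow-to-nose-bridge distance, normalized by the bridge-to-tip
    distance so the ratio is invariant to how close the face is to the
    camera. Raising the brows increases the ratio; blinking barely moves
    the nose bridge, so the signal stays stable where eye-relative
    measurements (and blendshape scores) drifted.
    """
    return abs(nose_bridge_y - brow_y) / abs(nose_tip_y - nose_bridge_y)
```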


What's Next for FacePlay

  • Gesture prediction — using velocity-based detection to predict gestures before they fully form, making inputs feel as instant as a physical button
  • Background calibration — silently updating thresholds as the user plays, compensating for fatigue and changing lighting without interrupting gameplay
  • Mobile support — running FacePlay on a tablet for true bedside use in hospitals
  • Expanded game library — optimized control schemes for more games, community-contributed mappings
  • Clinical partnership — working with occupational therapists to validate FacePlay as a recovery tool and measure its impact on patient outcomes
