What is KeyGuard?
Passwords protect the moment you log in. But what happens after? Anyone who walks up to an unlocked terminal already has full access. In hospitals, banks, and government offices this is a real problem. Shared workstations are everywhere and sessions get handed off constantly.
KeyGuard watches how you type and continuously verifies that the same person is still at the keyboard. Not what you type, how you type. The rhythm, the timing, how fast your fingers move between specific letter pairs. That pattern turns out to be surprisingly unique to each person.
How I built it
The whole stack is Python, running entirely on device. I used pynput to capture keystroke timing events, then built a feature extraction pipeline that segments your typing into active bursts and computes 26 features per window.
Feature groups:
- Dwell time, inter-key flight time, pause rate, backspace rate, burst WPM
- Digraph timing: mean transition time for the 20 most common English letter pairs (th, he, in, er, an...)
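To make the pipeline concrete, here is a minimal sketch of how the aggregate features above can be computed for one window. The event format, field names, and the `window_features` helper are illustrative, not KeyGuard's actual code:

```python
# Sketch of per-window feature extraction (illustrative, not KeyGuard's code).
# events: list of (key, press_t, release_t) tuples in seconds, ordered by
# press time, for one fixed-size window of keystrokes.
from statistics import mean, stdev

def window_features(events):
    dwells = [r - p for _, p, r in events]             # how long each key is held
    flights = [events[i + 1][1] - events[i][2]         # release -> next press
               for i in range(len(events) - 1)]
    backspaces = sum(1 for k, _, _ in events if k == "backspace")
    duration = events[-1][2] - events[0][1]            # window span in seconds
    return {
        "mean_dwell": mean(dwells),
        "std_dwell": stdev(dwells),
        "mean_flight": mean(flights),
        "std_flight": stdev(flights),
        "backspace_rate": backspaces / len(events),
        "burst_wpm": (len(events) / 5) / (duration / 60),  # 5 chars ~ 1 word
    }
```

The digraph features would extend this with a per-pair timing lookup (e.g. time from the `t` release to the `h` press, averaged per window).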
I trained an ensemble of three models on your enrollment data:
| Model | Role |
|---|---|
| Isolation Forest | Corroborating ML signal, 200 trees |
| One-Class SVM | Corroborating ML signal, RBF kernel |
| Z-score anomaly detection | Primary decision maker on a selected feature mask |
Every 40 keystrokes the system scores a new window against your baseline. Two consecutive anomalous windows trigger an alert and lock the session.
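The two-strikes rule can be sketched in a few lines. The class name, threshold value, and `score_window` interface here are illustrative:

```python
# Illustrative sketch of the "two consecutive anomalous windows" rule.
# The threshold value and names are assumptions, not KeyGuard's actual code.
class SessionGuard:
    def __init__(self, threshold=3.5, strikes_to_lock=2):
        self.threshold = threshold
        self.strikes_to_lock = strikes_to_lock
        self.strikes = 0

    def score_window(self, z_max):
        """Feed in one window's anomaly score; returns True when the
        session should be locked."""
        if z_max > self.threshold:
            self.strikes += 1       # anomalous window: count a strike
        else:
            self.strikes = 0        # a single normal window resets the count
        return self.strikes >= self.strikes_to_lock
```

Requiring two consecutive strikes is what keeps a single noisy window from locking the legitimate user out.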
I also added online learning. The model continuously updates itself using an exponential moving average (alpha = 0.005), so it adapts as your typing naturally changes over time. It only learns from verified windows, so an intruder can never slowly train the model to accept them.
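The EMA update itself is one line per feature. A minimal sketch, assuming the baseline is stored as a per-feature mean vector (the function name is illustrative):

```python
# Sketch of the online-learning step: drift the enrolled baseline toward a
# *verified* window. Called only after verification, so an intruder's
# windows never reach this update. alpha matches the value quoted above.
def ema_update(baseline, window, alpha=0.005):
    return [(1 - alpha) * b + alpha * w for b, w in zip(baseline, window)]
```

With alpha = 0.005, each verified window moves the baseline only 0.5% of the way toward the new observation, so adaptation is deliberately slow.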
Results
I tested KeyGuard against a real second user. The system flagged 20 out of 20 of her windows correctly.
| Metric | Result |
|---|---|
| Detection rate | 20/20 windows (100%) |
| False positive rate | ~1/10 windows on enrolled user |
| Isolation Forest score on intruder | 1.000 on all 20 windows |
| PCA centroid separation | 3.62 units |
Z-score flagged: 20/20. Isolation Forest flagged: 6/20. Either detector flagged: 20/20.
Z-score was the dominant detector here, catching every window with an average z-max of 6.84 standard deviations from the enrolled baseline. The Isolation Forest returned maximum anomaly score (1.0000) on all 20 windows, meaning it correctly identified Megan as maximally anomalous in every case — the 6/20 flag count reflects threshold sensitivity at this enrollment size, not a failure to detect.
Feature separation between the two users:
| Feature | Gabe (enrolled) | Megan (test) | Z-separation |
|---|---|---|---|
| backspace rate | 0.027 | 0.116 | 5.06 |
| pause rate | 0.071 | 0.164 | 4.14 |
| std flight | 0.106 | 0.174 | 2.96 |
| burst WPM | 75.5 | 61.0 | 2.78 |
| std dwell | 0.144 | 0.090 | 0.80 |
| mean dwell | 0.117 | 0.112 | 0.17 |
Backspace rate was the strongest signal — Megan made roughly 4x more corrections per keystroke. Pause rate and flight time variance also separated cleanly. Mean dwell time, by contrast, was nearly identical between the two users, which is why aggregate typing metrics alone are insufficient.
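For clarity, the Z-separation column above can be read as the distance between the two users' feature means, measured in units of the enrolled user's standard deviation. A one-line sketch (the function name is mine, and this is my reading of the column, not a quote from the code):

```python
# Z-separation: how many of the enrolled user's standard deviations lie
# between the enrolled mean and the other user's mean for one feature.
def z_separation(enrolled_mean, enrolled_std, other_mean):
    return abs(other_mean - enrolled_mean) / enrolled_std
```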
What I learned
Feature engineering matters more than model complexity. I went through six iterations of the detection architecture before landing on something that worked reliably.
Mahalanobis distance looked great on paper but fell apart with only 16 enrollment windows. The One-Class SVM achieved perfect detection on the test user but produced constant false positives on me. Z-score anomaly detection on a carefully selected feature mask ended up being the most reliable primary detector, with the ML models corroborating.
The other big lesson: aggregate typing speed is not a stable biometric. People type at different speeds depending on what they are thinking about. Switching to burst-isolated windows, where idle time is ignored and only active typing is measured, made the features dramatically more stable.
Challenges
Getting pynput to capture precise timing on Windows without UI thread interference took some work. The feature pipeline had several subtle bugs where overlapping key presses corrupted dwell time calculations. Tuning the detection threshold to minimize false positives on the enrolled user while maintaining sensitivity to a different user required a lot of real typing data from two actual people.
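One way the overlapping-keypress bug can be avoided is to track press timestamps per key instead of assuming press and release events alternate. This is an illustrative sketch of that fix, not KeyGuard's actual pipeline code:

```python
# Keeping dwell times correct when key presses overlap (e.g. Shift held
# while other keys fire): record the press timestamp per key so each
# release pairs with its own press. Illustrative sketch.
class DwellTracker:
    def __init__(self):
        self.pressed = {}   # key -> first press timestamp
        self.dwells = []

    def on_press(self, key, t):
        # setdefault ignores OS auto-repeat: a held key keeps its first press.
        self.pressed.setdefault(key, t)

    def on_release(self, key, t):
        start = self.pressed.pop(key, None)
        if start is not None:
            self.dwells.append(t - start)
```

A naive implementation that pairs each release with the most recent press would, in the Shift-plus-letter case, credit the letter's dwell to Shift and corrupt both measurements.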
Privacy
Everything runs locally. No data ever leaves the device. Key content is never captured, only timing metadata. This makes KeyGuard viable for HIPAA, FERPA, and enterprise zero-trust environments where sending biometric data to a cloud API is not an option.
Built with
Python, pynput, scikit-learn, numpy, tkinter