Inspiration
Technology has become such an integral part of our lives that we are more connected than ever, yet often emotionally further apart. The very tools built to bring us closer can strip away the subtle cues that make communication human. Disconnecting isn’t realistic: most of our conversations now live in text threads and video calls, so we believe empathy shouldn’t get lost in translation. Wanting to build something meaningful, we set out to bridge that emotional gap with a simple, private, real-time coach inside Google Meet that helps people feel understood again.
What it does
The project is a Chrome extension that integrates seamlessly into Google Meet, acting as an intelligent conversational layer. It reads chat messages in real time and connects to a lightweight local Python app that watches the remote participant’s video tile for emotional cues. We wanted to keep the concept as simple as possible. From these cues it selects an appropriate reply tone, whether supportive, calm, reassuring, concise, or enthusiastic, and then generates three smart one-click reply suggestions that can be pasted straight into the chat. Everything runs locally on the user's device, ensuring speed, privacy, and full control over their data.
How it's built
Sentimix bridges computer vision and NLP to make conversations emotionally aware. The vision system, built in Python with OpenCV and DeepFace using RetinaFace or an OpenCV detector, captures the remote video tile through MSS screen capture. Each second, it streams the dominant_emotion and confidence via a local WebSocket.
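A minimal sketch of what that broadcaster loop can look like, assuming a hard-coded ROI and a local WebSocket on port 8765 (both illustrative; the real app lets the user select the tile and configure the connection):

```python
import asyncio
import json

import numpy as np
import mss
import websockets
from deepface import DeepFace

# Region of the screen showing the remote participant's video tile.
# These coordinates are placeholders; the real app lets the user select the ROI.
ROI = {"left": 1200, "top": 150, "width": 480, "height": 360}
CLIENTS = set()

def analyze_roi():
    """Grab the ROI and return (dominant_emotion, confidence in 0-100)."""
    with mss.mss() as sct:
        frame = np.array(sct.grab(ROI))[:, :, :3]  # BGRA -> BGR for DeepFace
    result = DeepFace.analyze(frame, actions=["emotion"],
                              detector_backend="retinaface",
                              enforce_detection=False)
    res = result[0] if isinstance(result, list) else result
    dominant = res["dominant_emotion"]
    return dominant, float(res["emotion"][dominant])

async def register(ws):
    """Track connected extension clients."""
    CLIENTS.add(ws)
    try:
        await ws.wait_closed()
    finally:
        CLIENTS.discard(ws)

async def broadcast_loop():
    """Analyze the tile roughly once per second and push the result to clients."""
    while True:
        emotion, confidence = await asyncio.to_thread(analyze_roi)
        message = json.dumps({"dominant_emotion": emotion, "confidence": confidence})
        for ws in list(CLIENTS):
            try:
                await ws.send(message)
            except websockets.ConnectionClosed:
                CLIENTS.discard(ws)
        await asyncio.sleep(1.0)  # ~1 Hz

async def main():
    async with websockets.serve(register, "localhost", 8765):
        await broadcast_loop()

if __name__ == "__main__":
    asyncio.run(main())
```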
On the browser side, a Chrome extension (Manifest V3) observes the Meet chat DOM in real time. The NLP layer, powered by wink-sentiment, analyzes polarity and applies intent rules (question, logistics, request). The Tone Engine blends emotion, sentiment, and intent to select tone and generate replies. The UI is a floating panel inside Meet with a mood badge, tone label, confidence bar, and one-click reply chips. Users can adjust behavior with a probability threshold slider, manual tone override, or Simple Mode, all running fully on-device, ensuring speed and privacy.
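Roughly how that blending works, sketched in Python for readability (the actual Tone Engine runs as JavaScript inside the extension); the specific thresholds and rule order below are illustrative, not the shipped rules:

```python
def select_tone(emotion: str, confidence: float, sentiment: float, intent: str) -> str:
    """Blend detected emotion, chat sentiment (-1..1), and message intent
    into one of the reply tones offered in the panel."""
    if confidence < 40:          # confidence gate: ignore low-confidence emotions
        emotion = "neutral"
    if intent in ("logistics", "request"):
        return "concise"         # practical messages get short, to-the-point replies
    if emotion in ("sad", "fear") or sentiment < -0.3:
        return "supportive"
    if emotion == "angry":
        return "calm"
    if intent == "question":
        return "reassuring"
    if emotion == "happy" or sentiment > 0.3:
        return "enthusiastic"
    return "reassuring"
```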
Challenges
Early on, our model took forever to load, so we added a warm-up call to make it feel instant (sketched below). Google Meet’s DOM was volatile, so we wrote resilient selectors with fallbacks. Low light confused emotion detection, so we added a confidence gate. Webcam conflicts? Those were solved with smart probing and clear, friendly error messages. Accessibility was also a priority, so we chose high contrast, large fonts, and text labels to make the interface universal.
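The warm-up can be as simple as running one analysis on a blank frame at startup so the model weights load before the first real frame arrives (a sketch; the dummy-frame size and detector backend here are assumptions):

```python
import numpy as np
from deepface import DeepFace

def warm_up():
    """Trigger model download/loading once so the first real analysis feels instant."""
    dummy = np.zeros((224, 224, 3), dtype=np.uint8)   # blank frame, no face expected
    DeepFace.analyze(dummy, actions=["emotion"],
                     detector_backend="retinaface",
                     enforce_detection=False)          # don't fail on the faceless frame
```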
What was learned
- Real time beats perfect. A steady one-hertz label helps more than a slow model.
- Simple NLP rules work great. Instead of huge AI models, small, clear rules for chat intent (like spotting a question or a request) cover most real situations (see the sketch after this list).
- We realized that adding a threshold and Simple Mode makes the UI calmer. The threshold filters out low-confidence emotions, and Simple Mode removes extra visuals so users don’t feel overwhelmed.
- And finally, we found that good lighting helps both humans and AI. Bright faces make the vision model more accurate, and a bright, clean interface makes people trust what Sentimix shows them.
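As an illustration of how small those intent rules can be, here is a sketch in Python for readability (the extension’s version is JavaScript, and the keyword lists below are made up for the example):

```python
import re

def detect_intent(message: str) -> str:
    """Classify a chat message as question, logistics, request, or statement."""
    text = message.lower().strip()
    if "?" in text or re.match(r"^(who|what|when|where|why|how|can|could|do|does|is|are)\b", text):
        return "question"
    if re.search(r"\b(meet|meeting|schedule|reschedule|time|calendar|deadline|tomorrow)\b", text):
        return "logistics"
    if re.search(r"\b(please|can you|could you|would you|need you to|send me)\b", text):
        return "request"
    return "statement"

# Example: detect_intent("Please send the slides when you can.") -> "request"
```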
Results
Our system delivers stable, real-time emotional guidance at one to three hertz, even on a standard mid-range laptop, proving that meaningful AI doesn’t require massive infrastructure. The interface feels native to Google Meet, with a clean, minimal overlay that offers instant, one-click reply suggestions precisely tuned to the tone of the conversation. Every component runs entirely on-device, with no cloud calls or external dependencies, ensuring both speed and complete privacy. It’s real-time empathy, processed locally, felt globally.
How to run
- Create a Python 3.10 venv and install requirements.
- Start the broadcaster, select the remote video ROI.
- Load the Chrome extension as unpacked.
- Join a Meet, open chat, use the panel.
- Press Ctrl+Alt+R to reselect ROI.
Next steps
- Multi-participant support with multiple ROIs.
- Keyboard shortcuts for replies.
- Tiny on-device intent model.
- Packaged installer and Web Store listing.
Privacy
- All processing runs locally.
- Privacy matters to us: no recordings, no uploads.
- One click to pause or disable.
Built With
- chrome-extension-(manifest-v3)
- css
- deepface
- html
- javascript
- keyboard-(hotkeys)
- mss-(screen-capture)
- mutationobserver
- numpy
- opencv
- python-3.10
- retinaface
- tensorflow-cpu
- vs-code
- websockets
- websockets-(python)
- windows
- wink-sentiment