Moods and Metrics

UI of our website
Data visualization of emotion
Logo, inspired by Weights and Biases (wandb.ai)

Inspiration

Stress and anxiety are increasingly prevalent, and we wanted to build a tool that helps people understand their emotional states and take control of their mental well-being. Moods and Metrics was born out of a desire to bridge the gap between technology and emotional health, giving people an easy way to visualize and understand their emotional levels through AI-driven insights.

What it does

Moods and Metrics offers three analysis modalities:

Audio Analysis: Uses a locally deployed audio sentiment analysis model to capture arousal, valence, and dominance values and map them onto a stress score
Video Analysis: Sends a video clip to Gemini, which evaluates calm/stress levels from facial expressions and returns in-depth text reasoning
Transcription Analysis: Performs sentiment analysis on the video transcription, evaluating calm/stress levels and returning in-depth text reasoning
For all modalities, data is visualized in 2D or 3D
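
As a rough illustration of the audio path, the sketch below maps arousal, valence, and dominance values onto a 0-100 stress score. It assumes each value is already normalized to roughly 0..1. The weights and the function name `stress_score` are hypothetical placeholders, not the formula used in the project; the only intuition encoded is that high arousal, low valence, and low dominance tend to indicate stress.

```python
def stress_score(arousal: float, valence: float, dominance: float,
                 weights: tuple = (0.5, 0.3, 0.2)) -> float:
    """Map arousal/valence/dominance (each ~0..1) to a 0-100 stress score.

    The default weights are illustrative assumptions, not tuned values.
    """
    def clamp(x: float) -> float:
        # Keep slightly out-of-range model outputs inside 0..1.
        return max(0.0, min(1.0, x))

    w_a, w_v, w_d = weights
    # High arousal raises the score; high valence and dominance lower it.
    raw = (w_a * clamp(arousal)
           + w_v * (1.0 - clamp(valence))
           + w_d * (1.0 - clamp(dominance)))
    return 100.0 * raw / (w_a + w_v + w_d)


# A relaxed-sounding sample versus a tense-sounding one (made-up inputs).
calm = stress_score(arousal=0.2, valence=0.8, dominance=0.7)
tense = stress_score(arousal=0.9, valence=0.2, dominance=0.3)
```

A weighted sum like this is easy to explain in a UI, but any real mapping would need to be checked against labeled data.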

How we built it

Used a state-of-the-art fine-tuned wav2vec2 transformer architecture (https://arxiv.org/abs/2203.07378) for audio sentiment analysis via Hugging Face and PyTorch
Leveraged three.js to create an interactive 3D graph that maps valence, dominance, and arousal
Mapped emotion using a theoretical approach based on a tri-dimensional model of core affect and emotion concepts (https://www.redalyc.org/pdf/3111/311126297005.pdf)
Integrated Google’s Gemini AI API for video-based and transcription-based mood analysis
Developed a React frontend with an interactive UI
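
To plot a prediction in a 3D graph, the three model outputs have to become coordinates. The following is a minimal sketch, assuming each output lies in roughly 0..1 and that valence, arousal, and dominance go on the x, y, and z axes respectively. The axis assignment, the `to_point` and `octant` helpers, and the octant wording are assumptions for illustration; they are not taken from the project code or the cited paper.

```python
from typing import NamedTuple


class Point3D(NamedTuple):
    x: float  # valence (assumed axis)
    y: float  # arousal (assumed axis)
    z: float  # dominance (assumed axis)


def to_point(valence: float, arousal: float, dominance: float) -> Point3D:
    """Center 0..1 values on the origin so each axis spans -1..1."""
    def center(v: float) -> float:
        return 2.0 * max(0.0, min(1.0, v)) - 1.0

    return Point3D(center(valence), center(arousal), center(dominance))


def octant(p: Point3D) -> str:
    """Describe which of the eight octants of the space a point falls in."""
    return ", ".join([
        ("positive" if p.x >= 0 else "negative") + " valence",
        ("high" if p.y >= 0 else "low") + " arousal",
        ("high" if p.z >= 0 else "low") + " dominance",
    ])


point = to_point(valence=0.5, arousal=1.0, dominance=0.0)
label = octant(point)
```

Centering the values on the origin means a neutral reading sits in the middle of the scene, which makes rotation around the graph easier to interpret.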

Challenges we ran into

Installing the transformer model locally via Hugging Face and PyTorch, and enabling CUDA GPU acceleration
Deriving an accurate stress score from arousal, valence, and dominance values
Getting the axes and rotation to match the camera perspective in the 3D visualizations
Constructing a pipeline to send video footage to Gemini via an API call
Creating a clean frontend to visualize graphs using data from the backend server
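
For the GPU acceleration challenge, one common pattern is to pick the device once at startup and fall back to the CPU when CUDA is unavailable. The sketch below uses PyTorch's `torch.cuda.is_available()`. The `pick_device` helper is a hypothetical name, and the import guard is there only so the snippet also runs on a machine without PyTorch installed.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    try:
        import torch  # optional here; a real backend would require it
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
# With PyTorch installed, the model and its input tensors would then be
# moved with model.to(device) and tensor.to(device) before inference.
```

If this returns "cpu" on a machine with an NVIDIA GPU, the usual culprit is a CPU-only PyTorch build or a driver mismatch.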

Accomplishments that we're proud of

Successfully integrating AI-driven audio, video, and transcription sentiment analysis
Creating an engaging and informational visualization of emotional/mood states
Achieving reliable stress detection for educational, healthcare, and meditation use

What we learned

The nuances of valence, dominance, and arousal in audio sentiment analysis
Balancing technicality and simplicity in data visualization

What's next for Moods and Metrics

Enhancing AI models with more modalities such as heartbeat, EKG signals, etc.
Collecting accurate data that pairs breathing audio with stress scores, and training a model from scratch
Adding real-time video analysis without needing pre-recorded clips
Expanding features with personalized stress relief suggestions
Deploying a mobile version for on-the-go, fully local emotional tracking

Built With

d3.js
fastapi
gemini
huggingface
python
pytorch
react
tailwind
three.js

Submitted to

TreeHacks 2025
- Winner Otsuka Valuenex: ekkomi® Advanced Technologies for User Feedback Award ($500 Amazon + $100 ekkomi® matcha gift card per team member)

Created by

I worked on implementing audio recording through the MediaRecorder API and displaying the values through multiple graphs, and led frontend development

Corey Shen
UCLA
I worked on implementing the video and transcription modules, APIs, and frontend components, and on connecting all the different modules together using React, FastAPI, and Gemini

Parth Joshi
I worked on the audio sentiment analysis model via Hugging Face and PyTorch, and created the interactive 3D emotion map using three.js

Ryota Sato
I hope to explore the intersection between audio and engineering by being involved in audio signal processing + machine learning research!!
Dhruv Shah