Inspiration

We've all stood in front of a room presenting, finished speaking, and walked away not really knowing how it went. Sometimes a teacher gives quick feedback, sometimes classmates clap, but most of the time the moment just passes. Watching our peers present, we noticed the same thing over and over again: speakers had no way of knowing when the audience stopped paying attention or how their speaking habits changed under pressure. That gap, between how a presentation feels and how it's actually received, is what pushed us to build PitchWise.

What it does

PitchWise records a presentation and analyzes it from two perspectives at once. It listens to the speaker to measure filler words and speaking pace, while using video to track how many people in the audience are actually watching. Instead of general comments like "be more engaging," the app gives concrete, quantifiable results that show what happened during the presentation and when.
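The audio side boils down to two numbers per recording: a filler-word count and a words-per-minute pace. A minimal sketch of that idea (not the actual PitchWise code; the filler list and function name are our illustration):

```python
import re

# Illustrative filler vocabulary; the real app's list may differ.
FILLERS = {"um", "uh", "like", "basically", "actually"}

def analyze_transcript(transcript: str, duration_seconds: float) -> dict:
    """Count filler words and estimate speaking pace from a transcript
    whose spoken duration (in seconds) is known."""
    words = re.findall(r"[a-z']+", transcript.lower())
    filler_count = sum(1 for w in words if w in FILLERS)
    # Multi-word fillers like "you know" span two tokens, so count
    # the phrase directly on the lowercased text.
    filler_count += transcript.lower().count("you know")
    wpm = len(words) / (duration_seconds / 60)
    return {"filler_count": filler_count, "wpm": round(wpm, 1)}

# analyze_transcript("So um I think, uh, this works", 10)
# -> {"filler_count": 2, "wpm": 42.0}
```

In practice the transcript would come from a speech-to-text pass over the recorded audio; the aggregation itself is this simple.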

How we built it

We built PitchWise by combining speech processing with computer vision. The audio system identifies filler words and calculates words per minute, while the video system detects audience presence and attention. Face detection uses Haar cascades from OpenCV together with MediaPipe, and compares angles and similarities across frames to detect focus and eye movement. We deployed the app with Expo Go, and since Expo Go lacks high-performance React Native computer vision models, we sent each frame of the pitch to a centralized computer running a Python backend for processing. The app also uses AI-powered feedback to give tips for improvement and key insights. A lot of our time was spent testing in real environments, like classrooms with background noise, movement, and distractions.
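Once the backend has processed each frame, the per-frame detections can be folded into an attention timeline. A hedged sketch of that aggregation step, assuming the backend reports, for each frame, how many faces were detected and how many appeared attentive (the function name and data shape are our illustration, not the actual PitchWise API):

```python
def attention_timeline(frames, window=30):
    """Average audience attention per window of frames.

    frames: list of (faces_detected, faces_attentive) tuples, one per
    processed video frame. Returns one ratio in [0, 1] per window, or
    None for windows in which no faces were detected at all.
    """
    timeline = []
    for start in range(0, len(frames), window):
        chunk = frames[start:start + window]
        detected = sum(d for d, _ in chunk)
        attentive = sum(a for _, a in chunk)
        timeline.append(attentive / detected if detected else None)
    return timeline
```

Windowing smooths out single-frame noise (a blink or a dropped detection), which matters when frames arrive over a flaky classroom network.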

Challenges we ran into

One of the hardest parts was accepting that human attention isn't perfect or predictable. People look away, shift in their seats, or glance down briefly, and we had to decide what actually counts as engagement. That question caused technical challenges too: when people looked to the side, the system still detected them as paying attention, so we had to focus on the direction their eyes were looking. Many early versions crashed, and slow internet was a huge time sink, especially in the beginning. Pods and Expo were especially challenging when installing the required cross-platform dependencies, and slow image processing was hard to work around when we weren't receiving a continuous video feed. To solve this, we added libraries like Protobuf and async tooling, which helped transmit data faster.
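Focusing on eye direction comes down to a simple geometric check: where does the iris sit between the eye corners? A minimal sketch of the idea, using hypothetical landmark x-coordinates (in the real pipeline these would come from MediaPipe's face-mesh landmarks; the threshold value is our illustrative choice):

```python
def horizontal_gaze_ratio(eye_left_x, eye_right_x, iris_x):
    """Position of the iris between the eye corners:
    0.0 = far left, 1.0 = far right, ~0.5 = looking straight ahead."""
    span = eye_right_x - eye_left_x
    if span == 0:
        return 0.5  # degenerate landmarks; assume centered
    return (iris_x - eye_left_x) / span

def is_attentive(ratio, tolerance=0.15):
    # Treat gazes near the center as "looking toward the speaker".
    return abs(ratio - 0.5) <= tolerance
```

With this check, a face that is detected but whose gaze ratio sits near 0 or 1 no longer counts as paying attention, which was exactly the failure mode described above.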

Accomplishments that we're proud of

We're proud that PitchWise gives speakers something they usually never get: quantifiable feedback on both their delivery and their audience. Seeing the app successfully track filler words while also measuring audience attention in real time was a highlight. More importantly, we're proud that the feedback it provides is consistent: no opinions, no bias, just quantifiable data.

What we learned

We learned that improvement starts with awareness. Technically, we gained experience working with real-time audio and video analysis. As a team, we learned how much iteration it takes to turn an idea into something usable, and how important it is to test assumptions in real situations rather than relying on theory alone.

What's next for PitchWise

For PitchWise version 2.0.0, we want to refine how we measure engagement, improve accuracy across different room sizes and environments, and make the feedback easier to understand at a glance. Our long-term goal is to help speakers walk away from presentations knowing exactly what went well and what they can improve next time. We also want to handle listeners farther from the camera so that any video feed gets processed well.
