Inspiration

We were inspired by how many people struggle with public speaking, even when they have great ideas to share. Too often, presentations fail not because of bad content but because of nerves, poor body language, or lack of confidence. We wanted to create a way for anyone to practice and get real feedback without needing a personal coach. That is how the idea for Orator was born: to make confidence measurable and trainable through AI.

What it does

Orator is an AI-powered presentation coach that analyzes your speech, gestures, and EEG brain activity to help you become a better speaker. It tracks how you speak, move, and feel while presenting, then gives real-time feedback on your performance. By detecting patterns like filler words, awkward gestures, or signs of stress, it helps users identify exactly what to improve. The goal is simple: to turn nervous practice into confident performance.
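As a taste of the filler-word detection described above, here is a minimal sketch in Python. The filler vocabulary and function name are illustrative assumptions, not Orator's actual implementation:

```python
import re
from collections import Counter

# Hypothetical filler vocabulary; the real list would be tuned per language.
FILLERS = {"um", "uh", "like", "you know", "so", "actually"}

def count_fillers(transcript: str) -> Counter:
    """Count filler words/phrases in a speech transcript (case-insensitive)."""
    text = transcript.lower()
    counts = Counter()
    for filler in FILLERS:
        # Word boundaries keep "like" from matching inside "unlike", etc.
        counts[filler] = len(re.findall(rf"\b{re.escape(filler)}\b", text))
    # Drop fillers that never occurred.
    return Counter({f: n for f, n in counts.items() if n})

demo = "So, um, I think, like, our results are, uh, you know, promising."
print(count_fillers(demo))
```

In a real pipeline, the transcript would come from the speech-to-text stream and the counts would be normalized by speaking time before being shown to the user.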

How we built it

We built Orator using Flask for the backend, React for the frontend, and YOLOv11 for gesture detection. We used the Google Cloud Speech-to-Text API to analyze speech and a Muse S EEG headband connected via BrainFlow to capture real-time brain signals. All three data streams (audio, video, and EEG) were synchronized to provide unified feedback. Our system processes inputs live and generates an analysis report after each presentation session.
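The synchronization step above can be sketched as nearest-timestamp pairing: each stream is a timestamped sequence, and for every audio sample we look up the closest video frame and EEG sample. This is a simplified stand-in, not Orator's actual code; the sampling rates are assumptions (256 Hz is the Muse S rate, the others are illustrative):

```python
from bisect import bisect_left
from typing import Sequence

def nearest_sample(timestamps: Sequence[float], t: float) -> int:
    """Return the index of the sample whose timestamp is closest to t."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbor is closer in time.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1

def align_streams(audio_ts, video_ts, eeg_ts):
    """Pair each audio timestamp with the nearest video frame and EEG sample.

    Returns a list of (audio_idx, video_idx, eeg_idx) tuples.
    """
    return [
        (i, nearest_sample(video_ts, t), nearest_sample(eeg_ts, t))
        for i, t in enumerate(audio_ts)
    ]

# Audio chunks at 10 Hz, video at ~30 fps, EEG at 256 Hz.
audio = [0.0, 0.1, 0.2]
video = [0.0, 0.033, 0.066, 0.1, 0.133, 0.166, 0.2]
eeg = [i / 256 for i in range(60)]
print(align_streams(audio, video, eeg))
```

A production system would also have to correct for clock skew between devices, which simple index alignment ignores.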

Challenges we ran into

The biggest challenge was managing and synchronizing real-time data from multiple sources without lag. Processing audio, video, and EEG signals simultaneously pushed the limits of both hardware and software performance. We also faced difficulty getting consistent EEG readings and training our gesture model to recognize subtle movements. Despite these challenges, we learned how to fine-tune latency, improve model accuracy, and maintain stable live feedback.

Accomplishments that we're proud of

We are proud of building a working prototype that successfully integrates speech recognition, computer vision, and EEG analysis into one platform. Our system can detect gestures and analyze stress levels while generating accurate transcriptions in real time. We also designed a clean, interactive frontend where users can record and instantly see their results.

What we learned

We learned how to handle complex multimodal data streams and keep them in sync for real-time feedback. This led us to WebSockets, which no one on our team had used before; with a mentor's guidance, we got them working in the end. Working with EEG data also taught us the basics of hardware and how challenging brain-signal interpretation can be, but also how rewarding it is when it works. We gained experience integrating multiple APIs and machine learning models into a cohesive system.
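The real-time feedback loop we describe ran over WebSockets; the gist is a producer pushing analysis events to a send loop. In this runnable sketch, an in-process asyncio.Queue stands in for the actual socket, and the event shapes are made-up examples:

```python
import asyncio
import json

async def feedback_producer(queue: asyncio.Queue) -> None:
    """Simulate the analysis pipeline emitting feedback events."""
    for event in [
        {"type": "filler", "word": "um", "t": 1.2},
        {"type": "stress", "level": "high", "t": 3.5},
    ]:
        await queue.put(json.dumps(event))
    await queue.put(None)  # sentinel: session over

async def feedback_consumer(queue: asyncio.Queue) -> list:
    """Stand-in for the WebSocket send loop: drain events to the client."""
    sent = []
    while (msg := await queue.get()) is not None:
        sent.append(msg)
    return sent

async def main() -> list:
    queue: asyncio.Queue = asyncio.Queue()
    producer = asyncio.create_task(feedback_producer(queue))
    sent = await feedback_consumer(queue)
    await producer
    return sent

messages = asyncio.run(main())
print(messages)
```

Swapping the queue for a WebSocket connection keeps the same producer/consumer structure; only the transport changes.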

What's next for Orator

We'll see!
