Project Description: HarmonAI

The Problem: In fast-paced group discussions—whether at hackathons, meetings, or collaborative brainstorming sessions—brilliant ideas are often lost in the flow of conversation. Phrases like “That’s good, let’s come back to it” usually lead nowhere. Add language barriers and miscommunication, and the result is lost productivity and frustration. We built HarmonAI to capture those fleeting moments of insight and ensure every voice is heard and remembered.

What It Does: HarmonAI is an intelligent, voice-activated discussion assistant that acts as a moderator and note-taker in real-time conversations.

Key features include:

  • Live transcription records what is being said
  • Voice commands like “Harmon, summarize the last few points” or “Record this in the notes” (a routing sketch follows this list)
  • Sentiment analysis that detects and flags aggressive speech
  • Real-time language translation to bridge communication gaps across languages
  • Firebase integration for storing transcripts and notes
  • Summarization button to capture all key points
  • Save previous discussions so they can be revisited easily

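As a rough illustration of how a spoken command could be routed once the wake word fires, here is a minimal Python sketch. The command phrases come from the feature list above; the handler functions and the keyword matching are hypothetical simplifications for illustration, not HarmonAI's actual command logic.

```python
# Minimal sketch: route a transcribed utterance, heard after the "Harmon"
# wake word, to a handler. The handlers below are hypothetical placeholders.

def summarize_recent(transcript_lines):
    # Placeholder: in HarmonAI this step would prompt the LLM with recent context.
    return "Summary of: " + " / ".join(transcript_lines[-3:])

def save_note(utterance, notes):
    # Placeholder: in HarmonAI this step would write the note to Firebase.
    notes.append(utterance)
    return "Noted."

def route_command(utterance, transcript_lines, notes):
    """Very small keyword router for post-wake-word speech."""
    text = utterance.lower()
    if "summarize" in text:
        return summarize_recent(transcript_lines)
    if "record" in text or "note" in text:
        return save_note(utterance, notes)
    return "Sorry, I didn't catch a command."

# Example usage
history = ["We should ship Friday", "Budget is tight", "Let's demo the translation"]
notes = []
print(route_command("Harmon, summarize the last few points", history, notes))
print(route_command("Record this in the notes", history, notes))
```
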
Just say “Harmon” to interact with the AI moderator—one that listens attentively, never forgets, and always contributes.

How We Built It:

  • Backend: Python/Flask server handling audio streaming, Firebase updates, and command logic (a minimal backend sketch follows this list)
  • Parallel audio processing of segmented audio clips
  • Audio Intelligence: AssemblyAI API for real-time transcription and speaker recognition
  • NLP and semantic decomposition to detect negative sentiment
  • Database: Firebase Realtime Database for instant data syncing and storage
  • Wake Word Detection: Porcupine for responsive command activation (sketched below)
  • AI Inference: Cerebras AI for fast, on-demand natural language processing
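
To make the backend and parallel-processing bullets concrete, here is a minimal Flask sketch that accepts a segmented audio clip and hands it to a thread pool. The route name, query parameter, and `process_clip` stub are assumptions for illustration, not the project's actual API.

```python
# Minimal sketch of a Flask endpoint that accepts segmented audio clips and
# processes them in parallel. Route/field names and process_clip are assumptions.
from concurrent.futures import ThreadPoolExecutor
from flask import Flask, request, jsonify

app = Flask(__name__)
executor = ThreadPoolExecutor(max_workers=4)  # parallel clip processing

def process_clip(clip_bytes, clip_id):
    # Placeholder: transcribe the clip, run sentiment checks, push to Firebase.
    print(f"processing clip {clip_id} ({len(clip_bytes)} bytes)")

@app.route("/audio-segment", methods=["POST"])
def audio_segment():
    clip_id = request.args.get("clip_id", "unknown")
    clip_bytes = request.get_data()                     # raw audio bytes from the client
    executor.submit(process_clip, clip_bytes, clip_id)  # don't block the request
    return jsonify({"status": "queued", "clip_id": clip_id}), 202

if __name__ == "__main__":
    app.run(port=5000)
```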

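The wake-word step uses Porcupine's Python SDK (`pvporcupine`). The sketch below is a minimal detection loop; it uses a built-in keyword as a stand-in, since a custom "Harmon" keyword would need a `.ppn` model trained in the Picovoice Console, and the access key is a placeholder.

```python
# Minimal wake-word loop with Porcupine. A custom "Harmon" keyword would be
# loaded via keyword_paths=["harmon.ppn"]; a built-in keyword stands in here.
import pvporcupine
from pvrecorder import PvRecorder

porcupine = pvporcupine.create(
    access_key="YOUR_PICOVOICE_ACCESS_KEY",   # placeholder
    keywords=["porcupine"],                   # stand-in for a custom "Harmon" model
)
recorder = PvRecorder(frame_length=porcupine.frame_length, device_index=-1)
recorder.start()

try:
    while True:
        pcm = recorder.read()                 # one frame of 16 kHz, 16-bit audio
        if porcupine.process(pcm) >= 0:       # >= 0 means a keyword was detected
            print("Wake word detected - start listening for a command")
finally:
    recorder.stop()
    recorder.delete()
    porcupine.delete()
```
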
Architecture Flow:

  • Audio is captured and streamed to the backend
  • AssemblyAI processes it into text, tagged by speaker (diarization)
  • Transcripts are saved to Firebase and retained for AI contextualization (a transcription sketch follows this list)
  • Wake words trigger command detection and Cerebras AI prompting
  • Negative sentiment is detected through Cerebras AI (a sentiment sketch follows as well)
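
As a sketch of the transcription and storage steps above, the snippet below transcribes a saved clip with AssemblyAI speaker labels and pushes each utterance to a Firebase Realtime Database node. The file name, database URL, and node path are placeholders, and the real backend streams segmented clips rather than reading a finished file.

```python
# Sketch: transcribe a segmented clip with speaker labels, then push the
# utterances to Firebase. File name, DB URL, and node path are assumptions.
import assemblyai as aai
import firebase_admin
from firebase_admin import credentials, db

aai.settings.api_key = "YOUR_ASSEMBLYAI_KEY"           # placeholder
cred = credentials.Certificate("serviceAccount.json")  # placeholder service account
firebase_admin.initialize_app(cred, {
    "databaseURL": "https://your-project-default-rtdb.firebaseio.com",  # placeholder
})

config = aai.TranscriptionConfig(speaker_labels=True)  # enable diarization
transcript = aai.Transcriber().transcribe("segment_001.wav", config=config)

ref = db.reference("transcripts/session_001")
for utt in transcript.utterances:                      # one entry per speaker turn
    ref.push({"speaker": utt.speaker, "text": utt.text, "start_ms": utt.start})
```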

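Sentiment flagging goes through a prompt to Cerebras AI. This sketch assumes the Cerebras Cloud Python SDK with its OpenAI-style chat interface; the model name and prompt wording are placeholders rather than what HarmonAI actually uses.

```python
# Sketch: ask a Cerebras-hosted model whether an utterance sounds aggressive.
# The SDK usage, model name, and prompt are assumptions/placeholders.
import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(api_key=os.environ.get("CEREBRAS_API_KEY"))

def is_aggressive(utterance: str) -> bool:
    response = client.chat.completions.create(
        model="llama3.1-8b",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Answer YES or NO: is the following utterance aggressive or hostile?"},
            {"role": "user", "content": utterance},
        ],
    )
    return response.choices[0].message.content.strip().upper().startswith("YES")

if __name__ == "__main__":
    print(is_aggressive("That idea is terrible and so are you."))
```
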
What We Learned

  • Real-time audio requires careful handling of buffering, latency, and encoding
  • Firebase demands thoughtful data modeling for real-time sync
  • Wake word tuning is delicate—the balance between false triggers and missed cues is key

Future Improvements

  • Detect key phrases such as "Let's revisit this" to surface points that would otherwise be easily forgotten
  • Support cross-device capabilities to allow concurrent remote discussions
  • Improve the live translation feature to foster meaningful bilingual discussions

HarmonAI is more than just a tool—it’s a glimpse into the future of conversations, where AI actively helps us think, collaborate, and remember better.

  1. Source Code: https://github.com/YGao2005/fullyhacksaudio

  2. Demo Video: (input video)

  3. Team Information (Names + Roles):
     Stefan Patrascu: Backend developer, Pitch specialist, Supply manager
     Yang Gao: Team Lead, Full-stack developer, UI developer, Him
     Brian Kim: Backend developer, AI Engineer

  4. Track: New Frontier and Accessibility
