Project Description: HarmonAI

The Problem: In fast-paced group discussions—whether at hackathons, meetings, or collaborative brainstorming sessions—brilliant ideas are often lost in the flow of conversation. Phrases like “That’s good, let’s come back to it” usually lead nowhere. Add language barriers and miscommunication, and the result is lost productivity and frustration. We built HarmonAI to capture those fleeting moments of insight and ensure every voice is heard and remembered.

What It Does: HarmonAI is an intelligent, voice-activated discussion assistant that acts as a moderator and note-taker in real-time conversations.

Key features include:

  • Live transcription records what is being said
  • Voice commands like “Harmon, summarize the last few points” or “Record this in the notes” (a routing sketch follows this list)
  • Sentiment analysis that detects and flags aggressive speech
  • Real-time language translation to bridge communication gaps across languages
  • Firebase integration for storing transcripts and notes
  • Summarization button to capture all key points
  • Save previous discussions so they can be revisited easily

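As a rough illustration of how a spoken command could be routed once the wake word fires, here is a minimal Python sketch. The command phrases come from the feature list above; the handler functions and the keyword matching are hypothetical simplifications for illustration, not HarmonAI's actual command logic.

```python
# Minimal sketch: route a transcribed utterance, heard after the "Harmon"
# wake word, to a handler. The handlers below are hypothetical placeholders.

def summarize_recent(transcript_lines):
    # Placeholder: in HarmonAI this step would prompt the LLM with recent context.
    return "Summary of: " + " / ".join(transcript_lines[-3:])

def save_note(utterance, notes):
    # Placeholder: in HarmonAI this step would write the note to Firebase.
    notes.append(utterance)
    return "Noted."

def route_command(utterance, transcript_lines, notes):
    """Very small keyword router for post-wake-word speech."""
    text = utterance.lower()
    if "summarize" in text:
        return summarize_recent(transcript_lines)
    if "record" in text or "note" in text:
        return save_note(utterance, notes)
    return "Sorry, I didn't catch a command."

# Example usage
history = ["We should ship Friday", "Budget is tight", "Let's demo the translation"]
notes = []
print(route_command("Harmon, summarize the last few points", history, notes))
print(route_command("Record this in the notes", history, notes))
```
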
Just say “Harmon” to interact with the AI moderator—one that listens attentively, never forgets, and always contributes.

How We Built It:

  • Backend: Python/Flask server handling audio streaming, Firebase updates, and command logic (a minimal backend sketch follows this list)
  • Parallel audio processing of segmented audio clips
  • Audio Intelligence: AssemblyAI API for real-time transcription and speaker recognition
  • NLP and semantic decomposition to detect negative sentiment
  • Database: Firebase Realtime Database for instant data syncing and storage
  • Wake Word Detection: Porcupine for responsive command activation (sketched below)
  • AI Inference: Cerebras AI for fast, on-demand natural language processing
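
To make the backend and parallel-processing bullets concrete, here is a minimal Flask sketch that accepts a segmented audio clip and hands it to a thread pool. The route name, query parameter, and `process_clip` stub are assumptions for illustration, not the project's actual API.

```python
# Minimal sketch of a Flask endpoint that accepts segmented audio clips and
# processes them in parallel. Route/field names and process_clip are assumptions.
from concurrent.futures import ThreadPoolExecutor
from flask import Flask, request, jsonify

app = Flask(__name__)
executor = ThreadPoolExecutor(max_workers=4)  # parallel clip processing

def process_clip(clip_bytes, clip_id):
    # Placeholder: transcribe the clip, run sentiment checks, push to Firebase.
    print(f"processing clip {clip_id} ({len(clip_bytes)} bytes)")

@app.route("/audio-segment", methods=["POST"])
def audio_segment():
    clip_id = request.args.get("clip_id", "unknown")
    clip_bytes = request.get_data()                     # raw audio bytes from the client
    executor.submit(process_clip, clip_bytes, clip_id)  # don't block the request
    return jsonify({"status": "queued", "clip_id": clip_id}), 202

if __name__ == "__main__":
    app.run(port=5000)
```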

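The wake-word step uses Porcupine's Python SDK (`pvporcupine`). The sketch below is a minimal detection loop; it uses a built-in keyword as a stand-in, since a custom "Harmon" keyword would need a `.ppn` model trained in the Picovoice Console, and the access key is a placeholder.

```python
# Minimal wake-word loop with Porcupine. A custom "Harmon" keyword would be
# loaded via keyword_paths=["harmon.ppn"]; a built-in keyword stands in here.
import pvporcupine
from pvrecorder import PvRecorder

porcupine = pvporcupine.create(
    access_key="YOUR_PICOVOICE_ACCESS_KEY",   # placeholder
    keywords=["porcupine"],                   # stand-in for a custom "Harmon" model
)
recorder = PvRecorder(frame_length=porcupine.frame_length, device_index=-1)
recorder.start()

try:
    while True:
        pcm = recorder.read()                 # one frame of 16 kHz, 16-bit audio
        if porcupine.process(pcm) >= 0:       # >= 0 means a keyword was detected
            print("Wake word detected - start listening for a command")
finally:
    recorder.stop()
    recorder.delete()
    porcupine.delete()
```
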
Architecture Flow:

  • Audio is captured and streamed to the backend
  • AssemblyAI processes it into text, tagged by speaker (diarization)
  • Transcripts are saved to Firebase and retained for AI contextualization (a transcription sketch follows this list)
  • Wake words trigger command detection and Cerebras AI prompting
  • Negative sentiment is detected through Cerebras AI (a sentiment sketch follows as well)
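
As a sketch of the transcription and storage steps above, the snippet below transcribes a saved clip with AssemblyAI speaker labels and pushes each utterance to a Firebase Realtime Database node. The file name, database URL, and node path are placeholders, and the real backend streams segmented clips rather than reading a finished file.

```python
# Sketch: transcribe a segmented clip with speaker labels, then push the
# utterances to Firebase. File name, DB URL, and node path are assumptions.
import assemblyai as aai
import firebase_admin
from firebase_admin import credentials, db

aai.settings.api_key = "YOUR_ASSEMBLYAI_KEY"           # placeholder
cred = credentials.Certificate("serviceAccount.json")  # placeholder service account
firebase_admin.initialize_app(cred, {
    "databaseURL": "https://your-project-default-rtdb.firebaseio.com",  # placeholder
})

config = aai.TranscriptionConfig(speaker_labels=True)  # enable diarization
transcript = aai.Transcriber().transcribe("segment_001.wav", config=config)

ref = db.reference("transcripts/session_001")
for utt in transcript.utterances:                      # one entry per speaker turn
    ref.push({"speaker": utt.speaker, "text": utt.text, "start_ms": utt.start})
```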

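Sentiment flagging goes through a prompt to Cerebras AI. This sketch assumes the Cerebras Cloud Python SDK with its OpenAI-style chat interface; the model name and prompt wording are placeholders rather than what HarmonAI actually uses.

```python
# Sketch: ask a Cerebras-hosted model whether an utterance sounds aggressive.
# The SDK usage, model name, and prompt are assumptions/placeholders.
import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(api_key=os.environ.get("CEREBRAS_API_KEY"))

def is_aggressive(utterance: str) -> bool:
    response = client.chat.completions.create(
        model="llama3.1-8b",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Answer YES or NO: is the following utterance aggressive or hostile?"},
            {"role": "user", "content": utterance},
        ],
    )
    return response.choices[0].message.content.strip().upper().startswith("YES")

if __name__ == "__main__":
    print(is_aggressive("That idea is terrible and so are you."))
```
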
What We Learned

  • Real-time audio requires careful handling of buffering, latency, and encoding
  • Firebase demands thoughtful data modeling for real-time sync
  • Wake word tuning is delicate—the balance between false triggers and missed cues is key

Future Improvements

  • Detect key phrases such as "Let's revisit this" to surface points that would otherwise be easily forgotten
  • Support cross-device capabilities to allow concurrent remote discussions
  • Improve the live translation feature to foster meaningful bilingual discussions

HarmonAI is more than just a tool—it’s a glimpse into the future of conversations, where AI actively helps us think, collaborate, and remember better.

  1. Source Code: https://github.com/YGao2005/fullyhacksaudio

  2. Demo Video: (input video)

  3. Team Information (Names + Roles):
     Stefan Patrascu: Backend developer, Pitch specialist, Supply manager
     Yang Gao: Team Lead, Full-stack developer, UI developer, Him
     Brian Kim: Backend developer, AI Engineer

  4. Track: New Frontier and Accessibility
