Inspiration
Time and time again, we see meetings losing their standing. Many people find them "pointless and boring," and transcribing what was said in a meeting is tedious, difficult work. Yet however tedious meetings can become, seeing and conversing with people is the essence of human existence; without it, we would be nowhere, and our civilization would be reduced to survival of the fittest. To bring people back to the scene, we created MeetingAI.
What it does
Our project takes an audio file as input and immediately returns a transcript showing who spoke and exactly what they said. It combines speech-to-text with speaker diarization, which makes it useful for meetings.
How we built it
The logic behind our project is written in Python. We used GCP's APIs, with Flask for the server. In addition, we used spectralcluster for the clustering algorithm and Resemblyzer as the voiceprint library. The front end was built in HTML5, CSS, and JavaScript, and simply lets the user upload a file and get a transcript back.
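As a rough illustration of how clustering output can feed the rest of the pipeline, here is a minimal sketch that collapses per-window speaker labels into speaker turns. It assumes the clustering step yields one integer label per fixed-length audio window; the function name `labels_to_segments` and its parameters are hypothetical and not taken from the project's code.

```python
from itertools import groupby


def labels_to_segments(labels, window_starts, window_len):
    """Collapse per-window speaker labels into (speaker, start, end) tuples.

    Assumption: labels[i] is the cluster id for the window that begins at
    window_starts[i] seconds, and every window lasts window_len seconds.
    """
    if len(labels) != len(window_starts):
        raise ValueError("labels and window_starts must have the same length")

    segments = []
    # groupby merges consecutive windows that share the same speaker label.
    for speaker, run in groupby(zip(labels, window_starts), key=lambda p: p[0]):
        starts = [start for _, start in run]
        segments.append((speaker, starts[0], starts[-1] + window_len))
    return segments


if __name__ == "__main__":
    demo = labels_to_segments([0, 0, 1, 1, 1, 0],
                              [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
                              1.0)
    print(demo)  # [(0, 0.0, 2.0), (1, 2.0, 5.0), (0, 5.0, 6.0)]
```

Merging consecutive windows keeps the number of audio pieces small, so each speaker turn can be handled as a single unit.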
Challenges we ran into
At first, we struggled to get transcription and speaker diarization (separating multiple speakers) working on the same platform. Eventually, we were able to organize the output by who spoke (using a numbered speaker list) and the time interval in which they spoke. We did this by iterating through tuples of the speaker value, the start time, and the end time. This allowed us to split the audio file into parts and then transcribe what each speaker said. Once that piece was in place, the rest of the logic came together very nicely, and it was smooth sailing from there.
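The tuple-driven splitting described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the audio is modeled as a plain sequence of samples, times are in seconds, speaker ids start at 0, and `transcribe` is a hypothetical stand-in for whatever speech-to-text call is used.

```python
def split_by_segments(samples, sample_rate, segments):
    """Yield (speaker, chunk) for each (speaker, start_sec, end_sec) tuple."""
    for speaker, start, end in segments:
        lo = int(round(start * sample_rate))
        hi = int(round(end * sample_rate))
        yield speaker, samples[lo:hi]


def build_transcript(samples, sample_rate, segments, transcribe):
    """Transcribe each speaker turn and label it with a numbered speaker.

    `transcribe` is a hypothetical callable that maps an audio chunk to text.
    """
    lines = []
    for speaker, chunk in split_by_segments(samples, sample_rate, segments):
        text = transcribe(chunk)
        if text:  # skip turns where nothing was recognized
            lines.append(f"Speaker {speaker + 1}: {text}")
    return "\n".join(lines)


if __name__ == "__main__":
    samples = list(range(12))                      # 6 seconds at 2 samples/sec
    segments = [(0, 0.0, 2.0), (1, 2.0, 5.0), (0, 5.0, 6.0)]
    fake_transcribe = lambda chunk: f"{len(chunk)} samples"  # placeholder
    print(build_transcript(samples, 2, segments, fake_transcribe))
```

Passing the transcriber in as a parameter keeps the splitting logic testable without any network call; in a real pipeline, the chunk would be an audio segment sent to the speech-to-text service.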
Accomplishments that we're proud of
We are super proud of the machine learning side of the project, and we learned a ton about clustering, speech-to-text APIs, and voiceprints.
What we learned
We learned a ton about machine learning, and we learned the biology behind the voiceprint techniques as well as how to apply them to this project.
What's next for MeetingAI
Next we hope to launch on the app store. Be on the lookout for that!
Built With
- css3
- flask
- gcp
- html5
- javascript
- pydub
- python
- resemblyzer
- spectralcluster
- wave

