MeetingAI

Inspiration

Time and time again, we see meetings losing their value. Many people find meetings "pointless and boring," and transcribing a meeting by hand is tedious and difficult. Yet seeing and conversing with people is the essence of human existence; without it, we would be nowhere, and our civilization would be reduced to survival of the fittest. To bring people back to the scene, we created MeetingAI.

What it does

Our project takes an audio file as input and immediately returns a transcript that shows who spoke and exactly what they said. It combines speech-to-text with speaker diarization, which makes it useful for meetings.

How we built it

The logic behind our project is written in Python. We use GCP's APIs for speech-to-text and Flask for the server. For diarization, we used spectralcluster as the clustering algorithm and Resemblyzer as the voiceprint library. The front end was built in HTML5, CSS, and JavaScript; it simply lets the user upload a file and returns the transcript.
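
To illustrate the diarization step, here is a minimal sketch of how per-window cluster labels might be collapsed into speaker turns. It assumes the clustering step has already produced one integer label per fixed-length, non-overlapping audio window. The function name `labels_to_segments` and the half-second window length are hypothetical and are not taken from the project's code.

```python
from itertools import groupby


def labels_to_segments(labels, window_seconds):
    """Collapse per-window speaker labels into (speaker, start, end) tuples.

    Assumption: labels[i] is the cluster label of the i-th window, and every
    window is window_seconds long with no overlap between windows.
    """
    segments = []
    position = 0  # index of the first window in the current run
    for speaker, run in groupby(labels):
        run_length = len(list(run))
        start = position * window_seconds
        end = (position + run_length) * window_seconds
        segments.append((speaker, start, end))
        position += run_length
    return segments


if __name__ == "__main__":
    # Hypothetical labels for six half-second windows.
    for speaker, start, end in labels_to_segments([0, 0, 1, 1, 1, 0], 0.5):
        print(f"Speaker {speaker}: {start:.1f}s to {end:.1f}s")
```

Grouping consecutive identical labels keeps each uninterrupted stretch of speech as one segment, so a speaker who talks twice appears as two separate turns.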

Challenges we ran into

At first, we struggled to get transcription and speaker diarization (separating multiple speakers) working together on the same platform. Eventually, we were able to organize the output by who spoke (using a numbered speaker list) and the time interval in which they spoke. We did this by iterating through tuples of the speaker value, the start time, and the end time. This allowed us to split the audio file into parts and then transcribe what each speaker said. Once that piece was in place, the rest of the logic came together nicely, and it was smooth sailing from there.
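
The splitting step described above could look roughly like the sketch below. It uses only the standard-library `wave` module and assumes an uncompressed WAV input with times given in seconds. The function name `split_wav` and the output naming scheme are hypothetical, and the project itself may have done this differently (it also lists pydub).

```python
import wave


def split_wav(path, segments, out_prefix):
    """Write one WAV file per (speaker, start, end) tuple; times in seconds.

    Returns a list of (speaker, output_path) pairs in the original order.
    """
    outputs = []
    with wave.open(path, "rb") as source:
        params = source.getparams()
        rate = source.getframerate()
        for index, (speaker, start, end) in enumerate(segments):
            first_frame = int(start * rate)
            frame_count = int(end * rate) - first_frame
            # Jump to the start of this speaker turn and read only its frames.
            source.setpos(first_frame)
            frames = source.readframes(frame_count)
            out_path = f"{out_prefix}_{index}_speaker{speaker}.wav"
            with wave.open(out_path, "wb") as target:
                # Reuse the source format; the frame count in the header is
                # corrected automatically when the frames are written.
                target.setparams(params)
                target.writeframes(frames)
            outputs.append((speaker, out_path))
    return outputs
```

Each output file can then be sent to the speech-to-text service on its own, and the resulting text can be labeled with the speaker number from the same tuple.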

Accomplishments that we're proud of

We are super proud of the machine learning work we were able to pull off, and we learned a ton about clustering, speech-to-text APIs, and voiceprints.

What we learned

We learned a ton about machine learning, as well as the biology behind voiceprint techniques and how to apply it to this project.

What's next for MeetingAI

Next we hope to launch on the app store. Be on the lookout for that!

Built With

css3
flask
gcp
html5
javascript
pydub
python
resemblyzer
spectralcluster
wave

Submitted to

HackTheLib

Created by

I worked on the speaker diarization, clustering speakers based on voiceprint analysis. I also worked on obtaining files from the server and sending the text response!

Shrey Jain
Student
I worked on integrating the Google Cloud Speech-to-Text API with the uploaded audio file so that we could return the transcription. I also worked on the HTML, CSS, and JavaScript for the website and on syncing the website's file upload section with the Flask server.

Shashank Vemuri

Updates

Shrey Jain started this project — Jul 10, 2020 11:27 AM EDT
