Inspiration
Every time you talk to a customer service agent, you hear that the call is being recorded for quality and training purposes. But does the quality and inspection team really have the time to go through hundreds of hours of recordings?
What it does
It takes an audio file (.mp3, .wav) as input and converts the speech to text. It also identifies the speakers, labeling them Speaker A and Speaker B, prints a summary of the conversation, and runs sentiment analysis on the conversation to show the customer's satisfaction after the call. The user can also download the transcript as a .txt file.
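The speaker-labeled transcript described above can be sketched as a small formatting step. This is a hypothetical illustration, not the app's actual code: the utterance structure (speaker id, text, sentiment) mirrors what a diarization-plus-sentiment API response might look like, and the field names are assumptions.

```python
# Hypothetical sketch: render diarized utterances as the labeled
# transcript the app displays and offers as a .txt download.
def format_transcript(utterances):
    """Render utterances as 'Speaker X: ...' lines with sentiment tags."""
    lines = []
    for u in utterances:
        lines.append(f"Speaker {u['speaker']}: {u['text']} [{u['sentiment']}]")
    return "\n".join(lines)

utterances = [
    {"speaker": "A", "text": "Hi, how can I help you today?", "sentiment": "NEUTRAL"},
    {"speaker": "B", "text": "My order arrived broken.", "sentiment": "NEGATIVE"},
    {"speaker": "B", "text": "Thanks, the refund works for me.", "sentiment": "POSITIVE"},
]
print(format_transcript(utterances))
```

The resulting string can be handed directly to a Streamlit download button as the .txt transcript.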
How we built it
We built the web application using AssemblyAI's API, which runs OpenAI's Whisper model under the hood but adds features like speaker recognition and sentiment analysis, together with Cohere's API for summarization.
Challenges we ran into
Sometimes silly mistakes can take hours to debug; one thing I learned from this is to read the documentation carefully.
Accomplishments that we're proud of
This being my first ever hackathon, I am really proud that I was able to build a web application that can genuinely make this business workflow easier and less painful.
What we learned
We learned a lot, like making two separate API calls, each requesting a different set of features, thereby avoiding the null-object error; and, most importantly, that making a clear plan before diving into development is a necessary step.
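The two-call pattern above can be sketched as building two request payloads, each enabling only one feature, so a missing feature in one response never leaves a null where the other code path expects data. The payload fields here are illustrative assumptions, not the exact AssemblyAI request schema.

```python
# Hedged sketch of splitting one transcription job into two requests,
# each asking for a different feature set. Field names are assumptions.
def build_payloads(audio_url):
    """Return (diarization_payload, sentiment_payload) for two API calls."""
    base = {"audio_url": audio_url}
    diarization_job = {**base, "speaker_labels": True}
    sentiment_job = {**base, "sentiment_analysis": True}
    return diarization_job, sentiment_job

d, s = build_payloads("https://example.com/call.mp3")
print(d)
print(s)
```

Each payload is then submitted as its own API call, and the two responses are merged client-side.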
What's next for Post Call Analysis
There's a bright future for the Post Call Analysis web app:
- Create a table of past jobs with their transcriptions and corresponding documents ready to download.
- Implement a search feature over the transcripts by embedding them into a vector store like FAISS, enabling similarity search.
- Expand file sources to multiple clouds: AWS, Google Cloud, Azure.
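The planned similarity search could work roughly as follows. This toy sketch stands in for the FAISS vector store: the "embeddings" here are simple bag-of-words counts purely for illustration, whereas a real build would use learned embeddings and a FAISS index.

```python
import math

# Toy similarity search over transcript snippets, standing in for the
# planned FAISS-backed version. Bag-of-words vectors are illustrative only.
def embed(text, vocab):
    """Count-based vector over a shared vocabulary (illustration only)."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query, transcripts):
    """Return the transcript most similar to the query."""
    vocab = sorted({w for t in transcripts + [query] for w in t.lower().split()})
    qv = embed(query, vocab)
    scored = [(cosine(qv, embed(t, vocab)), t) for t in transcripts]
    return max(scored)[1]

transcripts = [
    "the customer asked about a refund for a broken item",
    "caller wanted to change the delivery address",
    "agent explained the premium subscription tiers",
]
print(search("refund broken order", transcripts))
```

Swapping the count vectors for real embeddings and the linear scan for a FAISS index keeps the same interface while scaling to thousands of transcripts.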
Built With
- assemblyai
- cohere
- python
- streamlit
