Inspiration
This project was inspired by our group's desire to improve not only our presentation skills but also our fluency in languages other than English. All of our group members were raised in bilingual households, and some of us learned a third language as well. For many of our presentations, especially those given in a foreign language, we searched for a tool that could accurately understand us and provide feedback, but we never found one. Language Learner was created to solve this problem, using OpenAI's Whisper model for language processing.
What it does
Language Learner listens to a presentation and provides feedback on the speaker's delivery. The user first inputs their presentation text, then records an audio sample of themselves presenting it. OpenAI's Whisper model transcribes the audio, and the transcription is compared against the input text to flag any mistakes. Feedback is given as the percentage of words spoken correctly, along with some presentation tips. Language Learner can recognize 99 languages with varying levels of comprehension, including accents and dialects, and mild background noise does not affect the program.
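A rough sketch of how that word-by-word comparison could be scored (the function name and the position-by-position matching are illustrative assumptions, not the project's exact logic):

```python
def word_accuracy(reference_text: str, transcribed_text: str) -> float:
    """Percentage of script words matched by the transcription, position by position."""
    ref_words = reference_text.lower().split()
    hyp_words = transcribed_text.lower().split()
    # Count positions where the spoken word matches the script word.
    matches = sum(1 for r, h in zip(ref_words, hyp_words) if r == h)
    return 100.0 * matches / max(len(ref_words), 1)

print(word_accuracy("bonjour tout le monde", "bonjour tous le monde"))  # 75.0
```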
How we built it
Language Learner was built in Python using OpenAI's Whisper model and Gradio. Gradio was used to build the user interface, and Whisper provided the speech recognition. Python was the primary programming language and was used for all of the backend logic.
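A minimal sketch of how Whisper and Gradio can be wired together for this kind of app, assuming the open-source `whisper` and `gradio` packages (the component choices and function names are our assumptions, not the project's actual code):

```python
import gradio as gr
import whisper

# Load a small multilingual Whisper checkpoint once at startup.
model = whisper.load_model("base")

def grade(reference_text, audio_path):
    # Transcribe the recording, then compare it word by word to the script.
    transcript = model.transcribe(audio_path)["text"]
    ref = reference_text.lower().split()
    hyp = transcript.lower().split()
    correct = sum(1 for r, h in zip(ref, hyp) if r == h)
    score = 100.0 * correct / max(len(ref), 1)
    return f"{score:.1f}% of words matched your script."

demo = gr.Interface(
    fn=grade,
    inputs=[gr.Textbox(label="Presentation text"), gr.Audio(type="filepath")],
    outputs=gr.Textbox(label="Feedback"),
)
demo.launch()
```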
Challenges we ran into
Since Whisper is a relatively new model, released by OpenAI in September 2022, finding documentation and help online was difficult; not much information exists yet for this model or for many OpenAI APIs. We relied on ChatGPT for much of our troubleshooting instead.
Accomplishments that we're proud of
This was our first time using artificial intelligence in an application. We had always been intrigued by machine learning and AI, but we never had a chance to build something with those concepts. Applying AI was difficult at first because we were inexperienced and there was little information online about the APIs we were using.
What we learned
We learned how to apply AI models and OpenAI APIs in a web application, and how to use Gradio to build a user interface.
What's next for Language Learner
Since we used Gradio to build the user interface quickly, a natural next step is a full-stack version built with other frameworks and libraries to make the app look more professional. We can improve the application by rebuilding it with HTML, CSS, JavaScript, and Flask instead of Gradio, roughly as sketched below.
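As one hedged illustration of that direction, the same flow could be exposed as a Flask endpoint (the route name, form fields, and scoring here are assumptions for illustration, not a committed design):

```python
from flask import Flask, request, jsonify
import whisper

app = Flask(__name__)
model = whisper.load_model("base")

@app.route("/grade", methods=["POST"])
def grade():
    # Expect the script as a form field and the recording as a file upload.
    reference = request.form["script"]
    audio_file = request.files["audio"]
    audio_file.save("upload.wav")

    transcript = model.transcribe("upload.wav")["text"]
    ref, hyp = reference.lower().split(), transcript.lower().split()
    correct = sum(1 for r, h in zip(ref, hyp) if r == h)
    return jsonify({
        "accuracy": 100.0 * correct / max(len(ref), 1),
        "transcript": transcript,
    })

if __name__ == "__main__":
    app.run(debug=True)
```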