Communication with people who have speech impairments and communication disorders has always been an overlooked matter. We seek to bridge that gap by creating an automated ASL-to-spoken-language translator.
It uses computer vision to identify hand motions, matches them against ASL signs, and translates the result into the user's selected language(s), which is then output as speech.
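As a rough illustration of the hand-tracking step, a minimal sketch could look like the following (MediaPipe Hands is assumed here purely for illustration and is not necessarily the exact library we used):

```python
# Minimal sketch: detect hand joints in a webcam feed.
# Assumption: MediaPipe Hands stands in for the joint-detection step described in this write-up.
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

cap = cv2.VideoCapture(0)  # webcam feed of the signer
with mp_hands.Hands(max_num_hands=2, min_detection_confidence=0.5) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV captures BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand in results.multi_hand_landmarks:
                # 21 (x, y, z) joint landmarks per detected hand
                print([(lm.x, lm.y, lm.z) for lm in hand.landmark])
cap.release()
```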
We used computer vision to create heatmaps modeling the various joints in a hand. After capturing a set of screenshots, we trained a YOLO model on them to detect and identify hand motions. We then compared real-time clips of a user's hand motions against the model to translate the sign language into English sentences. Next, we used the Google Translate API and the Gemini API to translate into other languages and output the result as speech. We also planned to store all of the data in MongoDB for later analysis, but unfortunately did not manage to finish that.
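A hedged sketch of the recognition-to-speech chain is shown below. The specific packages (ultralytics, google-cloud-translate, google-generativeai, gTTS), the weight file name, and the helper functions are assumptions made for illustration, not our exact code:

```python
# Sketch: YOLO sign detection -> Gemini sentence assembly -> Google Translate -> spoken output.
# All package choices and file names here are illustrative assumptions.
import cv2
from ultralytics import YOLO
from google.cloud import translate_v2 as translate
from gtts import gTTS
import google.generativeai as genai

detector = YOLO("asl_signs.pt")            # hypothetical custom-trained weights
genai.configure(api_key="YOUR_GEMINI_KEY")
llm = genai.GenerativeModel("gemini-1.5-flash")
translator = translate.Client()            # requires Google Cloud credentials

def signs_from_clip(path: str) -> list[str]:
    """Run the YOLO model over a short clip and collect the detected sign labels."""
    labels = []
    cap = cv2.VideoCapture(path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for box in detector(frame, verbose=False)[0].boxes:
            labels.append(detector.names[int(box.cls)])
    cap.release()
    return labels

def speak_translation(signs: list[str], target_lang: str = "es") -> None:
    # Gemini turns the raw gloss sequence into a fluent English sentence
    sentence = llm.generate_content(
        "Turn these ASL glosses into one natural English sentence: " + " ".join(signs)
    ).text
    # Google Translate API maps the sentence into the user's selected language
    translated = translator.translate(sentence, target_language=target_lang)["translatedText"]
    # gTTS produces the spoken output
    gTTS(translated, lang=target_lang).save("output.mp3")

speak_translation(signs_from_clip("clip.mp4"))
```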
Our major challenge was training the model, which was quite time consuming.
However, we're very proud to have learned a lot of crucial machine learning skills. We're also proud of a very clean webpage and a massive dub in the Tetris tournament.
We learned a lot about model development and training, and how to incorporate them into a full-stack application.
Originally, the software was supposed to be a mobile app, so in the near future we will deploy a mobile version, as it is more efficient and convenient for users. The model can definitely be refined further, and we believe this is very necessary given the lack of quality ASL models in the world. Overall, we want to polish various small details of the project. In the long term, we want to create dedicated hardware products containing just a camera and a speaker, so users don't always have to carry their phones.