Inspiration
We recently watched a short film called "Feeling Through," whose main character is deaf-blind. Later we watched an interview with the actor, in which he uses sign language over a video call and needs a translator to communicate. Seeing this inspired us to create a program that removes the need for a translator: not everyone can afford one, and translators aren't always readily available.
What it does
Our program takes in a video feed and translates letters signed in ASL into written English. Since the translation isn't always perfect, and to save time signing, the translated text is run through autocorrect, which increases accuracy and allows for shorter signing times. The program can then speak the translated word aloud, mimicking talking.
How We built it
The whole thing was written in Python. OpenCV captures the camera input, and a library built with PyTorch classifies the ASL characters. The text output from the classifier is run through an autocorrect method from the autocorrect library. Two key inputs are available: one to accept the autocorrection and one to keep the original. If the autocorrection is accepted, it replaces the original. The chosen word is then run through gTTS, which produces an audio file, and the file is played using pygame's mixer. The current word is cleared, and the process repeats for the next one.
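The per-word accept/reject step above can be sketched roughly as follows. This is a minimal sketch: `finalize_word` and the dictionary-based `correct` function are hypothetical stand-ins, whereas the real project classifies webcam frames with OpenCV and a PyTorch-based model and uses the autocorrect library's speller.

```python
# Hypothetical sketch of the accept/reject step described above.
def finalize_word(raw_word, correct, accept_correction):
    """Return the word to speak: the autocorrected suggestion if the
    user pressed the 'accept' key, otherwise the letters as signed."""
    suggestion = correct(raw_word)
    return suggestion if accept_correction else raw_word

# Toy stand-in for the autocorrect library's speller.
corrections = {"helo": "hello", "wrld": "world"}
correct = lambda word: corrections.get(word, word)

print(finalize_word("helo", correct, accept_correction=True))   # hello
print(finalize_word("helo", correct, accept_correction=False))  # helo
```

In the real program the chosen word would then be passed to gTTS and played back, as described above.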
Challenges We ran into
Learning the ASL alphabet was quite challenging, and running text-to-speech on the output text was difficult. We had to use Zoom screen share instead of presenting our translated text on a virtual camera to keep the product from lagging. We also had to search for and use the pygame module to play our text-to-speech audio, rather than a system audio player, so that it would be compatible with any machine.
from gtts import gTTS
import pygame

# Convert the most recent word to speech and save it as an MP3
myobj = gTTS(text=Text.split()[-1], lang="en", slow=False)
myobj.save("voice.mp3")

# Play the generated audio through pygame's mixer
pygame.mixer.init()
pygame.mixer.music.load("voice.mp3")
pygame.mixer.music.play()
The frame rate of the user's webcam can also make it difficult to spell long phrases.
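One way to make low frame rates less punishing is to debounce the classifier: commit a letter only after it has been predicted for several consecutive frames, so a single noisy frame can't insert a stray character. This is a sketch of a possible mitigation, not necessarily what we shipped; `LetterDebouncer` is a hypothetical helper.

```python
class LetterDebouncer:
    """Commit a letter only after it is predicted for `hold_frames`
    consecutive frames, filtering out single-frame misclassifications."""

    def __init__(self, hold_frames=5):
        self.hold_frames = hold_frames
        self.last = None
        self.count = 0

    def update(self, prediction):
        """Feed one per-frame prediction; return the letter once it has
        been stable long enough, otherwise None."""
        if prediction == self.last:
            self.count += 1
        else:
            self.last, self.count = prediction, 1
        return prediction if self.count == self.hold_frames else None

deb = LetterDebouncer(hold_frames=3)
stream = ["A", "A", "B", "A", "A", "A", "A"]
committed = [letter for p in stream if (letter := deb.update(p))]
print(committed)  # ['A']
```

Returning the letter only on the exact frame the count reaches the threshold means holding a sign longer doesn't repeat the letter.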
Accomplishments that We're proud of
Learning some ASL was pretty cool, as was being able to turn hand signs into text and then play that text as speech.
What We learned
We had to learn the ASL alphabet to make this. We learned that ASL phrases use motion, so we decided to have our program work character by character, with keyboard controls for the user that expedite the process.
What's next for Silent Speech
Incorporating phrases and increasing accuracy.