Inspiration

Wayne has always been a strong believer in ASL and in making Zoom meetings accessible. ViSign serves primarily to bridge communication gaps between the deaf and hearing communities: by converting ASL, a visual language, into subtitles, it makes interactions more accessible and inclusive, allowing deaf or hard-of-hearing individuals to communicate easily with people who do not understand ASL.

What it does

ViSign detects sign language gestures and converts them into English sentences, powered by Google's Gemini.

How we built it

We built ViSign's frontend using React and Next.js, which gave us a responsive and interactive interface, and styled it with Tailwind CSS to keep the application visually consistent. On the backend, Flask handles all API requests, and Python integrates our machine learning models and handles data processing. OpenCV processes the video input and improves our ability to detect ASL signs. Gemini generates sentences based on the words output by our models; we then pass those sentences to OpenAI's API to generate text-to-speech audio as an MP3 and stream it in real time. For hosting and real-time data processing, we leveraged Amazon EC2.
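To illustrate the sign-to-sentence step, here is a minimal Python sketch of how recognized words can be buffered and turned into a constrained prompt for Gemini. All names, the deduplication rule, and the prompt wording are illustrative assumptions, not ViSign's actual code:

```python
# Illustrative sketch of the sign-to-sentence step (hypothetical names).
# The recognizer emits raw word labels ("HELP", "ME", ...); we collapse
# consecutive duplicate detections and build a tightly constrained prompt
# so the language model returns exactly one sentence and nothing else.

def dedupe_words(words):
    """Collapse consecutive duplicate detections from the model output."""
    out = []
    for w in words:
        if not out or out[-1] != w:
            out.append(w)
    return out

def build_prompt(words):
    """Build a constrained prompt for the sentence-generation model."""
    joined = " ".join(dedupe_words(words))
    return (
        "Convert the following ASL word sequence into one natural English "
        "sentence. Return only the sentence, with no commentary.\n"
        f"Words: {joined}"
    )

print(build_prompt(["HELP", "HELP", "ME", "PLEASE"]))
```

The resulting string would then be sent to the Gemini API, and the returned sentence forwarded to the text-to-speech step.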

Challenges we ran into

One of the major challenges we faced was ensuring accurate translation of ASL into English. Early in our testing, the application would misinterpret signs; for example, the sign for "help" was sometimes recognized as "elephant" or "cow". Another prominent challenge was integrating Google's Gemini: finding the right prompt was one of the most difficult parts, because Gemini would occasionally return random, unrelated sentences.
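One common way to suppress this kind of one-off misclassification is to smooth per-frame predictions with a majority vote over a short sliding window. The sketch below uses hypothetical names and an illustrative threshold; it shows the general technique, not ViSign's actual implementation:

```python
from collections import Counter

def stabilize(window):
    """Majority vote over a sliding window of per-frame predictions.

    A label is emitted only if it dominates the window, so a stray
    "elephant" among several "help" frames is discarded instead of
    reaching the subtitle stream.
    """
    if not window:
        return None
    label, count = Counter(window).most_common(1)[0]
    # 0.6 is an illustrative threshold; it would need tuning on real footage.
    return label if count >= 0.6 * len(window) else None

print(stabilize(["help", "help", "elephant", "help", "help"]))  # -> help
```

With a window of five frames, a single misread frame no longer changes the emitted word, at the cost of a few frames of extra latency.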

Accomplishments that we're proud of

We are particularly proud of ViSign's ability to provide real-time translation with high accuracy. ViSign successfully facilitates online meetings, enabling deaf and hard-of-hearing participants to engage fully in conversations that would otherwise be inaccessible.

What we learned

Throughout the development of ViSign, we learned a great deal about the complexities of ASL. We also gained insight into the technical challenges of real-time video processing and of integrating multiple APIs into a seamless user experience.

What's next for ViSign

Looking forward, we aim to expand ViSign's capabilities to serve a broader community, adding support for more languages and dialects to make our tool accessible globally.
