Inspiration

Due to the unique constraints of ASL, it can be hard to find interactive applications that teach the ASL alphabet in an interesting and fun way. Our team decided to leverage computer vision to let users learn the 26-letter ASL alphabet through a set of interactive games.

What it does

Using a computer with a webcam, users sign ASL alphabet letters and the application detects when a letter is signed correctly. We implemented a variety of games to practice these letters, such as a blitz mode where users have limited time to sign each letter, and a word-spelling mode where they spell out words one letter at a time. For users who don't yet know the ASL alphabet, no worries! Our practice mode lets users work through each of the 26 letters at their own pace, with reference pictures to help.
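The word-spelling mode boils down to a simple loop: watch the stream of detected letters and advance only when the next letter of the target word appears. Here is a minimal, hypothetical sketch of that loop, where `detections` stands in for the real webcam-plus-classifier pipeline:

```python
# Hypothetical sketch of the word-spelling game loop: the player signs a
# target word one letter at a time, advancing only on a correct detection.
# The `detections` iterable stands in for the live detection pipeline.

def spell_word(target, detections):
    """Consume a stream of detected letters; return how many detections it
    took to finish spelling `target`, or None if the stream ran out first."""
    index = 0  # position of the next letter we are waiting for
    for step, letter in enumerate(detections, start=1):
        if letter == target[index]:
            index += 1                # correct sign: advance to the next letter
            if index == len(target):
                return step           # word completed
    return None                       # stream ended before the word was spelled

# Example: spelling "CAB" with a couple of wrong signs mixed in.
steps = spell_word("CAB", iter("XCAAB"))  # finishes on the 5th detection
```

The same loop generalizes to blitz mode by wrapping the iteration in a timer.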

How we built it

Our four-person team split into two groups: one handled the front end and layout of the website, while the other handled the game logic and letter detection through computer vision.

Frontend:

We began by building demo pages so the backend team could test gesture classification in real-world environments. That allowed us to validate our ML pipeline early and catch integration issues before moving too far into design. We used React as our frontend framework, structuring everything with a component-based architecture and clean routing to keep pages modular and maintainable. For styling, we relied heavily on Tailwind CSS and made thoughtful use of reusable UI components to keep development fast and consistent.

The biggest challenges surfaced when we started implementing our wireframes and gamification concepts. Timing was particularly tricky. We had to ensure game logic aligned precisely with camera load times and gesture detection, which required careful state management and iteration. Animations were another hurdle, since neither of us had prior experience building polished frontend motion effects. There was a learning curve, but through experimentation and refinement, we were able to make the interactions feel smooth and intentional.

Overall, we maintained a steady pace, collaborated effectively, and built something we are genuinely proud of. It is not just functional; it is fun for us to use as well.

Backend:

The backend team began by evaluating technologies that could easily detect hands in a camera feed. We found MediaPipe was the simplest solution, reliably detecting 21 reference points per hand with three-dimensional coordinates for each. From there, we implemented rule-based logic that infers hand gestures from relative finger positions.
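To give a flavor of the rule-based pass, here is a simplified sketch. It takes MediaPipe's 21 hand landmarks as (x, y, z) tuples (index 0 is the wrist, 8/12/16/20 are the four fingertips, 6/10/14/18 the corresponding middle PIP joints) and decides which fingers are extended; image coordinates grow downward, so an extended fingertip sits above (smaller y than) its middle joint. The letter mapping here is a toy illustration, not our exact production rules:

```python
# Toy rule-based gesture inference over MediaPipe-style hand landmarks.
# Landmark indices follow MediaPipe Hands: fingertip/PIP pairs per finger.
FINGERS = {"index": (8, 6), "middle": (12, 10), "ring": (16, 14), "pinky": (20, 18)}

def extended_fingers(landmarks):
    """Return the set of finger names whose tip is above its PIP joint
    (y grows downward in image coordinates)."""
    return {name for name, (tip, pip) in FINGERS.items()
            if landmarks[tip][1] < landmarks[pip][1]}

def guess_letter(landmarks):
    """Illustrative mapping from finger states to a couple of ASL letters."""
    up = extended_fingers(landmarks)
    if not up:
        return "A"   # fist-like shapes read as A in this toy mapping
    if up == {"index", "middle"}:
        return "U"
    return None      # defer everything else to the neural network
```

Real letters need more signal than this (thumb position, finger spread, palm orientation), which is exactly why rules alone produced false positives.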

This implementation worked but produced a lot of false positives, so we found a dataset online that pairs 21 three-dimensional hand coordinates with their corresponding ASL letter. We used TensorFlow to train a neural network on this data and integrated it into our application. From there, any letter the neural network couldn't reliably detect would fall back onto our now more refined rule-based logic.
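In outline, the hybrid detector works like this: run the neural network first, trust it only above a confidence threshold, and otherwise fall back to the rules. The sketch below uses stand-in callables for the TensorFlow model and the rule-based pass, and the threshold value is illustrative rather than our tuned one:

```python
# Sketch of the hybrid NN + rules dispatcher. `model_predict` and
# `rule_based_guess` are stand-ins for the real TensorFlow model and the
# rule-based logic; the threshold is an illustrative value.
CONFIDENCE_THRESHOLD = 0.85

def classify(landmarks, model_predict, rule_based_guess):
    """Return (letter, source): the NN's answer when it is confident,
    otherwise the rule-based guess when one exists."""
    letter, confidence = model_predict(landmarks)
    if confidence >= CONFIDENCE_THRESHOLD:
        return letter, "model"
    fallback = rule_based_guess(landmarks)
    if fallback is not None:
        return fallback, "rules"
    return letter, "model-low-confidence"

# Example with stubbed components:
confident = lambda lm: ("B", 0.97)
unsure = lambda lm: ("B", 0.40)
rules = lambda lm: "A"
```

Keeping the two paths behind one function made it easy to tune the threshold per letter later on.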

Challenges we ran into

One challenge we ran into was finding a balance between our neural network and our rule-based implementations. It became apparent that the dataset our model was trained on was biased towards right-handed gestures. For certain letters, this meant we needed to think more carefully about our rule-based logic to make up for shortcomings in the neural network.
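One common way to compensate for a right-hand-biased dataset (a technique worth noting here, not necessarily the one we shipped) is to mirror left-hand landmarks across the vertical axis before classification, so the model only ever sees right-hand-shaped input. Assuming MediaPipe-style normalized coordinates in [0, 1]:

```python
# Mirror left-hand landmarks so a right-hand-trained model can score them.
# Assumes normalized (x, y, z) coordinates with x in [0, 1], as MediaPipe
# provides; reflecting across x = 0.5 turns a left hand into a right hand.

def mirror_landmarks(landmarks):
    """Reflect each (x, y, z) landmark across the vertical axis x = 0.5."""
    return [(1.0 - x, y, z) for x, y, z in landmarks]
```

MediaPipe also reports handedness per detection, which makes it cheap to decide when to apply the flip.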

Accomplishments that we're proud of

We're especially proud of the simple, accessible, and fun atmosphere our application conveys. The gamification of our application was a deliberate choice, in line with the popularity of other language-learning apps like Duolingo.

What we learned

We learned a lot about computer vision by completing this hackathon! It was a field we found especially interesting but had never dived into. Getting to do so for this project was fun and showed us the potential of computer vision technologies.

What's next for FluentHands

The next logical step for our application is detecting simple words, and then forming simple sentences from those words. One unique challenge of ASL detection through computer vision is the language's use of facial expressions. Despite this, we're ready to meet these challenges head-on and make our application better and better!
