Inspiration

We wanted to make sign language practice accessible and feedback-driven. Many learners rely on videos or an instructor; real-time, objective feedback is hard to get at scale. We built Gesturify to let anyone practice American Sign Language (ASL) with instant scoring and actionable tips using just a webcam and a lightweight backend.

What it does

Gesturify captures hand landmarks from a webcam, sends them to a FastAPI scoring service, and returns a numeric score, a pass/fail result, and short tips to help the learner improve their sign. The frontend (Vite + React) runs in the browser; the backend (FastAPI) exposes endpoints like /api/words and /api/attempts for word lists and attempt evaluation; Firebase handles authentication and user data storage when enabled.
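As a concrete illustration of that round trip, an attempt payload and response might look like the following. The score, passed, and tips fields come from the description above; everything else (field names, landmark format) is our assumption, not the exact wire format.

```python
import json

# Hypothetical request body for POST /api/attempts: the target word plus
# captured frames, each frame a list of 21 (x, y, z) hand landmarks.
attempt_request = {
    "word": "hello",
    "frames": [[[0.42, 0.61, 0.0]] * 21],  # one frame of 21 landmark triples
}

# Hypothetical response: numeric score, pass/fail, and actionable tips.
attempt_response = {
    "score": 0.78,
    "passed": True,
    "tips": ["Rotate your palm outward", "Extend your fingers fully"],
}

# Both bodies travel as plain JSON on the wire.
print(json.dumps(attempt_response))
```
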

How we built it

- Frontend: a Vite + React app for camera capture, the landmark-extraction pipeline, and a clean practice UI.
- Backend: a FastAPI service that accepts frames of 21 hand landmarks each and runs the scoring logic. API docs are available via Swagger UI at /docs when running locally.
- Data flow: the frontend collects landmark frames and POSTs them to /api/attempts with the target word; the backend runs its comparison/scoring logic and returns a score, a passed boolean, and tips.
- Dev tools & infra: Python 3.8+ with uvicorn for the API, npm for the frontend, and optional Firebase integration for auth/storage (team members set up their own service-account keys per the backend/TEAM_SETUP instructions).

Challenges we ran into
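A standard mitigation for the landmark-variability problem below is to normalize each frame before scoring: translate so the wrist sits at the origin, then scale by a hand-size estimate so camera distance matters less. Here is a minimal sketch assuming MediaPipe-style landmark ordering (wrist at index 0, middle fingertip at index 12); whether Gesturify's pipeline uses this ordering is our assumption.

```python
import math

# Landmark indices follow MediaPipe Hands' convention (an assumption).
WRIST, MIDDLE_TIP = 0, 12

def normalize_frame(frame):
    """Make a frame of 21 (x, y, z) landmarks translation- and scale-invariant.

    Translates the wrist to the origin, then divides by the
    wrist-to-middle-fingertip distance as a rough hand-size estimate.
    """
    wx, wy, wz = frame[WRIST]
    translated = [(x - wx, y - wy, z - wz) for x, y, z in frame]
    scale = math.dist(translated[WRIST], translated[MIDDLE_TIP]) or 1.0
    return [(x / scale, y / scale, z / scale) for x, y, z in translated]

# A toy frame: every landmark at the wrist except the middle fingertip.
frame = [(0.2, 0.2, 0.0)] * 21
frame[MIDDLE_TIP] = (0.2, 0.6, 0.0)
print(normalize_frame(frame)[MIDDLE_TIP])  # (0.0, 1.0, 0.0)
```
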

- Landmark variability: different webcams, lighting conditions, and hand orientations produce noisy landmarks; designing robust scoring that generalizes was tricky.
- Real-time UX vs. accuracy: keeping the UI responsive while sending enough frames for a reliable score required balancing sample size against latency.
- Data & modeling: building a meaningful scoring metric without a large labeled dataset meant relying on heuristics and iterative tuning, which limited accuracy for complex signs.
- Team setup: requiring individual Firebase service keys added friction for teammates during development and demos.

Accomplishments that we're proud of
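To make the heuristic, tips-producing scoring described in this writeup concrete, here is a toy distance-based scorer in the same spirit; the reference pose, threshold, and tip text are invented for illustration and are not the backend's actual logic.

```python
import math

def frame_distance(frame, reference):
    """Mean Euclidean distance between paired (x, y, z) landmarks."""
    return sum(math.dist(p, q) for p, q in zip(frame, reference)) / len(reference)

def score_attempt(frame, reference, pass_threshold=0.8):
    """Map landmark distance to a 0-1 score, pass/fail, and a tip."""
    score = max(0.0, 1.0 - frame_distance(frame, reference))
    passed = score >= pass_threshold
    tips = [] if passed else ["Match the reference hand shape more closely"]
    return {"score": round(score, 2), "passed": passed, "tips": tips}

reference = [(0.5, 0.5, 0.0)] * 21
close = [(0.5, 0.55, 0.0)] * 21  # every landmark 0.05 off the reference
far = [(0.5, 0.8, 0.0)] * 21     # every landmark 0.3 off the reference
print(score_attempt(close, reference)["passed"])  # True
print(score_attempt(far, reference)["passed"])    # False
```
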

- End-to-end demo: a working pipeline from webcam capture to backend scoring that produces human-readable tips.
- Clear, documented API: interactive docs and example request/response bodies make the backend easy to test and extend.
- Modular design: the frontend and backend are cleanly separated (Vite + React + FastAPI), so future models or UI improvements can be swapped in quickly.
- Practical guidance: the scoring output includes actionable tips (e.g., finger positioning or palm orientation) rather than only a score.

What we learned

- Practical hand-pose work requires careful preprocessing and normalization of landmarks to reduce noise.
- Small datasets and heuristics make useful prototypes, but accuracy and fairness improve substantially with more labeled examples and community input.
- FastAPI + Vite is an excellent developer experience for hackathon timelines: fast iteration and easy API testing.
- Privacy matters: minimizing what gets stored and making Firebase integration optional are important considerations when working with camera data.

What’s next for Gesturify

- Improve scoring with a learned model trained on a larger, curated dataset and community-sourced examples.
- Expand the vocabulary to include multi-word phrases and sentences.
- Add progress tracking, user accounts, and lesson plans (Firebase-backed).
- Support mobile-friendly capture and offline-first use for low-connectivity users.
- Run usability and accessibility testing with Deaf and hard-of-hearing community members to validate effectiveness and refine the tips.
