FULL VIDEO IN GITHUB REPOSITORY: https://github.com/ThienNguyen27/Duovoice.git. As newcomers, we could not fix the YouTube upload issue in time, so we were unable to post the video at the usual link. We apologize for the inconvenience, and thank you for your time and consideration.
Inspiration
In a world driven by communication, millions of people who can’t speak or hear still face major challenges in expressing themselves — especially in real-time conversations. While voice and video calls are easy for most, they exclude the Deaf and nonverbal communities.
That’s why we created DuoVoice — a live video communication platform that uses AI-powered sign language recognition and translation, helping users bridge the communication gap in real time.
Whether it's a Deaf individual signing to a hearing person, or vice versa, DuoVoice provides smooth, live, and private interaction — without needing interpreters or external apps.
What it does
- Capturing Sign Language: DuoVoice uses your webcam to capture live video and detect sign language gestures.
- Translating Gestures into Text: The AI model processes the gestures frame by frame, recognizes them, and displays the translated message in real time to the other participant (a rough sketch of this loop follows the list).
- Two-Way Communication: All of this happens over a secure WebRTC video call, enabling smooth, real-time back-and-forth, just like a regular video chat, but inclusive for everyone.
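To make the gesture-to-text step concrete, here is a minimal sketch of the frame-by-frame recognition loop, assuming an Ultralytics YOLOv8 model fine-tuned on sign-language gesture classes and saved as `sign_yolov8.pt` (a hypothetical file name). It simply overlays the most confident gesture label on each webcam frame rather than streaming it into a call.

```python
# Minimal sketch of frame-by-frame sign detection with OpenCV + YOLOv8.
# Assumes a fine-tuned weights file "sign_yolov8.pt" whose classes are gesture labels.
import cv2
from ultralytics import YOLO

model = YOLO("sign_yolov8.pt")  # hypothetical fine-tuned sign-language model
cap = cv2.VideoCapture(0)       # default webcam

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    # Run detection on the current frame; take the most confident gesture, if any.
    results = model(frame, verbose=False)[0]
    if len(results.boxes) > 0:
        best = max(results.boxes, key=lambda b: float(b.conf))
        label = results.names[int(best.cls)]
        cv2.putText(frame, label, (20, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)

    cv2.imshow("DuoVoice sign detection (sketch)", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

In the real app the recognized label would be sent to the other participant over the call rather than drawn locally, but the per-frame detect-then-display structure is the same idea.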
How we built it
Frontend
- React & TypeScript – for a fast, dynamic, and strongly typed UI.
- Tailwind CSS – for clean, responsive styling.
- Next.js – as both a frontend framework and SSR tool.

Backend
- FastAPI – for fast, scalable backend API handling (a sketch of how the detector could sit behind it follows this section).
- WebRTC (simple-peer, PeerJS) – to enable real-time peer-to-peer video communication.
- TURN server & Socket.io – for reliable video signaling and NAT traversal.

Database
- Firebase – for real-time storage, user auth, and syncing communication sessions.

AI Model
- PyTorch + OpenCV + YOLOv8 – used to detect, track, and classify sign language gestures in live video.
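As a rough illustration of how the FastAPI backend and the YOLOv8 detector could fit together, the sketch below exposes the model over a WebSocket: the browser sends JPEG-encoded frames and gets back the recognized gesture as text. The route name, message format, and model file are assumptions for illustration, not DuoVoice's actual API.

```python
# Minimal sketch of a FastAPI WebSocket endpoint that receives JPEG-encoded frames
# from the browser and replies with the recognized gesture label.
# The model path and message format are assumptions, not DuoVoice's actual API.
import cv2
import numpy as np
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from ultralytics import YOLO

app = FastAPI()
model = YOLO("sign_yolov8.pt")  # hypothetical fine-tuned sign-language model


@app.websocket("/ws/translate")
async def translate(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            data = await ws.receive_bytes()  # one JPEG frame per message
            frame = cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)
            if frame is None:
                continue

            results = model(frame, verbose=False)[0]
            if len(results.boxes) > 0:
                best = max(results.boxes, key=lambda b: float(b.conf))
                await ws.send_json({"text": results.names[int(best.cls)],
                                    "confidence": float(best.conf)})
    except WebSocketDisconnect:
        pass
```

Streaming JPEG frames keeps the sketch simple; a production setup would more likely tap the WebRTC stream directly or sample frames to keep latency low.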
Challenges we ran into
- Training sign language models with real-world video data
- Integrating WebRTC with our backend and AI pipeline
- Ensuring privacy and low latency during peer-to-peer communication
- Building an intuitive and inclusive interface for all users
- Managing real-time events with Socket.io and signaling servers (a minimal signaling sketch follows this list)
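For the signaling part, the key idea is that the Socket.io server only relays SDP offers/answers and ICE candidates between peers; the media itself flows peer-to-peer (through the TURN server when NAT traversal fails). Below is a minimal relay sketch using python-socketio; the event names ("join", "signal") and the plain-dict room bookkeeping are assumptions made for illustration, not our exact implementation.

```python
# Minimal sketch of a Socket.io signaling relay for the WebRTC handshake.
# Event names ("join", "signal") and the room bookkeeping are illustrative
# assumptions; TURN/STUN configuration lives on the clients.
import socketio

sio = socketio.AsyncServer(async_mode="asgi", cors_allowed_origins="*")
app = socketio.ASGIApp(sio)          # e.g. run with: uvicorn signaling:app (if saved as signaling.py)

rooms: dict[str, set[str]] = {}      # call id -> connected socket ids


@sio.on("join")
async def join(sid, call_id):
    peers = rooms.setdefault(call_id, set())
    peers.add(sid)
    # Tell the newcomer who is already in the call so they can start the offer.
    await sio.emit("peers", list(peers - {sid}), to=sid)


@sio.on("signal")
async def signal(sid, data):
    # Forward an SDP offer/answer or ICE candidate to the intended peer.
    await sio.emit("signal", {"from": sid, "payload": data["payload"]}, to=data["to"])


@sio.event
async def disconnect(sid):
    for peers in rooms.values():
        peers.discard(sid)
```

In this sketch, the simple-peer clients would emit "signal" events carrying their offer/answer payloads and apply whatever payloads they receive back.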
Accomplishments that we're proud of
- Successfully developed a model that translates sign language to text and text to sign language
- Created a user-to-user chat interface to enable real-time communication between users
What we learned
- Deepened knowledge in real-time communication protocols
- Gained experience in computer vision and gesture detection
- Developed strong backend/frontend integration skills
- Built awareness around inclusive design and accessibility
What's next for Duovoice
Right now DuoVoice can only translate sign language into text and vice versa, but in the future we believe we can translate sign into voice and voice into sign, which would be even more convenient.
Built With
- api
- firebase
- javascript
- nextjs
- python
- tsx