FULL VIDEO IN GITHUB REPOSITORY: https://github.com/ThienNguyen27/Duovoice.git. As newcomers, we could not fix the YouTube upload issue in time, so we were unable to post the video at the usual link. We apologize for the inconvenience, and thank you for your time and consideration.
Inspiration
In a world driven by communication, millions of people who can’t speak or hear still face major challenges in expressing themselves — especially in real-time conversations. While voice and video calls are easy for most, they exclude the Deaf and nonverbal communities.
That’s why we created DuoVoice — a live video communication platform that uses AI-powered sign language recognition and translation, helping users bridge the communication gap in real time.
Whether it's a Deaf individual signing to a hearing person, or vice versa, DuoVoice provides smooth, live, and private interaction — without needing interpreters or external apps.
What it does
- Capturing Sign Language: DuoVoice uses your webcam to capture live video and detect sign language gestures.
- Translating Gestures into Text: The AI model processes the gestures frame by frame, recognizes them, and displays the translated message in real time to the other participant (a rough sketch of this loop follows the list).
- Two-Way Communication: All of this happens over a secure WebRTC video call, enabling smooth, real-time back-and-forth, just like a regular video chat, but inclusive for everyone.
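To make the gesture-to-text step concrete, here is a minimal sketch of the frame-by-frame recognition loop, assuming an Ultralytics YOLOv8 model fine-tuned on sign-language gesture classes and saved as `sign_yolov8.pt` (a hypothetical file name). It simply overlays the most confident gesture label on each webcam frame rather than streaming it into a call.

```python
# Minimal sketch of frame-by-frame sign detection with OpenCV + YOLOv8.
# Assumes a fine-tuned weights file "sign_yolov8.pt" whose classes are gesture labels.
import cv2
from ultralytics import YOLO

model = YOLO("sign_yolov8.pt")  # hypothetical fine-tuned sign-language model
cap = cv2.VideoCapture(0)       # default webcam

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break

    # Run detection on the current frame; take the most confident gesture, if any.
    results = model(frame, verbose=False)[0]
    if len(results.boxes) > 0:
        best = max(results.boxes, key=lambda b: float(b.conf))
        label = results.names[int(best.cls)]
        cv2.putText(frame, label, (20, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)

    cv2.imshow("DuoVoice sign detection (sketch)", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```

In the real app the recognized label would be sent to the other participant over the call rather than drawn locally, but the per-frame detect-then-display structure is the same idea.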
How we built it
Frontend
- React & TypeScript – for a fast, dynamic, and strongly typed UI.
- Tailwind CSS – for clean, responsive styling.
- Next.js – as both a frontend framework and SSR tool.

Backend
- FastAPI – for fast, scalable backend API handling (a sketch of how the detector could sit behind it follows this section).
- WebRTC (simple-peer, PeerJS) – to enable real-time peer-to-peer video communication.
- TURN server & Socket.io – for reliable video signaling and NAT traversal.

Database
- Firebase – for real-time storage, user auth, and syncing communication sessions.

AI Model
- PyTorch + OpenCV + YOLOv8 – used to detect, track, and classify sign language gestures in live video.
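As a rough illustration of how the FastAPI backend and the YOLOv8 detector could fit together, the sketch below exposes the model over a WebSocket: the browser sends JPEG-encoded frames and gets back the recognized gesture as text. The route name, message format, and model file are assumptions for illustration, not DuoVoice's actual API.

```python
# Minimal sketch of a FastAPI WebSocket endpoint that receives JPEG-encoded frames
# from the browser and replies with the recognized gesture label.
# The model path and message format are assumptions, not DuoVoice's actual API.
import cv2
import numpy as np
from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from ultralytics import YOLO

app = FastAPI()
model = YOLO("sign_yolov8.pt")  # hypothetical fine-tuned sign-language model


@app.websocket("/ws/translate")
async def translate(ws: WebSocket):
    await ws.accept()
    try:
        while True:
            data = await ws.receive_bytes()  # one JPEG frame per message
            frame = cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)
            if frame is None:
                continue

            results = model(frame, verbose=False)[0]
            if len(results.boxes) > 0:
                best = max(results.boxes, key=lambda b: float(b.conf))
                await ws.send_json({"text": results.names[int(best.cls)],
                                    "confidence": float(best.conf)})
    except WebSocketDisconnect:
        pass
```

Streaming JPEG frames keeps the sketch simple; a production setup would more likely tap the WebRTC stream directly or sample frames to keep latency low.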
Challenges we ran into
- Training sign language models with real-world video data
- Integrating WebRTC with our backend and AI pipeline
- Ensuring privacy and low latency during peer-to-peer communication
- Building an intuitive and inclusive interface for all users
- Managing real-time events with Socket.io and signaling servers (a minimal signaling sketch follows this list)
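For the signaling part, the key idea is that the Socket.io server only relays SDP offers/answers and ICE candidates between peers; the media itself flows peer-to-peer (through the TURN server when NAT traversal fails). Below is a minimal relay sketch using python-socketio; the event names ("join", "signal") and the plain-dict room bookkeeping are assumptions made for illustration, not our exact implementation.

```python
# Minimal sketch of a Socket.io signaling relay for the WebRTC handshake.
# Event names ("join", "signal") and the room bookkeeping are illustrative
# assumptions; TURN/STUN configuration lives on the clients.
import socketio

sio = socketio.AsyncServer(async_mode="asgi", cors_allowed_origins="*")
app = socketio.ASGIApp(sio)          # e.g. run with: uvicorn signaling:app (if saved as signaling.py)

rooms: dict[str, set[str]] = {}      # call id -> connected socket ids


@sio.on("join")
async def join(sid, call_id):
    peers = rooms.setdefault(call_id, set())
    peers.add(sid)
    # Tell the newcomer who is already in the call so they can start the offer.
    await sio.emit("peers", list(peers - {sid}), to=sid)


@sio.on("signal")
async def signal(sid, data):
    # Forward an SDP offer/answer or ICE candidate to the intended peer.
    await sio.emit("signal", {"from": sid, "payload": data["payload"]}, to=data["to"])


@sio.event
async def disconnect(sid):
    for peers in rooms.values():
        peers.discard(sid)
```

In this sketch, the simple-peer clients would emit "signal" events carrying their offer/answer payloads and apply whatever payloads they receive back.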
Accomplishments that we're proud of
- Successfully developed a model that translates sign language to text and text to sign language
- Created a user-to-user chat interface to enable real-time communication between users
What we learned
- Deepened knowledge in real-time communication protocols
- Gained experience in computer vision and gesture detection
- Developed strong backend/frontend integration skills
- Built awareness around inclusive design and accessibility
What's next for Duovoice
Right now DuoVoice can only translate sign language into text and vice versa, but in the future we believe we can translate sign into voice and voice into sign, which would be even more convenient.
Built With
- api
- firebase
- javascript
- nextjs
- python
- tsx