RealTalk

Inspiration

We wanted to be able to converse with elderly people in our family who might speak a different language from us in real time.

What it does

Two people hop onto a video call and each choose their preferred language of communication. The app translates speech in any other language into captions in the viewer's preferred language, shown at the bottom of their video stream. As an added bonus, there is a text-to-speech feature to hear the audio in their preferred language, eradicating language barriers!

How we built it

A lot of hard work and effort <3

Challenges we ran into

Establishing connections over WebRTC
Figuring out how to send audio to our backend APIs for transcription and translation
Getting the captions showing for the correct participant
Getting the translated text-to-speech working using ElevenLabs
Gauging the right threshold to use as a cutoff for our audio volume
Running out of conversations to have in different languages while running on minimal brainpower
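
The volume cutoff mentioned above can be illustrated with a small sketch. This is not the project's actual code: it assumes audio arrives as frames of samples in the range [-1, 1] and uses root-mean-square (RMS) level as the loudness measure. The names `rmsLevel`, `isSpeech`, and the `SPEECH_THRESHOLD` value are hypothetical; the real cutoff we settled on is not recorded here.

```typescript
// Root-mean-square level of one audio frame.
// Assumption: samples are floats in [-1, 1], as the Web Audio API provides them.
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Hypothetical cutoff for illustration only; it has to be tuned by ear.
const SPEECH_THRESHOLD = 0.02;

// Frames below the cutoff are treated as silence and not sent for transcription.
function isSpeech(
  samples: Float32Array,
  threshold: number = SPEECH_THRESHOLD
): boolean {
  return rmsLevel(samples) >= threshold;
}
```

A cutoff that is too low sends background noise off for transcription, while one that is too high clips quiet speakers, which is why gauging it took some trial and error.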

Accomplishments that we're proud of

Establishing connections over WebRTC
Figuring out how to send audio to our backend APIs for transcription and translation
Getting the captions showing for the correct participant
Getting the translated text-to-speech working using ElevenLabs !!!
Having conversations in different languages while running on minimal brainpower

What we learned

A lot about networking protocols! WebRTC comes with a lot of security features, including built-in mandatory encryption (DTLS-SRTP for media)
Using OpenAI's Whisper for transcription + translation
Combined, we're very multilingual

What's next for RealTalk

Apply voice imitation to allow the text-to-speech to mimic the user’s actual voice (we actually did implement this, just a couple of minutes after the deadline!)
Support group calls <3
Add extra layers of security by implementing end-to-end caption encryption (probably using the AES-256-GCM cipher)
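
As a rough idea of what caption encryption with AES-256-GCM might look like, here is a minimal sketch using Node's built-in `crypto` module. It is not implemented in RealTalk yet. The names `EncryptedCaption`, `encryptCaption`, and `decryptCaption` are hypothetical, and the sketch assumes both participants already share a 32-byte key; how that key gets exchanged is a separate problem that this does not address.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical wire format: everything the receiver needs besides the key.
interface EncryptedCaption {
  iv: Buffer;         // 96-bit nonce, must never repeat under the same key
  tag: Buffer;        // GCM authentication tag (16 bytes by default)
  ciphertext: Buffer; // encrypted UTF-8 caption text
}

// Encrypt one caption with a 32-byte (256-bit) shared key.
function encryptCaption(caption: string, key: Buffer): EncryptedCaption {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(caption, "utf8"),
    cipher.final(),
  ]);
  return { iv, tag: cipher.getAuthTag(), ciphertext };
}

// Decrypt a caption; throws if the ciphertext or tag was tampered with.
function decryptCaption(msg: EncryptedCaption, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, msg.iv);
  decipher.setAuthTag(msg.tag);
  const plaintext = Buffer.concat([
    decipher.update(msg.ciphertext),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

GCM is an authenticated mode, so a receiver would reject any caption that was modified in transit rather than display corrupted text.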