Inspiration
We wanted to be able to converse in real time with elderly family members who might speak a different language from us.
What it does
Two people hop onto a video call and each choose their preferred language. The app translates anything spoken in another language into captions in the viewer's preferred language, shown at the bottom of the video stream. As an added bonus, a text-to-speech feature lets them hear the audio in their preferred language, eradicating language barriers!
How we built it
A lot of hard work and effort <3
Challenges we ran into
- Establishing connections over WebRTC
- Figuring out how to send audio to our backend APIs for transcription and translation
- Getting the captions showing for the correct participant
- Getting the translated text-to-speech working using ElevenLabs
- Gauging the right threshold to use as a cutoff for our audio volume
- Running out of conversations to have in different languages while running on minimal brainpower
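To give a feel for the volume cutoff mentioned in the list above, here is a minimal sketch of one common approach: compute the root-mean-square (RMS) level of each audio frame and only treat frames above a threshold as speech. This is an illustration, not our exact code. The helper names `rmsLevel` and `isSpeech` and the `SPEECH_THRESHOLD` value of 0.02 are assumptions, and samples are assumed to be floats in [-1, 1].

```typescript
// Hypothetical helper: root-mean-square level of one audio frame.
// Samples are assumed to be floats in the range [-1, 1].
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Assumed cutoff: in practice this has to be tuned by ear,
// since it depends on the microphone and the background noise.
const SPEECH_THRESHOLD = 0.02;

// Returns true when the frame is loud enough to be worth transcribing.
function isSpeech(
  samples: Float32Array,
  threshold: number = SPEECH_THRESHOLD
): boolean {
  return rmsLevel(samples) >= threshold;
}
```

Set the threshold too low and background noise gets sent off for transcription; set it too high and quiet speakers get cut off, which is why finding the right value took some trial and error.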
Accomplishments that we're proud of
- Establishing connections over WebRTC
- Figuring out how to send audio to our backend APIs for transcription and translation
- Getting the captions showing for the correct participant
- Getting the translated text-to-speech working using ElevenLabs !!!
- Having conversations in different languages while running on minimal brainpower
What we learned
- A lot about networking protocols! WebRTC comes with a lot of security features, including mandatory built-in encryption of media streams
- Using OpenAI's Whisper for transcription + translation
- Combined, we're very multilingual
What's next for RealTalk
- Apply voice imitation to allow the text-to-speech to mimic the user’s actual voice (we actually did implement this! just a couple minutes after the deadline)
- Support group calls <3
- Add extra layers of security by implementing end-to-end caption encryption (probably using the AES-256-GCM cipher)
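As a rough idea of what the caption encryption above could look like, here is a minimal sketch using AES-256-GCM from Node's built-in `crypto` module. None of this is implemented yet. The names `EncryptedCaption`, `encryptCaption` and `decryptCaption` are hypothetical, and the sketch assumes both participants already share a 32-byte key (how they exchange it is a separate problem not covered here). In the browser, the Web Crypto API would be the likely equivalent.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

// Hypothetical message shape: everything the receiver needs besides the key.
interface EncryptedCaption {
  iv: Buffer;         // 96-bit nonce, must never repeat under the same key
  ciphertext: Buffer; // encrypted caption text
  tag: Buffer;        // GCM authentication tag, detects tampering
}

// Encrypts one caption with a shared 32-byte key (assumed to exist already).
function encryptCaption(caption: string, key: Buffer): EncryptedCaption {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(caption, "utf8"),
    cipher.final(),
  ]);
  return { iv, ciphertext, tag: cipher.getAuthTag() };
}

// Decrypts a caption; throws if the ciphertext or tag was modified in transit.
function decryptCaption(message: EncryptedCaption, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, message.iv);
  decipher.setAuthTag(message.tag);
  const plaintext = Buffer.concat([
    decipher.update(message.ciphertext),
    decipher.final(),
  ]);
  return plaintext.toString("utf8");
}
```

The appeal of GCM here is that it authenticates as well as encrypts, so a caption altered in transit fails to decrypt instead of showing up as garbage on the other person's screen.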