Inspiration
As a group of multilingual high school friends, we have all watched our grandparents, parents, and peers struggle to follow a video because of a language barrier. Whether it's animation, news, movies, or educational videos, making content accessible to foreign audiences remains a major challenge. Dubbing is complex and costly, often amounting to over $300 for a 30-minute video, which makes the problem worse. Traditional dubbing also lacks personalization: the voices do not match the original speaker, which diminishes the overall experience.
What it does
ReVoice enables anyone to easily and affordably dub videos into 30 different languages without compromising the original speaker's identity or paying exorbitant fees. With ReVoice, users can ensure that their content reaches a global audience, providing accurate and natural-sounding dubs.
How we built it
Frontend
- Flask: Serves as the web framework, managing the server-side logic and routing.
- DaisyUI with Tailwind CSS: Utilized for rapid UI development and styling, offering a highly customizable and responsive user interface without the need for extensive custom CSS.
- Pydub: A core tool for audio manipulation, handling tasks such as splicing, fading, and layering of music and voice-over tracks.
Backend
- ElevenLabs API: ReVoice integrates with the ElevenLabs API to generate high-quality text-to-speech voices in multiple languages. This allows us to maintain voice authenticity while creating personalized voiceovers.
- NLTK (Natural Language Toolkit): Employed for natural language processing tasks such as tokenizing sentences and distributing them across multiple AI-generated voices, providing a conversational dynamic between speakers.
- Whisper AI (OpenAI): Whisper's speech recognition model transcribes the original audio before it is translated into other languages. An accurate transcript matters here because every translation and dub downstream is built from it.
- FFmpeg: Leveraged for audio extraction, mixing, and final stitching, FFmpeg allows seamless integration of the generated dubs into video files, preserving audio-video synchronization.
- DeepTranslator: Translates the transcribed content into over 30 supported languages. The translated text is what the synthesized voices read when creating the multilingual dubs.
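As a rough illustration of the FFmpeg steps listed above (audio extraction and final stitching), here is a minimal sketch that only builds the command lines. The function names, file names, and the 16 kHz mono setting are assumptions made for the example, not necessarily what ReVoice uses.

```python
def extract_audio_cmd(video: str, wav_out: str, sample_rate: int = 16000) -> list[str]:
    """Build an FFmpeg command that pulls a mono WAV track out of a video."""
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-vn",                    # ignore the video stream
        "-acodec", "pcm_s16le",   # uncompressed 16-bit PCM
        "-ar", str(sample_rate),  # sample rate (16 kHz is an assumption)
        "-ac", "1",               # downmix to mono
        wav_out,
    ]


def mux_dub_cmd(video: str, dub_audio: str, out: str) -> list[str]:
    """Build an FFmpeg command that swaps the original audio for the dub."""
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-i", dub_audio,
        "-map", "0:v:0",   # video from the first input
        "-map", "1:a:0",   # audio from the second input (the dub)
        "-c:v", "copy",    # copy video frames without re-encoding
        "-shortest",       # stop at the shorter of the two streams
        out,
    ]
```

Either list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is installed. Copying the video stream rather than re-encoding it keeps the picture untouched and the final step fast.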
Challenges we ran into
Managing real-time processing for multiple video formats while preserving audio quality proved to be a challenge. Integrating Whisper AI for speech recognition with ElevenLabs' voice synthesis was complex due to the differing input and output formats. Ensuring low latency across a multi-cloud infrastructure required optimization of resource allocation and load balancing.
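To give a sense of the format mismatch between transcription and synthesis, here is a minimal sketch of one possible bridging step. It assumes the transcript arrives as a list of segments with `start`, `end`, and `text` keys (the shape Whisper's Python package returns under `"segments"`) and merges segments separated by short pauses into larger jobs to send to a text-to-speech service. `DubJob`, `segments_to_jobs`, and the 0.5-second gap are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DubJob:
    """One chunk of text to synthesize, with the time slot it must fill."""
    start: float
    end: float
    text: str

    @property
    def duration(self) -> float:
        return self.end - self.start


def segments_to_jobs(segments: list[dict], max_gap: float = 0.5) -> list[DubJob]:
    """Merge adjacent transcript segments separated by at most max_gap seconds."""
    jobs: list[DubJob] = []
    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue  # skip empty segments
        if jobs and seg["start"] - jobs[-1].end <= max_gap:
            # Short pause: extend the previous job instead of starting a new one.
            jobs[-1].end = seg["end"]
            jobs[-1].text = f"{jobs[-1].text} {text}"
        else:
            jobs.append(DubJob(seg["start"], seg["end"], text))
    return jobs
```

Each resulting job carries both the text to translate and synthesize and the time window the generated audio has to fit into.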
Accomplishments that we're proud of
We successfully integrated Whisper AI, ElevenLabs, and NLTK to produce high-quality multilingual dubs in a user-friendly web app. The dynamic voice assignments between AI-generated characters made the viewing experience feel more natural and personalized. We’re especially proud of our audio synchronization, which keeps the dubbed speech aligned with the video even though translations vary in length.
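Because a translated line is rarely the same length as the original, one common way to keep a dub inside its time slot is to adjust playback tempo slightly. The sketch below shows that idea only; it is not necessarily how ReVoice synchronizes audio. The function names and the 0.8 to 1.25 clamp are assumptions. `atempo` is FFmpeg's audio tempo filter.

```python
def tempo_factor(synth_seconds: float, slot_seconds: float,
                 lo: float = 0.8, hi: float = 1.25) -> float:
    """Return the speed-up (>1) or slow-down (<1) needed to fit the slot.

    The result is clamped so the voice never sounds unnaturally fast or slow;
    the clamp bounds are illustrative assumptions.
    """
    if synth_seconds <= 0 or slot_seconds <= 0:
        raise ValueError("durations must be positive")
    return max(lo, min(hi, synth_seconds / slot_seconds))


def atempo_filter(factor: float) -> str:
    """Format the factor as an FFmpeg audio filter string."""
    return f"atempo={factor:.3f}"
```

For example, a 3.0-second synthesized line that must fit a 2.5-second slot gets a factor of about 1.2, which could be applied with `ffmpeg -i line.wav -filter:a atempo=1.200 fitted.wav`. When the clamp kicks in, the line will still overrun or underrun its slot, so a real pipeline would need a fallback such as trimming silence or shortening the translation.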
What we learned
Throughout this project, we gained a deep understanding of multilingual speech synthesis and audio-video processing. We also learned how to optimize microservice architectures to handle large-scale media files and how to maintain performance across geographically distributed cloud resources.
What's next for ReVoice
We plan to deepen our lip-sync support by integrating Wav2Lip more fully, to enhance viewer engagement. Expanding our language support to include more niche languages and dialects will be our next milestone. We’re also aiming to improve real-time dubbing for live streaming events and virtual classrooms, bridging the gap for international learners.
Regarding Demo
Unfortunately, we are not hosting a live demo because of ElevenLabs API costs and security concerns. Please feel free to clone the repository and try it out yourself!
Built With
- daisyui
- elevenlabs
- flask
- html
- javascript
- python
- tailwindcss
- whisper
