Inspiration

I was recently trying to learn from someone's code online for a competition, only to realize that it was in Spanish. I was upset, but it made me realize how fortunate I am to be a native English speaker. Far too often, educational content is available only in English, which makes it difficult for non-English speakers to access high-quality online education. Their minds may be exceptional, yet they are limited by the languages they speak. We wanted to build a tool that could help make educational resources truly global and accessible to everyone, no matter what language they speak.

What it does

AroundtheWorld.Study takes educational videos and translates the spoken content into the user's chosen language, with options including Spanish, Dutch, Hindi, and Chinese. It uses translation, text-to-speech, and intelligent timing to dub the video naturally without losing synchronization. We also use a fine-tuned machine learning model to modify the speaker's lip movements so that they appear to be speaking the new language, making the experience more immersive and less distracting.

How we built it

We combined multiple technologies to make this work:

  • Speech-to-text to transcribe the original video's audio.
  • Translation to convert the transcript into the user's selected language.
  • Text-to-speech to generate the new dubbed audio.
  • Intelligent timing algorithms to keep the new audio synchronized with the video.
  • A machine learning lip-sync model to visually match the instructor’s mouth movements to the new audio.
  • Cloud deployment, so the pipeline can run on services like Google Cloud.
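
To illustrate how the translation, text-to-speech, and timing steps could fit together, here is a minimal Python sketch. It is not our actual code: the names `Segment`, `DubbedSegment`, and `plan_dub` are hypothetical, the `translate` and `speech_duration` callables stand in for whatever translation and text-to-speech services are used, and the 1.5x speed cap is an assumed value.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """One transcribed utterance and its position in the original video."""
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class DubbedSegment:
    """A translated utterance plus the playback speed chosen for its audio."""
    start: float
    text: str
    speed: float  # playback-rate multiplier for the synthesized clip


def plan_dub(
    segments: List[Segment],
    translate: Callable[[str], str],
    speech_duration: Callable[[str], float],
    max_speed: float = 1.5,  # assumed cap so sped-up speech stays intelligible
) -> List[DubbedSegment]:
    """Translate each segment and pick a speed so the dub fits its time slot."""
    plan = []
    for seg in segments:
        translated = translate(seg.text)
        slot = seg.end - seg.start
        needed = speech_duration(translated)
        # Speed up only when the dubbed clip would overrun its slot;
        # never slow down, and never exceed the cap.
        speed = 1.0 if slot <= 0 else min(max(needed / slot, 1.0), max_speed)
        plan.append(DubbedSegment(seg.start, translated, speed))
    return plan


if __name__ == "__main__":
    # Stand-ins for real services: uppercase "translation", 0.5 s per word.
    demo = [Segment(0.0, 2.0, "hello world"),
            Segment(2.0, 3.0, "one two three four five six")]
    for item in plan_dub(demo, str.upper, lambda t: 0.5 * len(t.split())):
        print(item)
```

Passing the services in as plain functions keeps the timing logic testable without network calls, which matters when several external APIs are chained together.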

Challenges we ran into

  • Handling video and audio data was tricky, especially matching playback speeds and managing file formats.
  • Dealing with different APIs for translation, speech recognition, and text-to-speech, and getting them all to work together seamlessly.
  • Getting the lip-sync model to work across different languages and videos.
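
One way to approach the speed-matching problem is sketched below. It assumes ffmpeg and its `atempo` audio filter are used for the time-stretching, which this write-up does not actually state, and the helper names `atempo_chain` and `stretch_command` are hypothetical. The sketch chains several `atempo` stages so that each one stays between 0.5 and 2.0, a conservative range that older ffmpeg builds require.

```python
from typing import List


def atempo_chain(factor: float) -> str:
    """Build an ffmpeg audio filter string that changes tempo by `factor`.

    Each atempo stage is kept within [0.5, 2.0]; larger or smaller factors
    are split across several chained stages whose product equals `factor`.
    """
    if factor <= 0:
        raise ValueError("tempo factor must be positive")
    stages = []
    while factor > 2.0:
        stages.append(2.0)
        factor /= 2.0
    while factor < 0.5:
        stages.append(0.5)
        factor /= 0.5
    stages.append(factor)
    return ",".join(f"atempo={s:.4f}" for s in stages)


def stretch_command(src: str, dst: str, factor: float) -> List[str]:
    """Return an ffmpeg argument list that writes a tempo-adjusted copy of src."""
    return ["ffmpeg", "-i", src, "-filter:a", atempo_chain(factor), dst]


if __name__ == "__main__":
    # A clip that must play 3x faster needs two chained stages.
    print(atempo_chain(3.0))
    print(stretch_command("dub.wav", "dub_fast.wav", 1.25))
```

The argument list can be handed to `subprocess.run`; building it as a list rather than a shell string avoids quoting problems with unusual file names.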

Accomplishments that we're proud of

We're proud that we were able to bring all the moving parts together into a working system, from audio processing to video editing to machine learning, all in 24 hours.

What we learned

We learned how important system design is when connecting so many complex components. Planning with big-picture diagrams before diving into the code helped a lot. We also got a lot better at debugging audio-video issues, which was new for us.

What's next for AroundtheWorld.Study

We want to expand the pipeline to accept videos from platforms like YouTube, edX, and Coursera, or even embed our service directly into these apps.
