Inspiration

The idea came from this video, in which Google introduces captioning for YouTube videos. When captions launched, the team behind them envisioned many uses, such as searching inside a video. We wanted to make that possible and help everyone get more out of YouTube videos by making them easier to understand and interact with.

What it does

YouMark doesn't just enhance YouTube videos by adding captions; it actively highlights key segments of the video based on your interests or queries. Imagine being able to jump straight to the parts of a tech review that discuss battery life or camera quality, or to specific explanations in an educational video, all in real time and in multiple languages.

How we built it

On a Flask backend, we integrated APIs such as the YouTube Transcript API to pull transcripts and the Gemini API for semantic analysis that determines the relevance of video segments. The frontend is built with HTML, CSS, and JavaScript, and it extends YouTube's interface to show these insights directly on the video player.
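
To make the flow concrete, here is a minimal sketch of how scored transcript segments could be turned into highlight ranges for the player. It is not our actual code. It assumes segments are dicts with `text`, `start`, and `duration` keys (the shape a transcript fetch typically yields), and `build_highlights`, `keyword_score`, and the `threshold` value are hypothetical names chosen for illustration. The `score_fn` parameter stands in for the Gemini relevance call, which is not shown.

```python
from typing import Callable, Dict, List

# Assumed segment shape: {"text": str, "start": float, "duration": float}
Segment = Dict[str, object]


def build_highlights(
    segments: List[Segment],
    query: str,
    score_fn: Callable[[str, str], float],
    threshold: float = 0.5,  # hypothetical cut-off, not a tuned value
) -> List[Dict[str, float]]:
    """Return merged time ranges for segments scoring at or above threshold."""
    highlights: List[Dict[str, float]] = []
    for seg in segments:
        if score_fn(query, str(seg["text"])) < threshold:
            continue
        start = float(seg["start"])
        end = start + float(seg["duration"])
        # Merge with the previous range when the two touch or overlap.
        if highlights and start <= highlights[-1]["end"]:
            highlights[-1]["end"] = max(highlights[-1]["end"], end)
        else:
            highlights.append({"start": start, "end": end})
    return highlights


def keyword_score(query: str, text: str) -> float:
    """Toy stand-in for a semantic model: share of query words found in text."""
    words = query.lower().split()
    if not words:
        return 0.0
    text_lower = text.lower()
    # Substring match, so "battery" also matches "batteries" is NOT guaranteed;
    # it only matches when the query word appears verbatim inside the text.
    return sum(w in text_lower for w in words) / len(words)
```

Adjacent matching segments are merged so the player would show one continuous marker instead of several short ones.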

Challenges we ran into

  • Performance Tuning: Initially, the app was sluggish because API calls ran sequentially, and it was not practical to have the Gemini API make a decision for every transcript segment. We built our own NLP filter that screens segments before they are passed to Gemini for further evaluation, and we sped things up further with parallel processing and smarter pre-fetching of data.
  • Accuracy in Multilingual Contexts: Ensuring accurate translations and relevance in multiple languages was a complex challenge due to linguistic nuances.
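
The performance fix above can be sketched roughly as follows: a cheap local filter drops segments that share no words with the query, and the survivors are scored concurrently. This is an illustration under assumptions, not our implementation. `prefilter`, `score_parallel`, and `max_workers` are hypothetical names, the word-overlap check is a simple stand-in for our NLP filter, and `score_fn` stands in for the remote Gemini call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def prefilter(segments: List[str], query: str) -> List[Tuple[int, str]]:
    """Keep (index, text) pairs for segments sharing a word with the query."""
    query_words = set(query.lower().split())
    kept: List[Tuple[int, str]] = []
    for index, text in enumerate(segments):
        # Cheap local check, so unrelated segments never reach the model.
        if query_words & set(text.lower().split()):
            kept.append((index, text))
    return kept


def score_parallel(
    candidates: List[Tuple[int, str]],
    query: str,
    score_fn: Callable[[str, str], float],
    max_workers: int = 8,  # hypothetical pool size
) -> List[Tuple[int, float]]:
    """Score candidates concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order even when calls finish out of order.
        scores = list(pool.map(lambda item: score_fn(query, item[1]), candidates))
    return [(index, score) for (index, _), score in zip(candidates, scores)]
```

Threads suit this case because the expensive step is waiting on network responses rather than computing locally.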

Accomplishments that we're proud of

  • Real-time Insights: We successfully implemented a system that highlights important parts of videos, enhancing the viewer's experience significantly.
  • Seamless Integration: The tool feels like a native part of the YouTube experience, making it intuitive for all users.

What we learned

  • Extension Development: Mastered Chrome extension development to deliver a seamless integration.
  • Advanced NLP and ML: Enhanced our skills in natural language processing and machine learning, particularly in real-time content analysis.

What’s next for YouMark

  • Scaling Up: We're looking to scale the service to support an increasing number of users and simultaneously process videos in various languages.