Inspiration

A lot of learning today happens through videos, especially for technical topics. But when you're watching something and a concept shows up that you don’t understand, you usually have to pause the video and open a new tab to search for it. That breaks the learning flow.

We wanted to build something that makes learning from videos more interactive. Instead of leaving the video to look things up, what if you could just pause and explore the concepts directly on the screen?

What it does

LearnFlow turns any educational video into an interactive learning experience.

You paste in a YouTube link and start watching normally. When you pause, AI analyzes the frame and detects the concepts visible on screen. Interactive markers appear over the video, and you can click them to get explanations instantly.
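The pause-and-detect step can be sketched roughly as follows. This is an illustrative outline, not the actual implementation: the region format coming back from the vision model and the function names are assumptions.

```python
# Sketch: turn vision-model detections (pixel-space regions on the paused
# frame) into normalized overlay markers the frontend can position over
# the rendered video player at any size.

def to_markers(regions, frame_w, frame_h):
    """Convert labeled pixel regions into normalized (0–1) marker positions."""
    markers = []
    for r in regions:
        markers.append({
            "concept": r["label"],
            # center the marker on the detected region, normalized so it
            # scales with the player rather than the raw frame resolution
            "x": (r["x"] + r["w"] / 2) / frame_w,
            "y": (r["y"] + r["h"] / 2) / frame_h,
        })
    return markers

# Example: one concept detected on a 1280x720 frame
regions = [{"label": "binary search tree", "x": 600, "y": 320, "w": 80, "h": 80}]
print(to_markers(regions, 1280, 720))
# → [{'concept': 'binary search tree', 'x': 0.5, 'y': 0.5}]
```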

Each interaction is tracked and used to update a skill radar chart that shows what topics you’re strong in and where you have gaps. Based on that, LearnFlow recommends what you should watch next.
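One simple way to drive the skill radar from clicks is a running score per topic. The sketch below is hypothetical: the 0.2 learning rate, the signal values, and the topic names are illustrative, not the values we shipped.

```python
# Sketch: nudge a per-topic skill score toward a signal on each
# interaction, so repeated exploration of a topic raises its score
# gradually (an exponential moving average).

def update_skills(skills, topic, signal, rate=0.2):
    """Move the score for `topic` a fraction of the way toward `signal`
    (0.0–1.0); e.g. a plain lookup might send a lower signal than a
    correct follow-up answer."""
    current = skills.get(topic, 0.0)
    skills[topic] = current + rate * (signal - current)
    return skills

skills = {}
for _ in range(3):          # three interactions with graph concepts
    update_skills(skills, "graphs", 1.0)
print(round(skills["graphs"], 3))  # → 0.488
```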

How we built it

We started by designing the user experience in Figma, prototyping the video player, concept overlays, and learning dashboard.

The frontend was then built with React and Vite, while the backend uses FastAPI and MongoDB Atlas to store user profiles, concept clicks, and learning history.
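As a rough illustration of what a stored concept click looks like, here is a possible document shape; the field names are assumptions rather than our exact schema, and with pymongo the result would be passed to something like `db.clicks.insert_one(...)`.

```python
# Sketch: build the document recorded in MongoDB Atlas when a user
# clicks a concept marker on a paused frame.
from datetime import datetime, timezone

def click_event(user_id, video_id, video_time_s, concept):
    """Document describing one concept click, tied to a user, a video,
    and the timestamp within the video where the pause happened."""
    return {
        "user_id": user_id,
        "video_id": video_id,          # YouTube video ID
        "video_time_s": video_time_s,  # seconds into the video
        "concept": concept,
        "clicked_at": datetime.now(timezone.utc),
    }

doc = click_event("u42", "abc123xyz", 312.5, "hash table")
print(doc["concept"], doc["video_time_s"])
```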

For recommendations and skill tracking, we used two lightweight ML approaches: embedding-model similarity to suggest what to watch next, and per-topic scores updated from concept clicks to fill in the skill radar.
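The embedding-based recommendation idea reduces to a similarity ranking. The toy 3-d vectors and video IDs below stand in for real embedding-model outputs; this is a minimal sketch of the approach, not our production code.

```python
# Sketch: rank candidate videos by cosine similarity between an
# embedding of the user's recently clicked concepts and each
# video's embedding.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def recommend(user_vec, catalog):
    """Return video IDs sorted from most to least similar."""
    return sorted(catalog, key=lambda vid: cosine(user_vec, catalog[vid]),
                  reverse=True)

catalog = {
    "intro-to-graphs": [0.9, 0.1, 0.0],
    "dp-basics":       [0.1, 0.9, 0.1],
}
user_vec = [0.8, 0.2, 0.0]  # this user mostly clicked graph concepts
print(recommend(user_vec, catalog))  # → ['intro-to-graphs', 'dp-basics']
```

Because both approaches work on precomputed vectors and simple arithmetic, the system feels personalized without any model training of our own.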

Challenges we ran into

One challenge we ran into early was designing the interaction between the video and the concept overlays. In Figma, we iterated through several prototypes to figure out how concept markers should appear on the paused video without blocking important parts of the screen.

Accomplishments that we're proud of

One thing we're proud of is building a full working loop from concept detection to skill tracking. Instead of just detecting concepts from a video frame, we connected that interaction to a larger learning system that tracks what users explore and updates their skill progress over time.

We’re also proud of the pause-and-detect interaction itself. Being able to pause a video and immediately see concepts appear directly on the screen creates a much more interactive learning experience compared to traditional video platforms.

What we learned

We learned how to combine different AI tools in a practical way. Using vision models for concept detection and embedding models for recommendations allowed us to build a system that feels personalized without needing large amounts of training data.

What's next for learn-flow

In the future, we’d like to improve the accuracy and depth of concept detection by combining vision analysis with video transcripts and slide text.

We also want to expand the learning features beyond concept exploration and skill tracking.

Built With

React, Vite, FastAPI, MongoDB Atlas, Figma
