VidCraft

Inspiration 🤔

Have you ever wanted to learn about a topic and wished there was a fun, digestible, educational video for it, whether it's Newton's laws of physics or Locke's social contract? Kids today consume hours upon hours of video content every week (AACAP, Statista). What if there were a way for curious kids to instantly create a short educational video on any topic they wanted, or for teachers to create a curated lesson with a custom video and a quiz to follow it?

What it does 😯

Enter the world of VidCraft! VidCraft is a website where users enter a topic they want to learn about (e.g. "What are Newton's laws of physics?") and sit back as a brand-new video is automatically crafted on that topic, complete with visuals and narration. While watching the video, users can interact with VidCraft's video chat interface to ask questions about the content they've seen so far.

Upon completing the video, users are taken to a quiz page where their knowledge is tested on the new topic they learned. If they select a wrong answer choice, they are provided an explanation for why it is wrong, so that they can learn from their mistake.
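As a rough illustration of the quiz behaviour described above, here is a minimal sketch of how a question with per-choice explanations might be modelled and graded. The `QuizQuestion`, `QuizChoice`, and `gradeAnswer` names are hypothetical and chosen for this example; they are not taken from the VidCraft codebase.

```typescript
// Hypothetical shape of a quiz question; field names are illustrative only.
interface QuizChoice {
  text: string;
  correct: boolean;
  // Shown when the user picks this choice and it is wrong.
  explanation?: string;
}

interface QuizQuestion {
  prompt: string;
  choices: QuizChoice[];
}

interface AnswerFeedback {
  correct: boolean;
  message: string;
}

// Returns feedback for the choice at the selected index.
function gradeAnswer(question: QuizQuestion, selected: number): AnswerFeedback {
  const choice = question.choices[selected];
  if (choice === undefined) {
    throw new RangeError(`No choice at index ${selected}`);
  }
  if (choice.correct) {
    return { correct: true, message: "Correct!" };
  }
  // Fall back to a generic message if no explanation was generated.
  return {
    correct: false,
    message: choice.explanation ?? "That is not the right answer.",
  };
}
```

Keeping the explanation next to each wrong choice means the quiz page can show feedback immediately, without another round trip to the language model.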

In this way, VidCraft lets users of any age instantly create a novel, never-before-seen educational video on ANY topic they want, making learning more accessible, easier, and more fun than before. All in a matter of a minute. 🖐️🎤

How we built it 🛠️

VidCraft is built with a React and Next.js front end hosted on Vercel, with Node.js as our backend. We use OpenAI’s GPT API for script writing, image prompt generation, and question generation; OpenAI’s DALL-E API for image generation; and ElevenLabs for text-to-speech generation.
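To make the flow between these services more concrete, here is a minimal sketch of how the stages could be chained. Everything in it is an assumption made for illustration: the `PipelineStages` interface, its method names, and `buildSceneAssets` are hypothetical stand-ins for the real API calls and do not come from the VidCraft codebase.

```typescript
// All stage functions are hypothetical stand-ins for the real API calls
// (a text model, an image API, a text-to-speech API).
interface PipelineStages {
  // Returns the narration split into per-scene segments.
  writeScript(topic: string): Promise<string[]>;
  // Turns one narration segment into an image prompt.
  writeImagePrompt(segment: string): Promise<string>;
  // Returns an image path, or null if generation failed.
  generateImage(prompt: string): Promise<string | null>;
  // Returns the path of the narrated audio file.
  synthesizeSpeech(segment: string): Promise<string>;
}

interface SceneAssets {
  narration: string;
  imagePath: string | null;
  audioPath: string;
}

async function buildSceneAssets(
  topic: string,
  stages: PipelineStages,
): Promise<SceneAssets[]> {
  const segments = await stages.writeScript(topic);
  const scenes: SceneAssets[] = [];
  // Scenes are processed one at a time to stay friendly to API rate limits.
  for (const narration of segments) {
    const prompt = await stages.writeImagePrompt(narration);
    // Image and audio generation do not depend on each other,
    // so they run concurrently within a scene.
    const [imagePath, audioPath] = await Promise.all([
      stages.generateImage(prompt),
      stages.synthesizeSpeech(narration),
    ]);
    scenes.push({ narration, imagePath, audioPath });
  }
  return scenes;
}
```

Passing the stages in as an interface keeps the orchestration testable with stubs and makes it easier to swap one image or speech provider for another.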

Challenges we ran into

We ran into a number of challenges, several of them because of Node.js. For one, integrating the ffmpeg and videoshow libraries for video creation produced missing-module errors, in particular one saying ‘Module not found ./lib-cov/fluent-ffmpeg’. We tried modifying our package.json and package-lock.json, reinstalling packages, and uninstalling certain ones, and we consulted GitHub, Stack Overflow, and ChatGPT. The issues persisted for a while, but we were eventually able to resolve them. We hit other Node.js issues as well.

Prompt engineering for GPT took some trial and error before it output the desired content in the correct format. Because the video generation process has several steps, we needed GPT to reliably generate both video scripts and image prompts.

We also ran into issues with image generation APIs because of their content policies and moderation. Stability AI doesn’t allow prompts with keywords related to children, which GPT might pick as example illustrations for the video. Similarly, OpenAI’s DALL-E API automatically revises prompts, and the prompts it revises can themselves violate OpenAI’s content policies. On top of this, the DALL-E API has strict rate limits. We overcame these challenges through carefully designed prompting and by handling cases where certain images may be missing from the video.
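The missing-image handling mentioned above could take several forms. The sketch below shows one possible fallback strategy, assumed purely for illustration: a scene whose image failed to generate reuses the most recent earlier image, or the first available later image if none came before it. The `GeneratedScene`, `RenderableScene`, and `fillMissingImages` names are hypothetical, and this is not necessarily the approach VidCraft uses.

```typescript
// Hypothetical scene shape: imagePath is null when image generation failed
// (for example, after a content-policy rejection or a rate-limit error).
interface GeneratedScene {
  narration: string;
  imagePath: string | null;
}

interface RenderableScene {
  narration: string;
  imagePath: string;
}

// Gives every scene an image so the video can still be assembled.
// Assumed strategy: reuse the nearest earlier image; scenes before the
// first successful image use that first image; if no image exists at
// all, every scene uses the placeholder.
function fillMissingImages(
  scenes: GeneratedScene[],
  placeholder: string,
): RenderableScene[] {
  const firstAvailable =
    scenes.find((s) => s.imagePath !== null)?.imagePath ?? placeholder;
  let lastSeen: string = firstAvailable;
  return scenes.map((scene) => {
    if (scene.imagePath !== null) {
      lastSeen = scene.imagePath;
    }
    return { narration: scene.narration, imagePath: lastSeen };
  });
}
```

A fallback like this keeps narration and visuals the same length, so the video assembly step never has to deal with gaps.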

Accomplishments that we're proud of

We overcame the challenges we ran into and built a polished, fully functional website. Given a topic or question as input, it crafts a cohesive video that teaches users about that topic.

We also only exposed our API keys one time 😀.

What we learned

In general, we all deepened our knowledge of Node.js, since we relied on it heavily for our API logic and video generation, the core of our project. Another significant learning experience was troubleshooting, with problems ranging from sneaky API failures caused by content policy moderation (which plagued us for some time) to issues with libraries like ffmpeg. These challenges pushed us to develop better problem-solving skills and a deeper understanding of the tools we were working with.

What's next for VidCraft

We want to build out additional features. Some ideas include:

- incorporating Redis to store previously made videos and quizzes
- incorporating Redis’s vector database to provide relevant video suggestions via vector search
- supporting mixed question types for quizzes (multiple choice, full response, true/false, multi-select, etc.)

Increasing the number of images per video (currently we are constrained by rate limits) and interpolating between images would also be nice.

And we want to start getting users! We’d love for the tool to be used in classrooms and, more generally, by anyone who is curious and wants a quick video to learn from.

Discord usernames: ishaanjav (Ishaan Javali), aditya.k1 (Adi Kulkarni), tonyyamin (Tony Yamin), alexsima_09642 (Alex Sima)