Inspiration

In the remote learning and work environment, it's tiring and difficult to take notes during a meeting while actively participating in it. And if you miss an online meeting entirely, the only record of what happened is an hour-long recording. Sifting through that video to find one specific topic is hopeless. This is where CatchUp comes in.

What it does

CatchUp is a web application that lets you upload a recording of a lecture, meeting, or virtual event and generates a customized, searchable transcript enhanced with keyword detection and summaries, giving you a bird's-eye view of what happened. When a user uploads a video, they see a collapsed list of transcript paragraphs organized by keyword. To find where in the video a certain keyword was mentioned, the user can search the AI-processed transcript, click on a paragraph, and the video player automatically seeks to the timestamp that paragraph corresponds to. As of now, CatchUp is a fully written API with a work-in-progress web interface.
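The keyword-to-timestamp lookup described above can be sketched in a few lines. This is a minimal illustration in Python (the actual backend uses Node and MongoDB, and the field names here are hypothetical), showing the data shape and the lookup that drives the click-to-seek behavior:

```python
# Minimal sketch of the keyword-grouped transcript model described above.
# The real backend is Node + MongoDB; field names here are hypothetical.

transcript = [
    {"keyword": "photosynthesis", "text": "Plants convert light...", "start_sec": 312.0},
    {"keyword": "chlorophyll",    "text": "The green pigment...",    "start_sec": 540.5},
]

def find_timestamps(transcript, keyword):
    """Return the start times of all paragraphs tagged with `keyword`,
    i.e. where the player should seek when a paragraph is clicked."""
    return [p["start_sec"] for p in transcript if p["keyword"] == keyword]

print(find_timestamps(transcript, "photosynthesis"))  # -> [312.0]
```

In the web UI, the returned start time would simply be assigned to the video element's playback position.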

How we built it

We built a Google Cloud pipeline to streamline the keyword analysis using Cloud Storage, Cloud Functions, the Natural Language API, and Speech-to-Text. First, we wrote a function that takes a string of text and extracts the key phrases along with their importance scores. Then we made a Cloud Function that triggers each time a video is uploaded to Cloud Storage; it runs the keyword extraction and inserts the results into a MongoDB database. The backend uses MongoDB with Node, and the user interface is built with Next.js and React. Additionally, we used TensorFlow with the state-of-the-art PEGASUS model to generate summaries of the text. For handwritten math evaluation, we used the Mathpix API to extract the math and WolframAlpha to evaluate it.
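The upload-triggered step of the pipeline can be sketched as follows. This is an illustrative Python sketch, not the project's actual code: the `event` dict mirrors the object metadata a Storage-triggered background Cloud Function receives, while the Speech-to-Text transcription is stubbed and the Natural Language keyword extraction is replaced by a hypothetical frequency-based stand-in, since the real calls require live credentials. The MongoDB insert is omitted for the same reason.

```python
from collections import Counter

# Hypothetical stand-in for the Google Cloud Natural Language API call:
# score each word by frequency and keep the most common ones as "keywords".
def extract_keywords(text, top_n=3):
    words = [w.lower().strip(".,") for w in text.split() if len(w) > 4]
    return [w for w, _ in Counter(words).most_common(top_n)]

# Stub standing in for Speech-to-Text, so the sketch runs without credentials.
def transcribe(bucket, name):
    return "Plants convert light energy into chemical energy during photosynthesis."

# Shape of a background function triggered by a Cloud Storage upload.
# `event` carries the uploaded object's metadata; in the real pipeline the
# returned document would be inserted into MongoDB.
def on_video_upload(event, context=None):
    transcript_text = transcribe(event["bucket"], event["name"])
    return {
        "video": event["name"],
        "keywords": extract_keywords(transcript_text),
    }

doc = on_video_upload({"bucket": "lectures", "name": "bio-101.mp4"})
print(doc["keywords"])  # -> ['energy', 'plants', 'convert']
```

The key design point is that the function is stateless: each upload event is processed independently, which is what lets Cloud Functions fan the work out automatically.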

To hear more about how we used TensorFlow and PEGASUS, we have audio descriptions here: https://drive.google.com/drive/folders/1P1P9av3vm3iAKOC5E-j9ByXa06rLYDqf?usp=sharing

Challenges we ran into

We had to start the hackathon two days late because of high school, so we didn't have time to seamlessly integrate every feature we implemented. It was also difficult to develop the UI under time pressure, and we ran into some minor issues deploying Cloud Functions, which we eventually fixed.

Accomplishments that we're proud of

We're happy that we made a lot of progress in very little time. The UI looks really nice, and after some trial and error we built a full Google Cloud pipeline that processes videos automatically. We also implemented several advanced features: ML-powered text summarization, handwritten math solving, and keyword identification and highlighting.

What we learned

We learned about the power of cloud computing, and how important it is to do preliminary research into which resources and APIs are available before starting a project. We also learned how to do OCR on math equations, use various Google Cloud APIs for natural language processing, and create triggers that fire whenever a file is uploaded.

What's next for CatchUp

The immediate next steps for CatchUp are integrating the text summaries, a side-by-side video player, and optical character recognition within the recording.
