Inspiration
Video is an increasingly important part of today's digital landscape. As it becomes more prevalent, it is more valuable than ever to share content with audiences around the world, including viewers who do not speak English.
What it does
Oh-Cap takes a video as input and produces a transcript of everything that was said, along with a summary of the main points and takeaways. It uses natural language processing and computer vision to analyze the video, generate the text outputs, and translate the transcript into other languages. Oh-Cap is designed to be fast, easy, and affordable: turning videos into reusable text saves time, money, and effort, and helps you get more out of your content.
How we built it
- Set up a self-hosted file server to store and serve videos for the project.
- Built a translation server around a deep learning model, providing fast, reliable, local translation.
- Back-end based on NestJS.
- Front-end based on React.
- Used Google Cloud to enable OAuth requests, which lets users upload videos and subtitles directly from the website to YouTube.
- Created a Docker Compose file so the project can be easily deployed and run in any environment.
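
The transcript-to-subtitle step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Segment` shape, the `Translate` callback (standing in for a call to the local translation server), and the choice of SRT as the subtitle format are all assumptions.

```typescript
// Hypothetical shape of one transcript segment; times are in seconds.
interface Segment {
  start: number;
  end: number;
  text: string;
}

// Any async translator, e.g. a thin wrapper around the local translation server (assumed).
type Translate = (text: string, targetLang: string) => Promise<string>;

// Left-pad a non-negative integer with zeros to the given width.
function pad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm
function srtTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Render segments as SRT cues: index, time range, text, with a blank line between cues.
function toSrt(segments: Segment[]): string {
  const cues = segments.map(
    (seg, i) => `${i + 1}\n${srtTime(seg.start)} --> ${srtTime(seg.end)}\n${seg.text}`
  );
  return cues.join("\n\n") + "\n";
}

// Translate each segment's text while keeping the original timing.
async function translateSegments(
  segments: Segment[],
  translate: Translate,
  targetLang: string
): Promise<Segment[]> {
  const texts = await Promise.all(segments.map((seg) => translate(seg.text, targetLang)));
  return segments.map((seg, i) => ({ ...seg, text: texts[i] }));
}
```

Keeping the timing separate from the text means the same segments can be translated into several languages and rendered to subtitles without re-transcribing the video.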
Challenges we ran into
- The deep learning translation model (Argos) was large (~25 GB), which made it difficult to manage and deploy.
- Handling file requests in NestJS.
- Sending requests from React while managing state and asynchronous behavior.
- The lack of API documentation and blog posts for Respell, the file server, and the deep learning model made troubleshooting and deployment more challenging.
- Optimizing the user interface to make it user-friendly and responsive. We had to experiment with different techniques to make the website work.
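
One common way to tame request state in a React front end is a small reducer over an explicit status type. The sketch below is illustrative only and is not taken from the project; the `RequestState` and `RequestAction` names are hypothetical.

```typescript
// Every state a single request (e.g. a video upload) can be in.
type RequestState<T> =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; data: T }
  | { status: "error"; message: string };

// Events that move a request from one state to another.
type RequestAction<T> =
  | { type: "start" }
  | { type: "resolve"; data: T }
  | { type: "reject"; message: string }
  | { type: "reset" };

// Pure transition function; it has no React dependency, so it is easy to test.
function requestReducer<T>(
  state: RequestState<T>,
  action: RequestAction<T>
): RequestState<T> {
  switch (action.type) {
    case "start":
      return { status: "loading" };
    case "resolve":
      // Ignore responses that arrive when no request is in flight (e.g. after a reset).
      return state.status === "loading"
        ? { status: "success", data: action.data }
        : state;
    case "reject":
      return state.status === "loading"
        ? { status: "error", message: action.message }
        : state;
    case "reset":
      return { status: "idle" };
  }
}
```

In a component, a reducer like this can be passed to React's `useReducer`, with `start` dispatched before the request is sent and `resolve` or `reject` dispatched when the promise settles.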
Accomplishments that we're proud of
- Apart from Respell, every component is hosted locally and can run without an internet connection.
- Used DevOps tooling such as Docker to deploy the application.
What we learned
- React
- NestJS
- Deep learning models
- DevOps
- Handling prompt and message input for Respell.
- Optimizing the workflow for better response times.
What's next for Oh-Cap
- Improve the UI/UX.
- Add more translation languages.
- Integrate with TikTok, Instagram, Twitter, and more.
- Improve the response time and accuracy of the generation and translation models.
Built With
- ci/cd
- deep-learning
- devops
- docker
- github-actions
- google-cloud
- google-oauth
- nestjs
- node.js
- python
- react
- respell
- typescript
- youtube