Hackinator

Inspiration

I often struggle to make short demo videos of my projects: I either get lost in technical details or feel pressured by the time limit. Hackinator is my attempt to solve this problem, which many hackers face.

What it does

It has three core features:

  1. Generate Script and Audio: Given a silent demo video and a GitHub repository link, it analyzes the video frames in the context of your code and generates a narrated version with an AI-generated voiceover.

  2. Dubbing: Takes a video in any language and dubs it to English while preserving the original timing and video quality.

  3. Improvise: Takes an existing narrated demo, transcribes it, improves the script to sound more polished and professional, then re-generates the audio with the improved version.
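The three features can be read as three pipelines built from the same pieces. A minimal sketch of that idea follows; the type names and step labels are hypothetical and only summarize the descriptions above, not the project's actual code:

```typescript
// Hypothetical request shapes for the three features described above.
type Job =
  | { kind: "generate"; videoPath: string; repoUrl: string }
  | { kind: "dub"; videoPath: string }
  | { kind: "improvise"; videoPath: string };

// Returns the ordered steps a job would go through.
// The step labels are illustrative, not names from the real backend.
function planSteps(job: Job): string[] {
  switch (job.kind) {
    case "generate":
      return [
        "extract-frames",
        "analyze-frames-with-repo",
        "write-script",
        "text-to-speech",
        "merge-audio",
      ];
    case "dub":
      return ["dub-to-english", "merge-audio"];
    case "improvise":
      return ["transcribe", "improve-script", "text-to-speech", "merge-audio"];
  }
}
```

Modelling the request as a tagged union means a "generate" job cannot be created without a repository link, while the other two only need a video.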

How I built it

The backend is built with Bun, Express, and TypeScript. FFmpeg handles video processing. Google Gemini powers frame analysis, transcription, and script improvement, while ElevenLabs handles text-to-speech and dubbing. Supabase stores the videos, and the frontend is React with Vite.
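As an illustration of the FFmpeg step, sampling frames at a fixed interval might look roughly like the sketch below. It only builds the argument list for the `ffmpeg` command line; the function name, the output file pattern, and the default interval are assumptions, not values taken from the project:

```typescript
// Builds FFmpeg CLI arguments that save one frame every `intervalSeconds`.
// The function name, output pattern, and default interval are assumptions.
function frameExtractionArgs(
  input: string,
  outDir: string,
  intervalSeconds = 2,
): string[] {
  if (!Number.isInteger(intervalSeconds) || intervalSeconds <= 0) {
    throw new RangeError("intervalSeconds must be a positive whole number");
  }
  return [
    "-i", input,
    // The fps filter with a rate of 1/N emits one frame every N seconds.
    "-vf", `fps=1/${intervalSeconds}`,
    // Numbered image sequence: frame_0001.png, frame_0002.png, ...
    `${outDir}/frame_%04d.png`,
  ];
}
```

Calling `frameExtractionArgs("demo.mp4", "frames", 3)` yields the arguments for `ffmpeg -i demo.mp4 -vf fps=1/3 frames/frame_%04d.png`, and the resulting images can then be passed to the analysis step.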

Challenges I ran into

Coming into this hackathon alone, without a team and with a self-imposed policy of limiting my use of AI assistance, made development challenging but rewarding. The biggest hurdle came at the last moment, when I realized that ffmpeg.wasm does not support Node.js or Bun. I had to quickly pivot and rewrite the video processing logic using fluent-ffmpeg instead.
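fluent-ffmpeg drives the native `ffmpeg` binary, so the server-side work comes down to choosing the right command-line options. The sketch below is not the project's fluent-ffmpeg code; it is a plain argument builder, with a hypothetical function name and placeholder file names, showing one way to attach a narration track without re-encoding the picture:

```typescript
// Builds FFmpeg CLI arguments that put a narration track on a video.
// Function name and file names are placeholders for illustration.
function muxNarrationArgs(
  video: string,
  narration: string,
  output: string,
): string[] {
  return [
    "-i", video,       // input 0: the original video
    "-i", narration,   // input 1: the generated voiceover
    "-map", "0:v:0",   // take the video stream from input 0
    "-map", "1:a:0",   // take the audio stream from input 1
    "-c:v", "copy",    // copy the video stream untouched (no quality loss)
    "-c:a", "aac",     // encode the narration as AAC for an MP4 container
    "-y",              // overwrite the output file if it exists
    output,
  ];
}
```

Copying the video stream keeps the original quality and timing and avoids a slow re-encode. Without `-shortest`, the output runs as long as the longer of the two streams, which is worth checking when the narration and video lengths differ.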

Accomplishments that I'm proud of

  • Wrote most of the backend myself, for the first time in a long while, without relying on AI assistance
  • Built a complete end-to-end pipeline that takes a raw video and outputs a professionally narrated version
  • Successfully handled the ffmpeg crisis under time pressure

What I learned

  • Google Gemini's multimodal capabilities are surprisingly powerful: it can analyze images, transcribe audio, and generate text, all through one API
  • Always test dependencies in your target environment early
  • ElevenLabs is good for more than plain text-to-speech; features like dubbing open up some innovative ideas!

What's next for Hackinator

A browser extension could replace the current frame extraction method. Instead of extracting frames at fixed intervals, the extension could intelligently capture frames when significant UI changes occur, leading to more contextually relevant narration. I am also considering adding custom voice selection and a real-time script preview before audio generation.
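One simple way to approximate "significant UI change" is to compare each frame with the last frame that was kept and capture a new one only when the difference crosses a threshold. The sketch below assumes frames are already available as same-sized grayscale pixel buffers; the function names and the default threshold are hypothetical, and a real extension might use a very different signal, such as DOM mutations:

```typescript
// Mean absolute per-pixel difference between two grayscale frames (0 to 255).
function frameDifference(a: Uint8Array, b: Uint8Array): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("frames must be non-empty and the same size");
  }
  let total = 0;
  for (let i = 0; i < a.length; i++) {
    total += Math.abs(a[i] - b[i]);
  }
  return total / a.length;
}

// Keeps the first frame, then every frame that differs from the last
// kept frame by more than `threshold`. Returns the kept frame indices.
// The default threshold is an arbitrary illustrative value.
function selectKeyFrames(frames: Uint8Array[], threshold = 10): number[] {
  const kept: number[] = [];
  let last: Uint8Array | undefined;
  for (let index = 0; index < frames.length; index++) {
    const frame = frames[index];
    if (last === undefined || frameDifference(last, frame) > threshold) {
      kept.push(index);
      last = frame;
    }
  }
  return kept;
}
```

Comparing against the last kept frame, rather than the immediately preceding one, means a slow animation still triggers a capture eventually once the accumulated change is large enough.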
