Inspiration

This was my first-ever hackathon, and I came alone from San Jose. I wanted to build something practical that solves a real communication problem: people missing key information in meetings because of language barriers, audio clarity, or fast conversation pace. I was inspired by the idea that meetings should be accessible to more people in real time, not just after the fact.

What it does

TranslateMate is a real-time meeting captioning and translation assistant built as a Chrome extension + local backend. It captures live tab audio from calls, transcribes speech, translates between selected languages, and overlays captions during the meeting. The goal is simple: make multilingual meetings easier to follow in real time, without forcing users to switch platforms or wait for post-call transcripts. This is a working prototype focused on live captions, translation, and speaker-aware transcript flow. It’s not perfect yet, but it proves the core system works end to end.

How we built it

I built TranslateMate as two connected parts:

Chrome MV3 Extension (extension/)

  • Captures tab audio using offscreen documents
  • Streams audio chunks over WebSocket
  • Renders live caption UI in-page (side panel + live caption bar)
  • Lets users choose spoken language + target translation language during a call

FastAPI Backend (backend/)

  • Receives streamed audio over WebSocket
  • Runs VAD + transcription pipeline
  • Supports ASR via NVIDIA Riva (and Whisper fallback)
  • Translates transcript output and returns caption payloads in real time
  • Handles dynamic language config updates while the session is running
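To make the backend flow above more concrete, here is a minimal, standard-library-only sketch of how a session might route incoming WebSocket frames. It is not TranslateMate's actual code. The message shapes (`"type": "config"`, `config_ack`, `speech`), the `SessionConfig` fields, and the RMS threshold are all assumptions made for illustration, and the simple energy gate stands in for a real VAD model. It also assumes audio arrives as 16-bit little-endian mono PCM.

```python
import json
import math
import struct
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class SessionConfig:
    # Hypothetical per-session settings; names and defaults are illustrative.
    source_lang: str = "en"
    target_lang: str = "es"
    vad_rms_threshold: float = 500.0


def rms_int16(chunk: bytes) -> float:
    """Root-mean-square amplitude of 16-bit little-endian mono PCM."""
    n = len(chunk) // 2
    if n == 0:
        return 0.0
    samples = struct.unpack(f"<{n}h", chunk[: n * 2])
    return math.sqrt(sum(s * s for s in samples) / n)


def handle_message(config: SessionConfig, message: Union[str, bytes]) -> Optional[dict]:
    """Route one frame: text frames carry JSON config, binary frames carry audio."""
    if isinstance(message, str):
        update = json.loads(message)
        if update.get("type") != "config":
            return None  # ignore unknown text messages
        # Apply the language change in place so the next audio chunk uses it.
        config.source_lang = update.get("source_lang", config.source_lang)
        config.target_lang = update.get("target_lang", config.target_lang)
        return {
            "type": "config_ack",
            "source_lang": config.source_lang,
            "target_lang": config.target_lang,
        }
    # Energy gate (a crude stand-in for VAD): drop chunks that look like silence.
    if rms_int16(message) < config.vad_rms_threshold:
        return None
    # A real pipeline would hand this chunk to ASR; here we only report it.
    return {"type": "speech", "num_bytes": len(message)}
```

Keeping the config on a mutable per-session object is one simple way to let a language switch take effect mid-call without restarting the stream.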

Challenges we ran into

Since I was building solo at my first hackathon, I ran into a lot of issues:

  • Audio not being picked up consistently
  • Transcription quality dropping with wrong language hints
  • Translation fallback/event loop bugs
  • Extension/backend sync and restart edge cases
  • Noisy logs and config drift affecting debugging

I spent a lot of time tuning and stabilizing speech detection and the real-time flow under hackathon constraints, and doing all of it solo made that harder.
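One common way to avoid the kind of fallback and event-loop problems listed above is to run blocking translation calls in a worker thread and try providers in order. The sketch below shows that general pattern only; it is not the project's implementation. The `Translator` signature, the timeout value, and the choice to fall back to the untranslated text are all assumptions.

```python
import asyncio
from typing import Callable, Sequence

# Hypothetical provider signature: (text, source_lang, target_lang) -> translated text.
Translator = Callable[[str, str, str], str]


async def translate_with_fallback(
    text: str,
    source_lang: str,
    target_lang: str,
    translators: Sequence[Translator],
    timeout_s: float = 2.0,
) -> str:
    """Try each translator in order; return the original text if all fail."""
    for translate in translators:
        try:
            # to_thread keeps a blocking client call off the event loop,
            # so audio frames keep flowing while translation runs.
            return await asyncio.wait_for(
                asyncio.to_thread(translate, text, source_lang, target_lang),
                timeout=timeout_s,
            )
        except Exception:
            # Provider error or timeout: move on to the next provider.
            continue
    # Showing the untranslated caption is usually better than showing nothing.
    return text
```

For example, passing `[primary, backup]` tries `primary` first and only calls `backup` if `primary` raises or exceeds the timeout.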

Accomplishments that we're proud of

  • Built a full end-to-end working prototype completely solo
  • Got live captions and translation running in-meeting
  • Added language selection and live config updates
  • Improved pickup and quality with VAD + pipeline tuning
  • Recovered from repeated runtime failures and still shipped a usable demo

For a first hackathon, I’m proud I kept iterating until it actually worked.

What we learned

  • Building Chrome MV3 extension architecture (popup/background/offscreen/content)
  • Real-time audio streaming + websocket pipeline design
  • ASR/translation integration tradeoffs (accuracy vs latency vs reliability)
  • Debugging production-like issues under time pressure
  • How much engineering discipline matters when you’re building alone

Most importantly: I learned I can push through uncertainty and still deliver.

What's next for TranslateMate

  • Improve accuracy and robustness across accents/noisy meetings
  • Add stronger diarization and cleaner speaker labeling
  • Add a sensitivity profile toggle in UI (Balanced / High Pickup)
  • Reduce translation latency and improve multilingual quality
  • Add saved transcript export + searchable meeting history
  • Build a cleaner onboarding flow and production deployment path

This project still has a long way to go, but this hackathon gave me the foundation and confidence to keep building it.