The open-source CapCut for AI dubbing — dub any short video into 6 languages, locally & free, with a live editor.
Smart auto‑defaults do the first pass; then you override every caption, voice, blur box, font & title with instant preview. Runs on your own GPU. No subscription, no uploads.
Releases · Packaging · dubs into EN / RU / ZH / ES / PT / FR
English · Русский
Important
Dub Studio is in beta and still rough around the edges. Many features need polish, and you can help shape where it goes. The big one on the roadmap is making every module swappable (ASR, LLM, vision, TTS) so you can plug in any model and tune the whole pipeline however you like. It's a large effort and I'd love your help: issues, PRs, testing on real clips, and model recipes all move it forward. If it's useful, ⭐ the repo and jump in.
Cloud dubbing tools (HeyGen, Rask, ElevenLabs) charge per minute, upload your footage and your voiceprint, and give you shallow editing. Open‑source CLI tools are powerful but Gradio‑grade — no draggable canvas, no live preview. Dub Studio is the missing middle: a premium live‑preview editor, the only one that also detects, blurs and re‑captions on‑screen text, fully local and free.
Russian original (left) → dubbed into English by Dub Studio (right) — voice and on‑screen text:
| Original (RU) | Dubbed to English |
|---|---|
| video.mp4 | output.mp4 |
| Feature | What it does |
|---|---|
| 🎙️ Faithful dub | clone the original timbre, auto‑cast per speaker by gender, or pick a pack voice, with a different voice per speaker |
| 🌍 6 languages | translate speech and on‑screen text; auto‑detects the source language |
| On‑screen text | OCR → blur the original → re‑caption localized, in the matched style (the wedge no other tool owns) |
| 🎬 Live editor | edit transcript, voices, caption style, blur boxes, titles — frame‑accurate preview at every step |
| 🎛️ Caption presets | 26 built‑in looks (karaoke / word‑by‑word / hormozi / neon / …) rendered on your frame |
| 😂 Funny remix | give a theme ("pirate", "as a news report") → the model rewrites the whole script → re‑dub |
| 🔁 Before / after | side‑by‑side original ↔ dubbed — the trust check |
| | Dub Studio | OSS CLI tools | HeyGen / Rask / ElevenLabs |
|---|---|---|---|
| Local & private (no upload) | ✅ | ✅ | ❌ |
| Free | ✅ | ✅ | ❌ |
| Live‑preview editor | ✅ | ❌ | |
| On‑screen text blur + re‑caption | ✅ | ❌ | |
| Portable (one folder) | ✅ | — |
🚀 One-click, cross-platform install via Pinokio — Windows / Linux / macOS · NVIDIA / AMD / CPU:
No install.bat, no manual CUDA wheels: Pinokio installs the right stack for your machine, and the engine falls back gracefully on non‑NVIDIA hardware. Launcher repo: timoncool/dub-studio-pinokio.
You need: Git · an NVIDIA GPU (RTX 20xx–50xx, CUDA 12.8, ~12 GB VRAM) · several GB of free disk for the wheels and models. You do not need Python, CUDA, Node or ffmpeg pre-installed — the installer fetches them all into the app folder, nothing system-wide.
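If you want to sanity-check the requirements before cloning, the sketch below shows one way to do it. It is optional and not part of Dub Studio: it assumes you happen to have a system Python (the installer itself does not need one), and the function names and the 11,500 MiB threshold are illustrative choices, set slightly under 12 GiB because a nominally 12 GB card can report a little less. The only external call is nvidia-smi with its standard `--query-gpu=memory.total` query.

```python
import shutil
import subprocess

# Assumption: "~12 GB VRAM" is treated as roughly 11,500 MiB so that cards
# reporting slightly under their nominal size still pass. Adjust freely.
MIN_VRAM_MIB = 11_500


def parse_vram_mib(smi_output: str) -> list:
    """Parse one integer (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def has_enough_vram(smi_output: str, minimum: int = MIN_VRAM_MIB) -> bool:
    """True if at least one listed GPU meets the VRAM threshold."""
    return any(mib >= minimum for mib in parse_vram_mib(smi_output))


def query_nvidia_smi() -> str:
    """Return total memory per GPU in MiB, or "" if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return ""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout if result.returncode == 0 else ""


if __name__ == "__main__":
    print("git found:", shutil.which("git") is not None)
    print("GPU with enough VRAM:", has_enough_vram(query_nvidia_smi()))
```

A failed check only means the manual install path is unlikely to work as described; it does not check the GPU generation or the driver's CUDA version.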
1 — Clone the repo
git clone https://github.com/timoncool/dub-studio.git
cd dub-studio
2 — Install: double-click install.bat (or run it from the folder). One-time setup, entirely inside the folder. It installs:
- embeddable Python 3.11 + pip
- PyTorch 2.8 (CUDA 12.8) + the engine requirements
- llama-cpp-python (Gemma GGUF, cu128) + Triton kernels
- the dub-engine package (editable install)
- ffmpeg (NVENC) + Node, then builds the web UI
- the base voice pack
- optional: a NeMo sub-venv for multi-speaker diarization (skips cleanly if it can't build)
3 — Run: double-click run.bat. It launches the local server and opens the editor at http://127.0.0.1:8765. Drop a video → the AI models (Gemma‑4 GGUF + mmproj, Parakeet, Qwen3‑TTS, Sortformer) download on first use, then it dubs. Close the window to stop.
Shortcut: you can skip step 2. On a fresh clone, run.bat auto-runs install.bat for you if the app isn't set up yet. Minimum path: clone → double-click run.bat.
Updating: git pull, then double-click update.bat (re-pulls, reinstalls the engine, rebuilds the UI).
| Script | What it does |
|---|---|
| install.bat | one-time setup: Python, CUDA wheels, engine, llama-cpp/Triton, ffmpeg, Node, UI build, voice pack |
| run.bat | start the app at http://127.0.0.1:8765 (auto-runs install.bat on first launch) |
| update.bat | git pull → reinstall the engine → rebuild the UI |
analyze() is the fixed first stage: separate → ASR (word timings) → diarize → context‑translate +
vision (caption style / titles / brands) → OCR (layout / blur boxes). It returns an editable
Project document. Every edit is a patch on that Project with a ~0.14 s CPU preview; export re‑runs
only the dirtied stages. The engine is a self‑contained package in this repo under dub-engine/, bundled into the portable build.
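The "every edit is a patch, export re-runs only the dirtied stages" idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the dub-engine code: the stage names, the `FIELD_TO_STAGES` mapping, and this toy `Project` class are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical export stages in pipeline order (not the real dub-engine names).
STAGES = ["tts", "captions", "blur", "render"]

# Hypothetical mapping from an editable field to the stages it invalidates.
FIELD_TO_STAGES = {
    "transcript": {"tts", "captions", "render"},
    "voice": {"tts", "render"},
    "caption_style": {"captions", "render"},
    "blur_boxes": {"blur", "render"},
}


@dataclass
class Project:
    data: dict = field(default_factory=dict)
    dirty: set = field(default_factory=set)

    def patch(self, key: str, value) -> None:
        """Apply one edit and mark the stages it invalidates as dirty."""
        self.data[key] = value
        self.dirty |= FIELD_TO_STAGES[key]

    def export(self) -> list:
        """Return the dirtied stages in pipeline order, then clear the flags."""
        ran = [stage for stage in STAGES if stage in self.dirty]
        self.dirty.clear()
        return ran


project = Project()
project.patch("caption_style", "neon")
project.patch("blur_boxes", [(10, 20, 200, 60)])
print(project.export())  # ['captions', 'blur', 'render']
print(project.export())  # [] (nothing changed since the last export)
```

In this toy version a caption or blur edit never touches the TTS stage, which is the point of tracking dirtiness per stage: the slowest work is redone only when an edit actually affects it.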
Stack: React 19 + Vite + Tailwind + react‑konva over JASSUB · single‑worker FastAPI · Parakeet TDT (ASR) · Sortformer (diarization) · Gemma‑4‑12B GGUF (translate + vision) · Qwen3‑TTS · ffmpeg/NVENC.
Dub Studio is beta and built in the open — your help is genuinely wanted. Issues, PRs, testing on real clips, and model recipes are all welcome; good‑first‑issues are labeled, and I aim to respond within 24 h.
On the roadmap — great places to jump in:
- Swappable modules — make ASR / LLM / vision / TTS fully pluggable, so anyone can wire in their own model and configure the whole pipeline end‑to‑end. This is the big one.
- Smarter on‑screen‑text localization — colour / contrast matching on tricky backgrounds.
- More voice packs, caption presets, and target languages.
If any of this is your thing, open an issue to claim it — I'm happy to help you get set up.
The app is open‑source; bundled models keep their own licenses (audited before each release).
| Project | What it does |
|---|---|
| Foundation Music Lab | Music generation + timeline editor |
| VibeVoice ASR | Speech recognition (ASR) |
| LavaSR | Audio super‑resolution |
| Qwen3‑TTS | Text‑to‑speech (Qwen) |
| SuperCaption Qwen3‑VL | Image captioning |
| VideoSOS | In‑browser AI video production |
| RC Stable Audio Tools | Music & audio generation |
- Nerual Dreming (t.me/nerual_dreming) — neuro-cartel.com · founder of ArtGeneration.me
- Neuro‑Soft (t.me/neuroport) — portable repacks of neural nets
If this is useful, drop a ⭐ — it helps others find the project and keeps it moving.
I build open-source software and do AI research. Most of what I create is free and available to everyone. Your donations help me keep creating without worrying about where the next meal comes from =)
All donation methods | dalink.to/nerual_dreming | boosty.to/neuro_art
- BTC: 1E7dHL22RpyhJGVpcvKdbyZgksSYkYeEBC
- ETH (ERC20): 0xb5db65adf478983186d4897ba92fe2c25c594a0c
- USDT (TRC20): TQST9Lp2TjK6FiVkn4fwfGUee7NmkxEE7C