Local English → Indian-language voice generator for prashnam.ai. Type your content in English, pick from 23 languages (English + 22 Indic), get an MP3 per item per language. Runs on-device after a one-time ~4.9 GB model download — no Hugging Face account, no API keys.
Three project shapes: Poll (1 question + N options), Announcement (flat body segments), IVR menu (branching call flow with a DAG editor + walk simulator).
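To make the IVR "walk simulator" idea concrete, here is a minimal sketch of walking a branching menu by keypress. The `MENU` layout, the node names, and the `walk` helper are all hypothetical illustrations; the project's actual project.json schema is not shown here and may differ.

```python
from typing import Dict, Iterable, List

# Hypothetical menu: node id -> prompt text plus keypress -> next node id.
# Leaf nodes have an empty "next" mapping. Names are illustrative only.
MENU: Dict[str, dict] = {
    "root": {"prompt": "Press 1 for results, 2 for help.",
             "next": {"1": "results", "2": "help"}},
    "results": {"prompt": "Press 1 for state results, 2 for national.",
                "next": {"1": "state", "2": "national"}},
    "help": {"prompt": "An agent will call you back.", "next": {}},
    "state": {"prompt": "State results follow.", "next": {}},
    "national": {"prompt": "National results follow.", "next": {}},
}


def walk(menu: Dict[str, dict], keys: Iterable[str], start: str = "root") -> List[str]:
    """Return the node ids visited when a caller presses `keys` in order."""
    path = [start]
    node = start
    for key in keys:
        options = menu[node]["next"]
        if key not in options:
            break  # invalid key or leaf reached: the walk stops here
        node = options[key]
        path.append(node)
    return path
```

Calling `walk(MENU, "12")` follows root, then results, then national. Because the flow is a DAG, every walk terminates at a leaf or at the first unrecognized key.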
Highlights:
- Translate + synthesize per segment. Type English; auto-translates and synthesizes audio in every selected language. Multiple takes per cell, plus a hand-edit escape hatch on the translation. Generate one segment at a time or hit Generate all.
- Merge for IVR (polls). Concatenate the question + every option into one MP3 per language — configurable gap, optional end-of-prompt beep, per-language gain slider.
- Option-order rotations. Generate multiple shuffled orderings of poll options to neutralize primacy / recency bias; pin "None of the above" to the last slot.
- Pronunciation lexicon. One entry per line, e.g. BJP=bee jay pee, global or per-language; fixes proper nouns once per project.
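The option-order rotation idea can be sketched as follows. This is an illustrative version under stated assumptions, not the project's own code: the `rotations` function, its parameters, and the fixed seed are hypothetical.

```python
import random
from typing import List, Optional


def rotations(options: List[str], count: int,
              pinned_last: Optional[str] = None, seed: int = 0) -> List[List[str]]:
    """Return up to `count` distinct shuffled orderings of `options`.

    If `pinned_last` is one of the options, it is held out of the shuffle
    and re-appended so it always occupies the final slot.
    """
    rng = random.Random(seed)  # seeded so reruns produce the same orderings
    pin = pinned_last if pinned_last in options else None
    movable = [o for o in options if o != pin]
    seen = set()
    result: List[List[str]] = []
    attempts = 0
    # Cap the attempts so the loop ends even if `count` exceeds the number
    # of distinct permutations available.
    while len(result) < count and attempts < 100 * count:
        attempts += 1
        order = movable[:]
        rng.shuffle(order)
        if pin is not None:
            order.append(pin)
        key = tuple(order)
        if key not in seen:
            seen.add(key)
            result.append(order)
    return result
```

Each ordering would then be synthesized and merged separately, so different respondents hear the options in different positions while "None of the above" stays last.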
Install:
- Install Python 3.11 if you don't already have it. (3.13 and 3.14 lack the ML wheels, so pick 3.11.)
- Get the repo: git clone https://github.com/prashnam/prashnam-voice.git
- Run install.py: double-click it in Finder/Explorer (macOS/Windows), or run python3 install.py from a terminal.
The script creates a venv, installs dependencies, launches the local server at http://127.0.0.1:8765, and opens the browser — to the setup wizard on first run, to the editor afterwards. Re-run any time; it doubles as the daily launcher and skips the slow pip step once .venv/ exists. Model weights download once from public mirrors at huggingface.co/naklitechie/*; everything runs offline after that.
If port 8765 is busy, the launcher tries successive ports up to 8775. Force a clean reinstall with rm -rf .venv && python3 install.py.
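The port fallback can be pictured with a small sketch. The launcher's real implementation is not shown in this README, so the `first_free_port` helper below is a hypothetical illustration of the idea, using only the standard library.

```python
import socket
from typing import Optional


def first_free_port(start: int = 8765, end: int = 8775,
                    host: str = "127.0.0.1") -> Optional[int]:
    """Return the first port in [start, end] that can be bound, or None."""
    for port in range(start, end + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                continue  # port is taken; try the next one
            return port  # bind succeeded; the socket closes on exit
    return None
```

A probe like this is inherently racy (another process can grab the port between the check and the server start), so a real launcher might instead bind the server socket directly and retry on failure.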
Manual install (instead of install.py):
brew install ffmpeg # macOS: pydub needs the ffmpeg binary
python3.11 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e .
prashnam-voice serve
Updating:
git pull
python3 install.py # picks up new deps and starts the server
Or, if you installed manually:
git pull
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e .
prashnam-voice serve
Existing projects under ./projects/ keep working: project.json is forward-compatible (new fields default sensibly on load). The synth cache at ~/.cache/prashnam-voice/audio/ is content-addressed, and each key includes a post-process version, so recipe changes (like a loudness-normalization fix) invalidate stale entries automatically; run prashnam-voice cache-clear to reclaim the disk space.
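To illustrate why a versioned, content-addressed key invalidates stale audio automatically, here is a minimal sketch. The field names, the `cache_key` function, and the `POST_PROCESS_VERSION` constant are assumptions made for illustration; the actual key layout used by prashnam-voice is not documented here.

```python
import hashlib
import json

# Hypothetical constant: bumped whenever the audio post-processing recipe changes.
POST_PROCESS_VERSION = 2


def cache_key(text: str, language: str, voice: str,
              post_process_version: int = POST_PROCESS_VERSION) -> str:
    """Derive a cache key from everything that affects the rendered audio."""
    payload = json.dumps(
        {"text": text, "lang": language, "voice": voice, "pp": post_process_version},
        sort_keys=True,       # stable field order, so equal inputs hash equally
        ensure_ascii=False,   # keep Indic text as-is before UTF-8 encoding
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because the version is part of the hashed payload, bumping it changes every key: old files are simply never looked up again, which is why they linger on disk until a cache-clear removes them.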
- Visual guide: annotated walkthrough of every view. Open in a browser, or visit /guide while the server is running.
- REST API, Python embedding API: for scripting against the running server.
- PLAN.md: roadmap and design decisions.
- prashnam-voice --help: every CLI command (serve, generate, batch, prefetch, cache-clear, projects).
Translation: IndicTrans2 (MIT). TTS: Indic Parler-TTS (Apache-2.0). Both are verbatim mirrors of AI4Bharat models — cite AI4Bharat in any research write-up.