Izwi

Local-first audio inference engine for TTS, ASR, and voice AI workflows.

Website • Documentation • Releases • Getting Started

Overview

Izwi is a privacy-focused audio AI platform that runs entirely on your machine. No cloud services, no API keys, no data leaving your device.

Core capabilities:

Voice Mode — Real-time voice conversations with AI
Text-to-Speech — Generate natural speech from text
Studio — Build long-form TTS projects and exports
Speech Recognition — Convert audio to text with high accuracy
Speaker Diarization — Identify and separate multiple speakers
Voice Cloning — Clone any voice from a short audio sample
Voice Design — Create custom voices from text descriptions
Forced Alignment — Word-level audio-text alignment
Chat — Text-based AI conversations

The server exposes OpenAI-compatible API routes under /v1.
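
As a rough illustration of addressing those routes from a script, the sketch below builds URLs under the /v1 prefix and wraps one request in a function. It makes assumptions: the base URL is the http://localhost:8080 address from the Quick Start, `izwi_v1_url` and `izwi_list_models` are hypothetical helper names, and the `models` route is assumed to exist only because it is part of the OpenAI API surface that Izwi says it is compatible with. Check the documentation for the routes Izwi actually implements.

```shell
# Base URL of a locally running `izwi serve` (taken from the Quick Start).
IZWI_URL="http://localhost:8080"

# Hypothetical helper: join a route name onto the /v1 prefix.
# Strips a trailing slash from the base and a leading slash from the route.
izwi_v1_url() {
  printf '%s/v1/%s\n' "${IZWI_URL%/}" "${1#/}"
}

# Hypothetical helper: list models through the OpenAI-style route.
# ASSUMPTION: Izwi serves GET /v1/models as the OpenAI API does.
izwi_list_models() {
  curl -sS "$(izwi_v1_url models)"
}
```

With the server running, `izwi_list_models` would print whatever the server returns for that route. Nothing is sent until the function is called.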

Quick Install

macOS

Download the latest .dmg from GitHub Releases:

1. Open the .dmg file.
2. Drag Izwi.app to Applications.
3. Launch Izwi.

Linux

wget https://github.com/izwi-ai/izwi/releases/latest/download/izwi_amd64.deb
sudo dpkg -i izwi_amd64.deb

Windows

Download and run the installer from GitHub Releases.

Full installation guides: macOS • Linux • Windows • From Source

Quick Start

1. Start the server

izwi serve

Open http://localhost:8080 in your browser.

2. Download a model

izwi pull Qwen3-TTS-12Hz-0.6B-Base

3. Generate speech

izwi tts "Hello from Izwi!" --output hello.wav

4. Transcribe audio

izwi pull Parakeet-TDT-0.6B-v3
izwi transcribe audio.wav

Long-form ASR is handled automatically: Izwi splits long recordings into chunks, stitches the overlapping transcripts together, and returns the full transcript rather than only the first model window.

Optional tuning knobs:

IZWI_ASR_CHUNK_TARGET_SECS=24
IZWI_ASR_CHUNK_MAX_SECS=30
IZWI_ASR_CHUNK_OVERLAP_SECS=3
# Optional: preload models at server startup to reduce first-request cold latency.
# Comma-separated model IDs (for example Whisper-Large-v3-Turbo,Qwen3.5-4B)
IZWI_PRELOAD_MODELS=Whisper-Large-v3-Turbo
# Optional: run a short synthetic ASR warmup after preloading (enabled by default).
IZWI_WARMUP_PRELOADED_MODELS=1
IZWI_ASR_WARMUP_DURATION_MS=800
# Optional: tune text streaming queue depth when using per-character ASR streaming.
IZWI_STREAM_TEXT_QUEUE_CAPACITY=4096
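
To get a feel for how the chunk settings interact, the sketch below estimates how many windows a recording would be split into. It is illustrative arithmetic, not Izwi's actual chunking algorithm: it assumes fixed windows of the target length, each starting (target - overlap) seconds after the previous one, and `estimate_chunks` is a hypothetical helper name. The real implementation may choose boundaries differently, for example to honor IZWI_ASR_CHUNK_MAX_SECS.

```shell
# Hypothetical helper: estimate the number of ASR windows for a recording.
# Usage: estimate_chunks DURATION_SECS TARGET_SECS OVERLAP_SECS
# ASSUMPTION: fixed-length windows, each starting (target - overlap)
# seconds after the previous one. Izwi's real chunker may differ.
estimate_chunks() {
  duration=$1
  target=$2
  overlap=$3
  stride=$((target - overlap))

  # The overlap must leave a positive step between window starts.
  if [ "$stride" -le 0 ]; then
    echo "overlap must be smaller than target" >&2
    return 1
  fi

  # A recording that fits in one window needs no chunking.
  if [ "$duration" -le "$target" ]; then
    echo 1
    return 0
  fi

  # One initial window, plus ceil((duration - target) / stride) more.
  echo $(( (duration - target + stride - 1) / stride + 1 ))
}

# A 60-second recording with the values shown above (target 24, overlap 3).
estimate_chunks 60 24 3
```

Under these assumptions the 60-second recording is covered by three windows starting at 0, 21, and 42 seconds. A larger overlap gives the stitching step more shared audio to work with, at the cost of more windows.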

Anonymous Analytics (Desktop)

Izwi desktop supports optional, opt-in anonymous usage analytics powered by Aptabase.

Disabled by default until users explicitly opt in.
Can be enabled during onboarding or later in Settings.
Users can opt out at any time.
No prompts, transcripts, audio payloads, local paths, or personal identifiers are sent.

To enable analytics transport in the desktop shell, set the app key in the runtime environment:

APTABASE_APP_KEY=A-US-XXXXXXXXXXXXXXX

Use the exact key from Aptabase (for example A-US-... or A-EU-...).

Without this variable, analytics calls become no-ops.

Supported Models

Category	Models
TTS	Qwen3-TTS 12Hz (0.6B Base/CustomVoice, 1.7B Base/CustomVoice/VoiceDesign), Kokoro-82M
ASR	Qwen3-ASR GGUF (0.6B, 1.7B), Parakeet-TDT-0.6B-v3, Whisper-Large-v3-Turbo
Diarization	Sortformer 4-speaker
Chat	Qwen3 GGUF (0.6B, 1.7B, 4B, 8B), Qwen3.5 GGUF (0.8B, 2B, 4B, 9B), LFM2.5 (1.2B Instruct/Thinking GGUF), Gemma 3 (1B)
Audio	LFM2.5-Audio-1.5B-GGUF
Alignment	Qwen3-ForcedAligner-0.6B (full, 4-bit)

Run izwi list to see all available models.

Full model documentation: Models Guide

Documentation

Resource	Link
Getting Started	izwiai.com/docs/getting-started
Installation	izwiai.com/docs/installation
Features	izwiai.com/docs/features
CLI Reference	izwiai.com/docs/cli
Models	izwiai.com/docs/models
Troubleshooting	izwiai.com/docs/troubleshooting

License

Apache 2.0

Acknowledgments

Qwen3-TTS by Alibaba
Parakeet by NVIDIA
Gemma by Google
HuggingFace Hub for model hosting
