VOCO (sergiopesch/voco): A voice-native interface layer designed for speed and precision

VOCO is a local-first Linux dictation app. Press a hotkey, speak, press it again, and VOCO types at your cursor.

VOCO can also run as an optional voice bridge for OpenClaw: speak locally, let VOCO transcribe on-device, then send the transcript to a configured OpenClaw CLI agent and type the agent's answer at your cursor.

For low-latency back-and-forth voice, VOCO also has an opt-in realtime conversation toggle. It keeps the OpenAI API key in the local Tauri backend, mints a short-lived Realtime token, and streams 24 kHz PCM audio over a WebSocket so the Linux WebView does not depend on WebRTC support.
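As a rough sizing aid for the 24 kHz PCM stream mentioned above, the sketch below computes how many raw bytes a chunk of a given duration occupies. The 24 kHz rate comes from this README; 16-bit samples and a single channel are assumptions made for illustration, and `pcm_chunk_bytes` is a hypothetical helper, not part of VOCO.

```shell
# Sample rate as stated in the README.
SAMPLE_RATE_HZ=24000
# Assumption: 16-bit PCM (2 bytes per sample). Not confirmed by the README.
BYTES_PER_SAMPLE=2
# Assumption: mono audio. Not confirmed by the README.
CHANNELS=1

# pcm_chunk_bytes MILLISECONDS
# Prints the raw byte count for that much audio, before any transport encoding.
pcm_chunk_bytes() {
  echo $(( SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * $1 / 1000 ))
}
```

Under those assumptions, `pcm_chunk_bytes 100` prints 4800, so one second of audio is about 48 kB before any framing or encoding overhead on the WebSocket.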

Install

Recommended:

wget https://raw.githubusercontent.com/sergiopesch/voco/voco.2026.0.16/install -O voco-install
chmod +x voco-install
./voco-install

Optional:

less ./voco-install

Manual .deb fallback:

wget -O voco_latest_amd64.deb https://github.com/sergiopesch/voco/releases/latest/download/voco_latest_amd64.deb
sudo dpkg -i voco_latest_amd64.deb

Primary tested path: Ubuntu and Debian.

Try It

Launch VOCO from your app menu or run voco.
Finish the short setup.
Press Alt+D.
Speak.
Press Alt+D again.
Confirm the text is inserted at your cursor.

To use OpenClaw mode, open Settings -> Output, choose Ask OpenClaw and type answer or Ask OpenClaw and speak answer, and keep the OpenClaw gateway/agent available from your shell environment. Spoken answers also require OpenClaw TTS and ffplay.

To use realtime conversation, store OPENAI_API_KEY=... in ~/.openclaw/realtime.env, then press Alt+R or open the VOCO popover and press Start realtime. Press Alt+R again or press Stop realtime to end the session. While realtime is active, the VOCO mic visual appears in the popover or hidden overlay and follows both your microphone level and the assistant's spoken response level.
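One way to create the env file described above is sketched here. The file name and the `OPENAI_API_KEY=...` line format come from this README; restricting the file to owner-only permissions is a precaution added for this example rather than a documented VOCO requirement, and `write_realtime_env` is a hypothetical helper name.

```shell
# write_realtime_env KEY [DIR]
# Writes OPENAI_API_KEY=KEY to DIR/realtime.env (DIR defaults to ~/.openclaw).
write_realtime_env() {
  key="$1"
  dir="${2:-$HOME/.openclaw}"
  mkdir -p "$dir"
  # Create the file inside a subshell with a tight umask so the key is
  # never briefly world-readable (an assumption, not a VOCO requirement).
  ( umask 077; printf 'OPENAI_API_KEY=%s\n' "$key" > "$dir/realtime.env" )
  # Also tighten permissions if the file already existed.
  chmod 600 "$dir/realtime.env"
}
```

Calling `write_realtime_env "your-key-here"` writes to the default location; pass a second argument to target a different directory while experimenting.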

Detailed realtime behavior, first-toggle guarantees, diagnostics, and QA criteria are defined in docs/realtime-conversation-spec.md.

Requirements

Ubuntu or Debian
PulseAudio or PipeWire
Wayland: ydotool, wl-clipboard, and access to the input group for the most reliable hotkey path
X11: xdotool and xclip
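The per-session tool requirements above can be checked with a small script before first launch. This is a sketch, not something shipped with VOCO: the function names are hypothetical, and it assumes the wl-clipboard package is detected through its `wl-copy` and `wl-paste` binaries.

```shell
# required_tools SESSION_TYPE
# Prints the helper binaries the README lists for "wayland" or "x11".
required_tools() {
  case "$1" in
    wayland) echo "ydotool wl-copy wl-paste" ;;  # wl-clipboard provides wl-copy/wl-paste
    x11)     echo "xdotool xclip" ;;
    *)       return 1 ;;                         # unknown session type
  esac
}

# missing_tools SESSION_TYPE
# Prints the subset of required tools that do not resolve on PATH.
missing_tools() {
  missing=""
  for tool in $(required_tools "$1"); do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  echo "${missing# }"
}
```

On most desktops the session type is available in `XDG_SESSION_TYPE`, so `missing_tools "$XDG_SESSION_TYPE"` prints nothing when everything is installed. On Wayland, `id -nG` shows whether your user is in the input group.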

Run From Source

git clone https://github.com/sergiopesch/voco.git
cd voco
npm install
./scripts/setup.sh --install
npm run dev

Useful Checks

npm run check
npm run lint
npm test
cargo test --manifest-path apps/desktop/src-tauri/Cargo.toml
npm run rehearse:release
npm run report:linux-runtime

More Help

Notes

First launch downloads the speech model once.
Single dictation recordings are currently capped at 60 seconds.
On Wayland, Alt+D and Alt+Shift+D are the most reliable hotkeys right now.
Realtime conversation uses Alt+R.
The realtime VOCO mic animation is driven by live input and output audio levels.
Wayland text insertion depends on ydotool, compositor support, and often input group access.
OpenClaw mode is opt-in and requires the openclaw CLI to be available in PATH.
Realtime conversation is opt-in and requires OPENAI_API_KEY in the environment or ~/.openclaw/realtime.env.
Config lives at ~/.config/voco/config.json.
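The two opt-in prerequisites in these notes can be verified from a shell before enabling either mode. The sketch below uses hypothetical helper names and assumes the env file holds a plain `OPENAI_API_KEY=value` line; how VOCO itself parses the file is not specified here.

```shell
# has_openclaw
# Succeeds if the openclaw CLI resolves on PATH.
has_openclaw() {
  command -v openclaw >/dev/null 2>&1
}

# has_realtime_key [ENV_FILE]
# Succeeds if OPENAI_API_KEY is set in the environment, or if ENV_FILE
# (default ~/.openclaw/realtime.env) has a non-empty OPENAI_API_KEY= line.
has_realtime_key() {
  env_file="${1:-$HOME/.openclaw/realtime.env}"
  [ -n "${OPENAI_API_KEY:-}" ] && return 0
  [ -f "$env_file" ] && grep -q '^OPENAI_API_KEY=.' "$env_file"
}
```

For example, `has_openclaw || echo "openclaw not found in PATH"` gives a quick hint about why OpenClaw mode might not respond.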

License

MIT

