feat(tts): add Piper as a native local TTS provider (closes #8508) by teknium1 · Pull Request #17885 · NousResearch/hermes-agent

teknium1 · 2026-04-30T09:52:21Z

Summary

Closes #8508. Adds Piper (OHF-Voice/piper1-gpl) as a native local TTS provider — fast neural TTS with 44 languages, zero API keys. One-keystroke install via hermes tools.

What a user does

hermes tools  →  Voice & TTS  →  Piper

Hermes runs pip install piper-tts. Then:

tts:
  provider: piper
  piper:
    voice: en_US-lessac-medium

On the first TTS call, Hermes downloads the voice (~20-90MB depending on quality tier) into ~/.hermes/cache/piper-voices/ and reuses it forever after. Voice catalog: 44 languages at OHF-Voice/piper1-gpl VOICES.md.
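The lookup described above (explicit path, cached voice name, or first-use download) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `tools/tts_tool.py`: the name `resolve_voice_path`, the injected `download` callable, and the error types are hypothetical. Only the default voice and the `<voice>.onnx` plus `<voice>.onnx.json` cache layout come from this PR.

```python
from pathlib import Path
from typing import Callable

DEFAULT_VOICE = "en_US-lessac-medium"  # default named in this PR


def resolve_voice_path(
    voice: str,
    cache_dir: Path,
    download: Callable[[str, Path], None],
) -> Path:
    """Return the .onnx model path for a voice name or an explicit path.

    `download(voice, cache_dir)` is a hypothetical hook that fetches the
    voice files; it is injected so the lookup logic can be tested offline.
    """
    # An empty setting falls back to the default voice.
    voice = (voice or "").strip() or DEFAULT_VOICE

    # Branch 1: the user pointed at their own model file.
    if voice.endswith(".onnx"):
        model = Path(voice).expanduser()
        if not model.is_file():
            raise FileNotFoundError(f"Piper model not found: {model}")
        return model

    # Branch 2: a voice name whose files are already cached.
    model = cache_dir / f"{voice}.onnx"
    config = cache_dir / f"{voice}.onnx.json"
    if model.is_file() and config.is_file():
        return model

    # Branch 3: first use, so download and then verify the files exist.
    cache_dir.mkdir(parents=True, exist_ok=True)
    download(voice, cache_dir)
    if not (model.is_file() and config.is_file()):
        raise RuntimeError(f"Download finished but {voice} files are missing")
    return model
```

Checking for the files after the download returns covers the "successful exit but missing files" case that the test suite lists as its own branch.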

Changes

| File | Purpose |
| --- | --- |
| `tools/tts_tool.py` | New `_import_piper`, `_check_piper_available`, `_resolve_piper_voice_path` (path OR voice name with auto-download), `_generate_piper_tts` (module-cached voice, optional `SynthesisConfig` knobs). Wired into `text_to_speech_tool`, `check_tts_requirements`, and the ffmpeg-opus conversion branch. `piper` added to `BUILTIN_TTS_PROVIDERS` and `PROVIDER_MAX_TEXT_LENGTH`. |
| `hermes_cli/tools_config.py` | Piper provider entry + `post_setup` handler that runs `pip install piper-tts` and prints the voice-catalog URL. |
| `hermes_cli/config.py` | `tts.piper.voice = en_US-lessac-medium` default + commented advanced knobs. |
| `website/docs/user-guide/features/tts.md` | New dedicated Piper (local, 44 languages) section; ten-provider table; ffmpeg note. |
| `website/docs/user-guide/features/overview.md` | Bumped count to ten native providers. |
| `tests/tools/test_tts_piper.py` | 16 new tests: registration, voice-path resolution branches (direct path / cached / fresh download / failure / empty → default), voice cache reuse, advanced knobs, end-to-end dispatch, missing-package error, `check_tts_requirements` toggle, shadow-protection regression guard. |

E2E verification (live, on-machine)

pip install piper-tts (inside repo venv)
Config: tts.provider: piper, tts.piper.voice: en_US-lessac-low (~14MB tier)
text_to_speech_tool(text="Hello from Piper in Hermes Agent!") → voice auto-downloaded from HuggingFace, WAV synthesized, ffmpeg-converted to Ogg/Opus (valid OggS header).
Second call: 60ms, hits voice cache.
Cache dir confirmed populated with en_US-lessac-low.onnx + en_US-lessac-low.onnx.json.
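The "valid OggS header" check mentioned in these steps can be reproduced with a few lines. This is a minimal sketch rather than the verification actually used for the PR: the helper name `looks_like_ogg` is hypothetical, and it relies only on the fact that an Ogg stream begins with the four-byte capture pattern `OggS`.

```python
from pathlib import Path

OGG_CAPTURE_PATTERN = b"OggS"  # first four bytes of every Ogg page


def looks_like_ogg(path: Path) -> bool:
    """Return True if the file starts with the Ogg capture pattern."""
    with open(path, "rb") as fh:
        return fh.read(4) == OGG_CAPTURE_PATTERN
```

This only confirms the container signature; it does not prove the payload is Opus or that the audio is playable.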

E2E caught a real bug: the first pass used -d as the download-dir flag, but python -m piper.download_voices actually wants --download-dir. Fixed before opening this PR. Unit tests alone would not have caught it.
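A mocked `subprocess.run` accepts any argument list, which is why a wrong flag can pass unit tests and only fail against the real CLI. The sketch below shows one way the command might be assembled with the corrected `--download-dir` flag. The function names are hypothetical, and passing the voice name as a positional argument is an assumption about the `piper.download_voices` CLI; only the module name and the flag are taken from this PR.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, List


def build_download_command(voice: str, cache_dir: Path) -> List[str]:
    """Build the argv for downloading one voice into cache_dir."""
    return [
        sys.executable,
        "-m",
        "piper.download_voices",
        "--download-dir",  # the long flag; the short -d form was the bug
        str(cache_dir),
        voice,  # assumption: the voice name is passed positionally
    ]


def download_voice(
    voice: str,
    cache_dir: Path,
    run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> None:
    """Run the download and raise if the CLI exits with a non-zero status."""
    result = run(
        build_download_command(voice, cache_dir),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        raise RuntimeError(f"Voice download failed: {result.stderr.strip()}")
```

Keeping the argv in its own function lets a unit test assert on the exact flags, while a live run remains the only check that the real CLI accepts them.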

Why native instead of command provider

Native Piper addresses what #8508 actually asked for: broader local-language coverage that just works from hermes setup. The shell command-provider path (landed in #17843) still exists for custom workflows — users with their own trained .onnx models can set tts.piper.voice to an absolute path, or go full command-provider for non-standard wrappers. Both paths documented.
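Because the native provider and the command-provider registry share one namespace, the PR adds `piper` to `BUILTIN_TTS_PROVIDERS` so that a `tts.providers.piper.command` entry cannot shadow it. A rough sketch of that precedence rule follows. The function `select_provider`, its return shape, and the abbreviated provider set are assumptions made for illustration and do not reproduce the dispatch code in `tools/tts_tool.py`.

```python
from typing import Any, Dict, Tuple

# Illustrative subset only; the real set lists every built-in provider.
BUILTIN_TTS_PROVIDERS = frozenset({"edge", "neutts", "kittentts", "piper"})


def select_provider(tts_config: Dict[str, Any]) -> Tuple[str, str]:
    """Return ("builtin", name) or ("command", name) for the configured provider.

    Built-in names are checked first, so a command provider registered
    under the same name is never consulted.
    """
    name = str(tts_config.get("provider", "")).strip().lower()
    if name in BUILTIN_TTS_PROVIDERS:
        return ("builtin", name)

    entry = (tts_config.get("providers") or {}).get(name) or {}
    if entry.get("command"):
        return ("command", name)

    raise ValueError(f"Unknown TTS provider: {name!r}")
```

Under this rule a custom wrapper has to be registered under a different name, such as the `piper-custom` example mentioned in the docs changes.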

Validation

New suite: tests/tools/test_tts_piper.py — 16 passed
Full TTS + tools_config + setup + gateway-base targeted suite — 315 passed, zero regressions
Live E2E — voice download + synth + cache reuse confirmed on real machine

Relation to #17211 / #17843

This is PR 3 in the three-part redesign:

1. #17833 (merged), "feat(gateway): centralize audio routing + FLAC support + Telegram doc fallback": audio routing cleanup extracted from @versun's #17211, "feat(tts): add local_command provider"
2. #17843 (merged), "feat(tts): add command-type provider registry under tts.providers.<name>": command-provider registry reshape of #17211
3. This PR: native Piper provider, closes #8508, "[Feature]: Improve built-in local TTS language coverage with Piper"

Together these make Hermes first-class for both casual users ("I want Piper working") and power users ("I want to wire in my custom CLI").

Piper (OHF-Voice/piper1-gpl) is a fast, local neural TTS engine from the Home Assistant project that supports 44 languages with zero API keys. Adds it as a native built-in provider alongside edge/neutts/kittentts, installable via 'hermes tools' with one keystroke.

What ships:

- New 'piper' built-in provider in tools/tts_tool.py
  - Lazy import via _import_piper()
  - Module-level voice cache keyed on (model_path, use_cuda) so switching voices doesn't invalidate older cached voices
  - _resolve_piper_voice_path() accepts either an absolute .onnx path or a voice name (auto-downloaded on first use via 'python -m piper.download_voices --download-dir <cache>')
  - Voice cache at ~/.hermes/cache/piper-voices/ (profile-aware via get_hermes_dir)
  - Optional SynthesisConfig knobs: length_scale, noise_scale, noise_w_scale, volume, normalize_audio, use_cuda. They are passed through only when configured, so older piper-tts versions aren't broken
  - WAV output then ffmpeg conversion path (same as neutts/kittentts) so Telegram voice bubbles work when ffmpeg is present
  - Piper added to BUILTIN_TTS_PROVIDERS so a user's tts.providers.piper.command cannot shadow the native provider (regression test included)
- 'hermes tools' wizard entry
  - Piper appears under Voice and TTS as local free, with 'pip install piper-tts' auto-install via post_setup handler
  - Prints voice-catalog URL and default-voice info after install
- config.yaml defaults
  - tts.piper.voice defaults to en_US-lessac-medium
  - Commented advanced knobs for discoverability
- Docs
  - New 'Piper (local, 44 languages)' section in features/tts.md explaining install path, voice switching, pre-downloaded voices, and advanced knobs
  - Piper listed in the ten-provider table and ffmpeg table
  - Custom-command-providers section updated to drop the Piper example (now native) and add a piper-custom example for users with their own trained .onnx models
  - overview.md bumps provider count to ten
- Tests (tests/tools/test_tts_piper.py, 16 tests)
  - Registration (BUILTIN_TTS_PROVIDERS, PROVIDER_MAX_TEXT_LENGTH)
  - _resolve_piper_voice_path across every branch: direct .onnx path, cached voice name, fresh download with correct CLI args, download failure, successful-exit-but-missing-files, empty voice to default
  - _generate_piper_tts: loads voice once, reuses cache, voice-name download wiring, advanced knobs flow through SynthesisConfig
  - text_to_speech_tool end-to-end dispatch and missing-package error
  - check_tts_requirements: piper availability toggles the return value
  - Regression guard: piper cannot be shadowed by a command provider with the same name
  - Pre-existing test_tts_mistral test broadened to mock the new piper/kittentts/command-provider checks (otherwise it false-passes when piper is installed in the test venv)

E2E verification (live):

Actual pip install piper-tts, config piper + en_US-lessac-low, text_to_speech_tool call, voice auto-downloaded from HuggingFace, WAV synthesized, ffmpeg-converted to Ogg/Opus. Second call hits the cache (~60ms). Cache dir populated with .onnx and .onnx.json.

This caught a real bug during development: the first pass used '-d' as the download-dir flag; the actual piper.download_voices CLI wants '--download-dir'. Fixed before PR opened.
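The "passed through only when configured" rule for the advanced knobs can be sketched as a filter over the `tts.piper` config mapping. This is a hedged illustration: `collect_synthesis_kwargs` is a hypothetical helper, and treating `use_cuda` as a load-time option rather than a synthesis keyword is an assumption based on it being part of the voice-cache key.

```python
from typing import Any, Dict

# Knob names listed in this PR; use_cuda is left out here on the
# assumption that it is applied when the voice is loaded.
SYNTHESIS_KNOBS = (
    "length_scale",
    "noise_scale",
    "noise_w_scale",
    "volume",
    "normalize_audio",
)


def collect_synthesis_kwargs(piper_config: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the knobs the user actually set (unset means missing or None)."""
    return {
        key: piper_config[key]
        for key in SYNTHESIS_KNOBS
        if piper_config.get(key) is not None
    }
```

With an empty result the caller can skip building a `SynthesisConfig` altogether, which is presumably how installs of older piper-tts versions without that class keep working.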


teknium1 merged commit 8d302e3 into main Apr 30, 2026
13 checks passed

teknium1 deleted the feat/tts-piper branch April 30, 2026 09:53

This was referenced Apr 30, 2026

feat(tts): add local_command provider #17211

Closed

[Feature]: Add VoxCPM2 as an optional local TTS provider via external helper #11688

Open

[Feature]: Improve built-in local TTS language coverage with Piper #8508

Closed

alt-glitch added labels Apr 30, 2026: type/feature (New feature or request), P3 (Low: cosmetic, nice to have), tool/tts (Text-to-speech and transcription), comp/cli (CLI entry point, hermes_cli/, setup wizard)

szafranski mentioned this pull request Apr 30, 2026

feat(tts): add Piper setup support and fix Telegram voice note metadata #7815

Open

19 tasks

github-actions Bot mentioned this pull request May 1, 2026

chore: bump NousResearch/hermes-agent version from v2026.4.23 to v2026.4.30 Docker-Hub-sirmark/docker-hermes-agent#4

Merged

subinium mentioned this pull request May 11, 2026

feat(tools): TTS provider registry + Piper local TTS (Hermes v0.12 parity) subinium/CrowClaw#325

Closed

4 tasks

szafranski mentioned this pull request May 30, 2026

[Bug]: piper missing from hermes setup tts menu — TTS provider list drift across CLI surfaces #35439

Open
