Skip to content

feat(tts): add register_tts_provider() plugin hook (salvage of #30420)#31745

Merged
teknium1 merged 1 commit into
mainfrom
hermes/hermes-e3a590fb
May 25, 2026
Merged

feat(tts): add register_tts_provider() plugin hook (salvage of #30420)#31745
teknium1 merged 1 commit into
mainfrom
hermes/hermes-e3a590fb

Conversation

@teknium1

@teknium1 teknium1 commented May 25, 2026

Copy link
Copy Markdown
Contributor

Salvage of #30420 onto current main (was 195 commits behind). Closes #30398.

Summary

Adds ctx.register_tts_provider() plugin hook so TTS engines that don't fit the shell-template grammar (Python SDKs without CLI, streaming, voice-listing APIs, OAuth flows) can register a Python backend. Coexists with the existing tts.providers.<name>: type: command registry from #17843 — does not replace it.

Resolution order

  1. Built-in name → built-in dispatch (always wins)
  2. tts.providers.<name>: type: command → command-provider dispatch
  3. Plugin-registered TTSProvider → plugin dispatch (new)
  4. Unknown → Edge TTS fallback

Built-ins-always-win enforced at 3 layers (registry rejection + dispatcher short-circuit + picker filter). Command-providers-win-over-plugins enforced at 2 layers (caller + defensive re-check).

Changes (kshitijk4poor's, +2037/-1, 13 files)

  • agent/tts_provider.pyTTSProvider(ABC) mirroring image_gen_provider.py / video_gen_provider.py
  • agent/tts_registry.py — registry with _BUILTIN_NAMES reject-shadow invariant
  • hermes_cli/plugins.py (+38) — register_tts_provider() on PluginContext
  • tools/tts_tool.py (+148) — _dispatch_to_plugin_provider() + walrus-elif wiring, built-in elif chain untouched
  • hermes_cli/tools_config.py (+62) — _plugin_tts_providers() injects rows into Text-to-Speech picker
  • website/docs/user-guide/features/tts.md (+79) — new "Python plugin providers" section with decision table
  • website/docs/user-guide/features/plugins.md (+1) — TTS row mentions both surfaces
  • Tests: 95 unit (registry, dispatch, picker, plugin loading) + 9 subprocess parity scenarios

Validation

Result
14 TTS test files (265 tests) 265 passed, 0 failed
Parity harness vs main 9/9 OK (1 intentional [DIFF] for plugin-registered case)
E2E: _dispatch_to_plugin_provider real call written audio file, all 4 invariants verified (builtin short-circuit, command-wins, unknown→None, plugin dispatch writes path)
Built-in shadow attempt (name="edge") warning + silent ignore, get_provider("edge") is None

Context

This realizes the register_tts_provider() Teknium flagged as a future nice-to-have in commit a401f81 ("docs(plugins): correct TTS/STT pluggability"). The 10 built-in providers (edge, openai, elevenlabs, minimax, gemini, mistral, xai, piper, kittentts, neutts) stay inline — this PR does NOT migrate any of them.

Original PR: #30420
Contributor: @kshitijk4poor (also filed design issue #30398 themselves)

Infographic

register_tts_provider plugin hook

Adds a `TTSProvider(ABC)` + `register_tts_provider()` extension point
to the plugin context API, **alongside** the existing config-driven
`tts.providers.<name>: type: command` registry from PR #17843. This is
additive — the command-provider surface stays as the primary way to
add a TTS backend.

The hook covers cases the shell-template grammar can't reasonably
express:

- Native Python SDKs without a CLI (Cartesia, Fish Audio, etc.)
- Streaming synthesis (chunked Opus → voice-bubble delivery)
- Voice metadata API for the `hermes tools` picker
- OAuth-refreshing auth flows

None of the 10 inline built-in providers (`edge`, `openai`,
`elevenlabs`, `minimax`, `gemini`, `mistral`, `xai`, `piper`,
`kittentts`, `neutts`) are migrated to plugins. They stay inline. The
hook is for *new* engines that aren't built-in.

## Resolution order

The dispatcher's resolution order is the load-bearing invariant:

1. `tts.provider` is a built-in name → built-in dispatch. **Always wins.**
2. `tts.provider` matches `tts.providers.<name>` with `command:` set
   → command-provider dispatch (PR #17843).
3. `tts.provider` matches a plugin-registered `TTSProvider`
   → plugin dispatch (new).
4. No match → falls through to Edge TTS default (legacy behavior).

Built-ins-always-win is enforced at THREE layers:
- Registry: `register_provider()` rejects shadowing names with a warning.
- Dispatcher: `_dispatch_to_plugin_provider()` short-circuits built-in
  names defensively before consulting the registry.
- Picker: `_plugin_tts_providers()` filters built-in shadows out of
  the `hermes tools` row list defensively.

Command-providers-win-over-plugins is enforced at TWO layers:
- The caller in `text_to_speech_tool` checks
  `_resolve_command_provider_config` first.
- `_dispatch_to_plugin_provider` re-checks for a same-name command
  config defensively so a refactor of the caller can't silently break
  the invariant.

## New files

- `agent/tts_provider.py` — `TTSProvider(ABC)` with `synthesize()` (required),
  `list_voices()`, `list_models()`, `get_setup_schema()`, `stream()`,
  `voice_compatible` (all optional with sane defaults). Mirrors
  `agent/image_gen_provider.py` shape.
- `agent/tts_registry.py` — `register_provider`/`get_provider`/`list_providers`
  with `_BUILTIN_NAMES` reject-shadowing invariant. Mirrors
  `agent/image_gen_registry.py` shape.
- `plugins/tts/...` directory ready for community plugins (none shipped).

## Modified files

- `hermes_cli/plugins.py` — `register_tts_provider()` method on
  `PluginContext`. Matches the gating shape of
  `register_image_gen_provider()` / `register_browser_provider()`.
- `tools/tts_tool.py` — `_dispatch_to_plugin_provider()` +
  `_plugin_provider_is_voice_compatible()` + walrus-elif wiring into
  the main dispatcher. Built-in elif chain untouched.
- `hermes_cli/tools_config.py` — `_plugin_tts_providers()` injects
  plugin rows into the Text-to-Speech picker category alongside the
  10 hardcoded built-in rows.

## Tests

- `tests/agent/test_tts_registry.py` — 47 tests covering registration,
  lookup, ABC contract, helpers, AND a `TestBuiltinSync` regression
  test that fails if `agent.tts_registry._BUILTIN_NAMES` drifts from
  `tools.tts_tool.BUILTIN_TTS_PROVIDERS` (kept duplicated due to
  circular import constraints).
- `tests/tools/test_tts_plugin_dispatch.py` — 35 tests covering
  built-in-always-wins, command-wins-over-plugin, plugin dispatch,
  exception passthrough, voice_compatible helper.
- `tests/hermes_cli/test_tts_picker.py` — 10 tests covering the
  picker surface, builtin shadowing defense, integration with
  `_visible_providers`.
- `tests/hermes_cli/test_plugins_tts_registration.py` — 3 end-to-end
  tests via `PluginManager.discover_and_load()`.
- `tests/plugins/tts/check_parity_vs_main.py` — 9-scenario subprocess
  parity harness vs `origin/main`. The only intentional diff is
  `fallback_edge → plugin` for the `plugin-installed` scenario.

## Verification

- 95/95 new tests pass.
- 170/170 pre-existing TTS tests (test_tts_command_providers,
  test_tts_max_text_length, test_tts_speed, etc.) pass unchanged.
- Parity harness against `origin/main`: 8 OK + 1 expected DIFF.
- E2E smoke: a registered plugin's `synthesize()` is called via
  `text_to_speech_tool` with the standard JSON envelope returned.
- Ruff clean on all touched files.

## Docs

- `website/docs/user-guide/features/tts.md` — new "Python plugin
  providers" section with a decision table (command-provider vs
  plugin), minimal plugin example, and the optional-hook reference.
- `website/docs/user-guide/features/plugins.md` — TTS row updated to
  mention both surfaces (command-provider primary, plugin for
  SDK/streaming).

Closes #30398
@github-actions

Copy link
Copy Markdown
Contributor

🔎 Lint report: hermes/hermes-e3a590fb vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9114 on HEAD, 9103 on base (🆕 +11)

🆕 New issues (10):

Rule Count
invalid-argument-type 5
unresolved-import 3
invalid-assignment 1
not-subscriptable 1
First entries
tests/agent/test_tts_registry.py:21: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/hermes_cli/test_tts_picker.py:11: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/agent/test_tts_registry.py:147: [invalid-argument-type] invalid-argument-type: Argument to function `get_provider` is incorrect: Expected `str`, found `Literal[123]`
tests/agent/test_tts_registry.py:282: [invalid-argument-type] invalid-argument-type: Argument to function `resolve_output_format` is incorrect: Expected `str | None`, found `Literal[123]`
tests/hermes_cli/test_tts_picker.py:105: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["bogus"]` and value of type `_NoName` on object of type `dict[str, TTSProvider]`
tests/tools/test_tts_plugin_dispatch.py:213: [not-subscriptable] not-subscriptable: Cannot subscript object of type `None` with no `__getitem__` method
tests/agent/test_tts_registry.py:283: [invalid-argument-type] invalid-argument-type: Argument to function `resolve_output_format` is incorrect: Expected `str | None`, found `list[Unknown]`
tests/agent/test_tts_registry.py:84: [invalid-argument-type] invalid-argument-type: Argument to function `register_provider` is incorrect: Expected `TTSProvider`, found `Literal["not a provider"]`
tests/agent/test_tts_registry.py:146: [invalid-argument-type] invalid-argument-type: Argument to function `get_provider` is incorrect: Expected `str`, found `None`
tests/tools/test_tts_plugin_dispatch.py:33: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`

✅ Fixed issues: none

Unchanged: 4850 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@alt-glitch alt-glitch added type/feature New feature or request comp/plugins Plugin system and bundled plugins tool/tts Text-to-speech and transcription P3 Low — cosmetic, nice to have labels May 25, 2026
@teknium1 teknium1 merged commit 00ec0b6 into main May 25, 2026
34 of 35 checks passed
@teknium1 teknium1 deleted the hermes/hermes-e3a590fb branch May 25, 2026 01:04
teknium1 added a commit that referenced this pull request May 27, 2026
)

Background processes whose command contains `gh pr view --json
statusCheckRollup` or `gh pr checks | jq` now get a runtime hint in
the result pointing at the canonical green-ci-policy snippets. The
homebrew shape has caused at least seven silent CI-watcher failures
in the past two weeks (#31329, #31448, #31695, #31709, #31745,
#32264, #33131) — each one a different jq/awk/grep variation of the
same fundamental problem (stdout buffering, jq null-key edge cases,
conclusion-vs-status confusion, TTY-only banner grepping).

The skill that documents this anti-pattern is excellent, but a skill
only fires if the agent loads it. The tool surface fires on every
misuse. This is the embed-footguns-in-tool-surface pattern from
PR #31289 applied to a recurring failure mode that's outgrown
skill-only enforcement.

Detector is deliberately narrow — flags two specific shapes:

  1. Any command containing `statusCheckRollup` (the JSON-API path —
     conclusion vs status field semantics keep burning us).
  2. `gh pr view` / `gh pr checks` combined with `jq` (gh pr
     checks doesn't emit JSON, so any `| jq` here is confused intent;
     the canonical column-2 poller uses awk-on-tabs, not jq).

Does NOT flag the blessed column-2 awk-on-tabs poller (which uses
`awk -F"\t" "\==\"pending\""`) or the exit-code-driven
`gh pr checks $PR >/dev/null` snippet.

Hint composes with the existing background-without-notify_on_complete
hint — both can fire on the same call. Each is independently
actionable.

Tests:
- 4 new cases in tests/tools/test_notify_on_complete.py
- test_homebrew_ci_poller_via_statusCheckRollup_emits_hint (positive)
- test_homebrew_ci_poller_via_gh_pr_checks_piped_to_jq_emits_hint (positive)
- test_canonical_column2_awk_poller_does_not_emit_homebrew_hint (negative)
- test_canonical_gh_pr_checks_exit_code_loop_does_not_emit_hint (negative)
- test_non_ci_background_command_does_not_emit_homebrew_hint (negative)
- 30/30 passing (was 26)
mathias3 pushed a commit to mathias3/hermes-agent that referenced this pull request May 28, 2026
…sResearch#33142)

Background processes whose command contains `gh pr view --json
statusCheckRollup` or `gh pr checks | jq` now get a runtime hint in
the result pointing at the canonical green-ci-policy snippets. The
homebrew shape has caused at least seven silent CI-watcher failures
in the past two weeks (NousResearch#31329, NousResearch#31448, NousResearch#31695, NousResearch#31709, NousResearch#31745,
NousResearch#32264, NousResearch#33131) — each one a different jq/awk/grep variation of the
same fundamental problem (stdout buffering, jq null-key edge cases,
conclusion-vs-status confusion, TTY-only banner grepping).

The skill that documents this anti-pattern is excellent, but a skill
only fires if the agent loads it. The tool surface fires on every
misuse. This is the embed-footguns-in-tool-surface pattern from
PR NousResearch#31289 applied to a recurring failure mode that's outgrown
skill-only enforcement.

Detector is deliberately narrow — flags two specific shapes:

  1. Any command containing `statusCheckRollup` (the JSON-API path —
     conclusion vs status field semantics keep burning us).
  2. `gh pr view` / `gh pr checks` combined with `jq` (gh pr
     checks doesn't emit JSON, so any `| jq` here is confused intent;
     the canonical column-2 poller uses awk-on-tabs, not jq).

Does NOT flag the blessed column-2 awk-on-tabs poller (which uses
`awk -F"\t" "\==\"pending\""`) or the exit-code-driven
`gh pr checks $PR >/dev/null` snippet.

Hint composes with the existing background-without-notify_on_complete
hint — both can fire on the same call. Each is independently
actionable.

Tests:
- 4 new cases in tests/tools/test_notify_on_complete.py
- test_homebrew_ci_poller_via_statusCheckRollup_emits_hint (positive)
- test_homebrew_ci_poller_via_gh_pr_checks_piped_to_jq_emits_hint (positive)
- test_canonical_column2_awk_poller_does_not_emit_homebrew_hint (negative)
- test_canonical_gh_pr_checks_exit_code_loop_does_not_emit_hint (negative)
- test_non_ci_background_command_does_not_emit_homebrew_hint (negative)
- 30/30 passing (was 26)
Bryce-huang pushed a commit to wbkunlun/hermes-agent that referenced this pull request May 29, 2026
…sResearch#33142)

Background processes whose command contains `gh pr view --json
statusCheckRollup` or `gh pr checks | jq` now get a runtime hint in
the result pointing at the canonical green-ci-policy snippets. The
homebrew shape has caused at least seven silent CI-watcher failures
in the past two weeks (NousResearch#31329, NousResearch#31448, NousResearch#31695, NousResearch#31709, NousResearch#31745,
NousResearch#32264, NousResearch#33131) — each one a different jq/awk/grep variation of the
same fundamental problem (stdout buffering, jq null-key edge cases,
conclusion-vs-status confusion, TTY-only banner grepping).

The skill that documents this anti-pattern is excellent, but a skill
only fires if the agent loads it. The tool surface fires on every
misuse. This is the embed-footguns-in-tool-surface pattern from
PR NousResearch#31289 applied to a recurring failure mode that's outgrown
skill-only enforcement.

Detector is deliberately narrow — flags two specific shapes:

  1. Any command containing `statusCheckRollup` (the JSON-API path —
     conclusion vs status field semantics keep burning us).
  2. `gh pr view` / `gh pr checks` combined with `jq` (gh pr
     checks doesn't emit JSON, so any `| jq` here is confused intent;
     the canonical column-2 poller uses awk-on-tabs, not jq).

Does NOT flag the blessed column-2 awk-on-tabs poller (which uses
`awk -F"\t" "\==\"pending\""`) or the exit-code-driven
`gh pr checks $PR >/dev/null` snippet.

Hint composes with the existing background-without-notify_on_complete
hint — both can fire on the same call. Each is independently
actionable.

Tests:
- 4 new cases in tests/tools/test_notify_on_complete.py
- test_homebrew_ci_poller_via_statusCheckRollup_emits_hint (positive)
- test_homebrew_ci_poller_via_gh_pr_checks_piped_to_jq_emits_hint (positive)
- test_canonical_column2_awk_poller_does_not_emit_homebrew_hint (negative)
- test_canonical_gh_pr_checks_exit_code_loop_does_not_emit_hint (negative)
- test_non_ci_background_command_does_not_emit_homebrew_hint (negative)
- 30/30 passing (was 26)
#AI commit#
mosaiq-systems pushed a commit to mosaiq-systems/hermes-agent that referenced this pull request May 29, 2026
…sResearch#33142)

Background processes whose command contains `gh pr view --json
statusCheckRollup` or `gh pr checks | jq` now get a runtime hint in
the result pointing at the canonical green-ci-policy snippets. The
homebrew shape has caused at least seven silent CI-watcher failures
in the past two weeks (NousResearch#31329, NousResearch#31448, NousResearch#31695, NousResearch#31709, NousResearch#31745,
NousResearch#32264, NousResearch#33131) — each one a different jq/awk/grep variation of the
same fundamental problem (stdout buffering, jq null-key edge cases,
conclusion-vs-status confusion, TTY-only banner grepping).

The skill that documents this anti-pattern is excellent, but a skill
only fires if the agent loads it. The tool surface fires on every
misuse. This is the embed-footguns-in-tool-surface pattern from
PR NousResearch#31289 applied to a recurring failure mode that's outgrown
skill-only enforcement.

Detector is deliberately narrow — flags two specific shapes:

  1. Any command containing `statusCheckRollup` (the JSON-API path —
     conclusion vs status field semantics keep burning us).
  2. `gh pr view` / `gh pr checks` combined with `jq` (gh pr
     checks doesn't emit JSON, so any `| jq` here is confused intent;
     the canonical column-2 poller uses awk-on-tabs, not jq).

Does NOT flag the blessed column-2 awk-on-tabs poller (which uses
`awk -F"\t" "\==\"pending\""`) or the exit-code-driven
`gh pr checks $PR >/dev/null` snippet.

Hint composes with the existing background-without-notify_on_complete
hint — both can fire on the same call. Each is independently
actionable.

Tests:
- 4 new cases in tests/tools/test_notify_on_complete.py
- test_homebrew_ci_poller_via_statusCheckRollup_emits_hint (positive)
- test_homebrew_ci_poller_via_gh_pr_checks_piped_to_jq_emits_hint (positive)
- test_canonical_column2_awk_poller_does_not_emit_homebrew_hint (negative)
- test_canonical_gh_pr_checks_exit_code_loop_does_not_emit_hint (negative)
- test_non_ci_background_command_does_not_emit_homebrew_hint (negative)
- 30/30 passing (was 26)
gweeteve pushed a commit to gweeteve/hermes-agent that referenced this pull request Jun 2, 2026
…sResearch#33142)

Background processes whose command contains `gh pr view --json
statusCheckRollup` or `gh pr checks | jq` now get a runtime hint in
the result pointing at the canonical green-ci-policy snippets. The
homebrew shape has caused at least seven silent CI-watcher failures
in the past two weeks (NousResearch#31329, NousResearch#31448, NousResearch#31695, NousResearch#31709, NousResearch#31745,
NousResearch#32264, NousResearch#33131) — each one a different jq/awk/grep variation of the
same fundamental problem (stdout buffering, jq null-key edge cases,
conclusion-vs-status confusion, TTY-only banner grepping).

The skill that documents this anti-pattern is excellent, but a skill
only fires if the agent loads it. The tool surface fires on every
misuse. This is the embed-footguns-in-tool-surface pattern from
PR NousResearch#31289 applied to a recurring failure mode that's outgrown
skill-only enforcement.

Detector is deliberately narrow — flags two specific shapes:

  1. Any command containing `statusCheckRollup` (the JSON-API path —
     conclusion vs status field semantics keep burning us).
  2. `gh pr view` / `gh pr checks` combined with `jq` (gh pr
     checks doesn't emit JSON, so any `| jq` here is confused intent;
     the canonical column-2 poller uses awk-on-tabs, not jq).

Does NOT flag the blessed column-2 awk-on-tabs poller (which uses
`awk -F"\t" "\==\"pending\""`) or the exit-code-driven
`gh pr checks $PR >/dev/null` snippet.

Hint composes with the existing background-without-notify_on_complete
hint — both can fire on the same call. Each is independently
actionable.

Tests:
- 4 new cases in tests/tools/test_notify_on_complete.py
- test_homebrew_ci_poller_via_statusCheckRollup_emits_hint (positive)
- test_homebrew_ci_poller_via_gh_pr_checks_piped_to_jq_emits_hint (positive)
- test_canonical_column2_awk_poller_does_not_emit_homebrew_hint (negative)
- test_canonical_gh_pr_checks_exit_code_loop_does_not_emit_hint (negative)
- test_non_ci_background_command_does_not_emit_homebrew_hint (negative)
- 30/30 passing (was 26)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/plugins Plugin system and bundled plugins P3 Low — cosmetic, nice to have tool/tts Text-to-speech and transcription type/feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add register_tts_provider() plugin hook for Python-SDK and streaming TTS engines

3 participants