Feat/openrouter stt provider #24703
Open
xxxigm wants to merge 4 commits into
Adds openrouter alongside the existing five STT providers (local /
groq / openai / mistral / xai). Lets users with OPENROUTER_API_KEY
already configured for LLM access reuse the same key & billing for
voice transcription -- no separate Groq / OpenAI / xAI account
needed.
OpenRouter exposes POST /api/v1/audio/transcriptions with the same
wire format as OpenAI, so the OpenAI SDK with a custom base_url +
api_key is the cleanest route -- no new dependency. The new
_transcribe_openrouter() function mirrors the _transcribe_groq()
shape (same SDK call, same error envelope).
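The call shape described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual function: the name `transcribe_via_openrouter`, the `client_factory` test hook, the returned dict shape, and the error strings are all assumptions. The constructor arguments (`timeout=30`, `max_retries=0`) and `response_format="text"` are taken from the PR description.

```python
import os
from typing import Any, Callable, Optional

OPENROUTER_STT_BASE_URL = "https://openrouter.ai/api/v1"
DEFAULT_OPENROUTER_STT_MODEL = "openai/whisper-1"


def transcribe_via_openrouter(
    file_path: str,
    model: str = DEFAULT_OPENROUTER_STT_MODEL,
    client_factory: Optional[Callable[..., Any]] = None,  # hypothetical test hook
) -> dict:
    """Send one audio file to OpenRouter's OpenAI-compatible STT endpoint."""
    api_key = os.environ.get("OPENROUTER_API_KEY", "")
    if not api_key:
        return {"success": False, "transcript": "", "error": "OPENROUTER_API_KEY not set"}

    if client_factory is None:
        try:
            from openai import OpenAI  # imported lazily so the SDK stays optional
        except ImportError:
            return {"success": False, "transcript": "", "error": "openai package not installed"}
        client_factory = OpenAI

    try:
        client = client_factory(
            api_key=api_key,
            base_url=OPENROUTER_STT_BASE_URL,
            timeout=30,
            max_retries=0,
        )
        with open(file_path, "rb") as audio:
            # response_format="text" asks for a plain string instead of JSON.
            text = client.audio.transcriptions.create(
                model=model, file=audio, response_format="text"
            )
        return {"success": True, "transcript": str(text).strip(), "error": ""}
    except PermissionError:
        return {"success": False, "transcript": "", "error": f"Permission denied: {file_path}"}
    except Exception as exc:  # catch-all arm of the (assumed) error envelope
        return {"success": False, "transcript": "", "error": f"Transcription failed: {exc}"}
```

Injecting the client factory keeps the sketch testable without the SDK installed; the real function may resolve the client differently.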
Wiring:
- Constants: DEFAULT_OPENROUTER_STT_MODEL ("openai/whisper-1"),
OPENROUTER_STT_BASE_URL (https://openrouter.ai/api/v1), and an
informational OPENROUTER_MODELS set used for documentation only
(we do NOT block other model IDs -- OpenRouter routinely adds
new ones).
- _get_provider() explicit branch for "openrouter" (requires
OPENROUTER_API_KEY + the openai SDK).
- Auto-detect chain extended: local > groq > openai > xai >
openrouter. openrouter is last in the cloud chain because it's
a catch-all -- if the user ALSO has a dedicated STT key we want
to honour their explicit preference for that provider's pricing
and routing.
- transcribe_audio() dispatch + per-provider model resolution
(stt.openrouter.model wins, default openai/whisper-1).
- Updated module docstring + the no-provider error hint.
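The resolution rules in the wiring list can be illustrated with a small sketch. The function name `resolve_provider`, its parameters, and the GROQ_API_KEY / OPENAI_API_KEY variable names are assumptions made for illustration; only the ordering and the "no silent fallback" rule for an explicit openrouter choice come from the text above.

```python
from typing import Mapping, Optional

# Cloud providers in auto-detect order, paired with the env var that enables
# each one. The groq / openai variable names are assumptions for this sketch.
CLOUD_CHAIN = (
    ("groq", "GROQ_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("xai", "XAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),  # catch-all, so it goes last
)


def resolve_provider(
    explicit: Optional[str],
    env: Mapping[str, str],
    local_available: bool = False,
    openai_sdk_available: bool = True,
) -> str:
    """Return the provider name to use, or "none" if nothing qualifies."""
    if explicit == "openrouter":
        # Explicit choice: never fall back silently to another provider.
        if env.get("OPENROUTER_API_KEY") and openai_sdk_available:
            return "openrouter"
        return "none"
    if explicit:
        return explicit  # other explicit providers are out of scope here

    if local_available:
        return "local"
    for name, key_var in CLOUD_CHAIN:
        if not env.get(key_var):
            continue
        if name == "openrouter" and not openai_sdk_available:
            continue  # assumed to mirror the explicit branch's SDK requirement
        return name
    return "none"
```

For example, `resolve_provider(None, {"OPENROUTER_API_KEY": "k", "XAI_API_KEY": "x"})` yields `"xai"` because the dedicated key outranks the catch-all.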
Config:
- stt.openrouter section added to cli-config.yaml.example with
api_key / base_url overrides documented (both optional --
defaults to OPENROUTER_API_KEY env + the standard base URL).
No new Python dependency: the existing openai package handles
OpenRouter natively via base_url override.
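A rough sketch of how the config overrides could be resolved, assuming the stt section is available as a plain dict. The helper name `resolve_openrouter_settings` is hypothetical, and the STT_OPENROUTER_MODEL / STT_OPENROUTER_BASE_URL env overrides are left out because their precedence relative to the config file is not stated here.

```python
import os
from typing import Mapping, Optional

DEFAULT_OPENROUTER_STT_MODEL = "openai/whisper-1"
OPENROUTER_STT_BASE_URL = "https://openrouter.ai/api/v1"


def resolve_openrouter_settings(
    stt_config: Mapping[str, dict],
    env: Mapping[str, str] = os.environ,
    model: Optional[str] = None,
) -> dict:
    """Resolve api_key, base_url and model for the openrouter provider."""
    section = stt_config.get("openrouter") or {}
    return {
        # Config value wins over the environment variable.
        "api_key": section.get("api_key") or env.get("OPENROUTER_API_KEY", ""),
        # Trailing slash stripped so the SDK's URL builder gets a clean base.
        "base_url": (section.get("base_url") or OPENROUTER_STT_BASE_URL).rstrip("/"),
        # Explicit argument > stt.openrouter.model > built-in default.
        "model": model or section.get("model") or DEFAULT_OPENROUTER_STT_MODEL,
    }
```

With an empty config and only OPENROUTER_API_KEY in the environment, every field falls back to its default, which is the "both optional" behaviour the config comment describes.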
18 unit tests across three classes that mock the OpenAI SDK and exercise the three observable surfaces of the new provider:
TestGetProviderOpenRouter (7 tests):
- explicit "openrouter" + key -> "openrouter"
- explicit "openrouter" no key -> "none" (no silent fallback, mirrors the NousResearchGH-1774 contract for groq / openai)
- explicit "openrouter" without the openai SDK installed -> "none"
- auto-detect picks openrouter when only OPENROUTER_API_KEY is set
- auto-detect prefers groq / openai / xai over openrouter (the precedence rule from the issue: openrouter is a catch-all, so any dedicated STT key wins)
TestTranscribeOpenRouter (7 tests):
- no key returns the documented error message
- openai SDK missing returns the documented error message
- happy path: env key + default base URL reach OpenAI(); model and response_format="text" pinned (whisper-1 must NOT be sent JSON)
- stt.openrouter.api_key in config wins over the env var
- stt.openrouter.base_url override is respected (trailing slash stripped to match the OpenAI SDK's URL builder)
- APIError surface routes to "API error: ..." not the catch-all
- PermissionError surfaces "Permission denied" cleanly
TestTranscribeAudioDispatchOpenRouter (4 tests):
- transcribe_audio routes to _transcribe_openrouter with the default model when stt.openrouter.model is unset
- stt.openrouter.model override flows through
- explicit model= argument wins over config
- the no-provider error hint mentions OPENROUTER_API_KEY + OpenRouter so users discover it without reading source
Mocks the openai package via patch.dict("sys.modules", ...) so the test runs on every CI box regardless of whether openai is installed. An autouse fixture clears every API-key env var before each test to keep cases hermetic.
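The mocking approach mentioned at the end of this commit message can be sketched with the standard library alone. `build_client` and `run_hermetic_case` are hypothetical stand-ins, the env var list is illustrative, and a context manager replaces the autouse pytest fixture so the example does not depend on pytest.

```python
import os
import sys
from unittest.mock import MagicMock, patch

# Env vars cleared before each case; the list is illustrative, not exhaustive.
API_KEY_VARS = ("GROQ_API_KEY", "OPENAI_API_KEY", "XAI_API_KEY", "OPENROUTER_API_KEY")


def build_client(api_key: str):
    """Stand-in for code under test that imports the SDK lazily."""
    import openai  # resolved through sys.modules, so a patched entry is used

    return openai.OpenAI(api_key=api_key, base_url="https://openrouter.ai/api/v1")


def run_hermetic_case() -> MagicMock:
    """Run one case with a fake openai module and a scrubbed environment."""
    fake_openai = MagicMock(name="openai")
    clean_env = {k: v for k, v in os.environ.items() if k not in API_KEY_VARS}
    with patch.dict(sys.modules, {"openai": fake_openai}), \
         patch.dict(os.environ, clean_env, clear=True):
        os.environ["OPENROUTER_API_KEY"] = "sk-test"
        build_client(os.environ["OPENROUTER_API_KEY"])
    # Both patches are undone here: sys.modules and os.environ are restored.
    return fake_openai
```

Because the import statement consults sys.modules first, the fake module is picked up whether or not the real openai package is installed.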
…search#24415)
Two end-to-end anchors that drive the exact contract from the issue's "Proposed Solution" section so a future refactor that drops any link in the chain fails with an explicit "NousResearch#24415 anchor" message.
The five contract bullets from the issue:
1. stt.provider: openrouter is honoured.
2. OpenAI SDK called with base_url=https://openrouter.ai/api/v1.
3. Authorisation reuses OPENROUTER_API_KEY from env (the same env var the LLM stack reads).
4. Default model is openai/whisper-1.
5. Provider lives in the auto-detect chain so a user with ONLY OPENROUTER_API_KEY set still gets STT.
Test 1 (test_full_user_round_trip_honours_issue_contract) asserts bullets 1-4 in a single end-to-end call. Each assertion carries its own NousResearch#24415 message identifying which contract bullet broke.
Test 2 (test_auto_detect_lights_up_with_only_openrouter_key) asserts bullet 5: with only OPENROUTER_API_KEY set, no stt.provider, no other STT key, transcribe_audio() must still land on the openrouter provider.
Verified on upstream/main: both fail with
AssertionError: NousResearch#24415 anchor: stt.provider=openrouter with OPENROUTER_API_KEY set must succeed -- got error: 'No STT provider available...'. This means _get_provider does not recognise 'openrouter' or the dispatch in transcribe_audio is missing -- the feature is not wired.
and
AssertionError: NousResearch#24415 anchor: with only OPENROUTER_API_KEY set and no other STT backend available, auto-detect MUST land on the openrouter provider...
-- exactly the "feature not implemented" signal we want. After the feature lands both pass.
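The per-bullet assertion style described in this commit message might look roughly like the sketch below. The helper `assert_issue_contract` and its arguments are hypothetical; only the first message mirrors the failure text quoted above, and the remaining messages are invented for illustration.

```python
ANCHOR = "NousResearch#24415 anchor"


def assert_issue_contract(
    result: dict, client_kwargs: dict, create_kwargs: dict, expected_key: str
) -> None:
    """Check contract bullets 1-4, with one tagged assertion per bullet."""
    # Bullet 1: the explicit provider choice is honoured.
    assert result.get("success"), (
        f"{ANCHOR}: stt.provider=openrouter with OPENROUTER_API_KEY set "
        f"must succeed -- got error: {result.get('error')!r}"
    )
    # Bullet 2: the SDK client points at OpenRouter.
    assert client_kwargs.get("base_url") == "https://openrouter.ai/api/v1", (
        f"{ANCHOR}: OpenAI SDK must be called with the OpenRouter base_url"
    )
    # Bullet 3: the key comes from OPENROUTER_API_KEY.
    assert client_kwargs.get("api_key") == expected_key, (
        f"{ANCHOR}: client must reuse OPENROUTER_API_KEY from the environment"
    )
    # Bullet 4: the default model is used when nothing overrides it.
    assert create_kwargs.get("model") == "openai/whisper-1", (
        f"{ANCHOR}: default model must be openai/whisper-1"
    )
```

Tagging every message with the same prefix means a failing run names the issue and the broken bullet directly in the traceback.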
Updates voice-mode.md so users discover the OpenRouter STT path without spelunking the source:
- API Keys section: add OPENROUTER_API_KEY alongside the existing STT keys (also lists XAI_API_KEY which the section was missing). Adds an OpenRouter-specific tip explaining the "reuse your LLM key" value proposition + the auto-detect order rationale.
- Troubleshooting bullet: extend the "verify STT is available" hint to mention OPENROUTER_API_KEY as a viable fallback.
- Configuration Reference > Environment Variables: list OPENROUTER_API_KEY, STT_OPENROUTER_MODEL, and STT_OPENROUTER_BASE_URL alongside the existing STT_* overrides.
Each change tagged with (NousResearch#24415) so future grep-archeology lands on the issue + this PR.
This was referenced May 14, 2026
What does this PR do?
Implements #24415: adds OpenRouter as a first-class STT provider alongside the existing five (local, groq, openai, mistral, xai).
Hermes today supports five STT providers but has no way to reuse OPENROUTER_API_KEY for transcription: users with OpenRouter set up for LLM access have to spin up a separate Groq / OpenAI / xAI account just to transcribe a voice message. OpenRouter exposes an OpenAI-compatible /api/v1/audio/transcriptions endpoint that routes to Whisper-family models (openai/whisper-1, openai/gpt-4o-mini-transcribe, openai/gpt-4o-transcribe) billed through the same credit pool as LLM access. Adding openrouter as a native provider closes that gap with zero new dependencies: the existing openai SDK with a custom base_url handles the wire format.
The implementation lives in tools/transcription_tools.py:
- stt.provider: openrouter branch + auto-detect chain entry. Auto-detect order is local > groq > openai > xai > openrouter. OpenRouter sits last in the cloud chain because it's a catch-all: if the user also has a dedicated STT key (Groq / OpenAI / xAI), the dedicated provider wins so they keep that provider's pricing and routing.
- _transcribe_openrouter() mirrors the _transcribe_groq() shape: OpenAI(api_key=..., base_url="https://openrouter.ai/api/v1", timeout=30, max_retries=0) then audio.transcriptions.create(model=..., file=..., response_format="text"). Same error envelope, same _extract_transcript_text post-processing.
- Default model is openai/whisper-1, overridable via stt.openrouter.model or the STT_OPENROUTER_MODEL env var. The known-models set is informational only: Hermes does not block other model IDs (OpenRouter routinely adds new ones).
- Reuses OPENROUTER_API_KEY from .env (no new env var). stt.openrouter.api_key in config wins over the env var so managed deployments can pin per-deployment credentials.
- No new dependency: the existing openai SDK does the work; the issue's "no new SDK required" promise is honoured.

Related Issue
Closes #24415 ([Feature]: Add OpenRouter as an STT provider).

Type of Change

Changes Made
- tools/transcription_tools.py (+109 / −8):
  - Constants: DEFAULT_OPENROUTER_STT_MODEL = "openai/whisper-1", OPENROUTER_STT_BASE_URL = "https://openrouter.ai/api/v1", OPENROUTER_MODELS = {…} (informational only).
  - _get_provider(): explicit openrouter branch (requires OPENROUTER_API_KEY + the openai SDK) + auto-detect chain entry after xai.
  - _transcribe_openrouter(): new function modeled on _transcribe_groq(), with config/env key resolution, base-url override, full error-envelope coverage (PermissionError, APIConnectionError, APITimeoutError, APIError, Exception).
  - transcribe_audio(): new dispatch arm + per-provider model resolution (stt.openrouter.model wins, default openai/whisper-1).
  - The no-provider error hint now mentions OPENROUTER_API_KEY so users discover it without reading source.
- cli-config.yaml.example (+4 / −2): new commented stt.openrouter block with model / api_key / base_url examples.
- tests/tools/test_transcription_openrouter.py (new, +378 lines): 18 unit tests across three classes:
  - TestGetProviderOpenRouter (7): explicit-config and auto-detect resolution, including the precedence rule from the issue (groq / openai / xai win over openrouter in auto-detect).
  - TestTranscribeOpenRouter (7): env-key resolution, config overrides, OpenAI-SDK call shape (response_format="text" pinned), error-path wrappers.
  - TestTranscribeAudioDispatchOpenRouter (4): end-to-end transcribe_audio() dispatch + the model-default chain.
- tests/tools/test_transcription_openrouter_feature_anchor.py (new, +208 lines): 2 feature-shape regression anchors that drive the issue's full "Proposed Solution" contract end-to-end:
  - test_full_user_round_trip_honours_issue_contract: asserts bullets 1–4 (provider routes, base URL, env key, default model) in a single round trip with per-bullet #24415 anchor messages.
  - test_auto_detect_lights_up_with_only_openrouter_key: asserts bullet 5 (auto-detect with only OPENROUTER_API_KEY set must land on openrouter).
- website/docs/user-guide/features/voice-mode.md (+15 / −3): added OPENROUTER_API_KEY to the API-Keys section with a tip explaining the "reuse your LLM key" value proposition + the auto-detect rationale; troubleshooting bullet updated; Configuration Reference section gets STT_OPENROUTER_MODEL and STT_OPENROUTER_BASE_URL overrides.
No existing files outside the additions above are touched; existing provider code paths (_transcribe_groq, _transcribe_openai, _transcribe_xai, _transcribe_mistral, _transcribe_local, _transcribe_local_command) are byte-identical.

How to Test
- Make sure .venv is set up: python3 -m venv .venv && source .venv/bin/activate && pip install -e ".[all,dev]"
- Without the fix in tools/transcription_tools.py, both feature anchors fail with
  #24415 anchor: stt.provider=openrouter with OPENROUTER_API_KEY set must succeed -- got error: 'No STT provider available...'
  and
  #24415 anchor: with only OPENROUTER_API_KEY set and no other STT backend available, auto-detect MUST land on the openrouter provider...
- Restore the fix (git checkout HEAD -- tools/transcription_tools.py) → both pass.
- Gateway: put OPENROUTER_API_KEY into ~/.hermes/.env, set stt.provider: openrouter in ~/.hermes/config.yaml, then send a voice message via the gateway: it is transcribed through https://openrouter.ai/api/v1/audio/transcriptions and the gateway replies. OpenRouter dashboard should show a hit on openai/whisper-1.
- CLI: hermes → /voice on → push-to-talk; transcript flows through the same code path.

Checklist
Code
- Commit messages: feat(stt): ..., test(stt): ..., docs(stt): ...
- Ran scripts/run_tests.sh on the affected test files and all 147 tests pass

Documentation & Housekeeping
- Documentation: website/docs/user-guide/features/voice-mode.md got an OpenRouter tip, troubleshooting hint, and env-var reference rows
- cli-config.yaml.example if I added/changed config keys: added a commented stt.openrouter block
- CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows: N/A
- Cross-platform: the change relies only on the openai SDK + OPENROUTER_API_KEY env var, both of which work identically on Windows / macOS / Linux. Tests are hermetic (mocked openai module via patch.dict("sys.modules", ...), no real network or filesystem outside tmp_path).
- Backward compatible (the transcribe_audio public surface keeps the exact same signature and return shape)

Screenshots / Logs
Feature-shape repro on upstream/main (anchor proves the feature is genuinely missing pre-PR):
feat/openrouter-stt-provider