Problem or Use Case
Hermes today supports five STT providers: local (faster-whisper), Groq, OpenAI, Mistral (Voxtral), and xAI (Grok). Users who already have OPENROUTER_API_KEY set up for LLM access cannot reuse that key for speech-to-text; they need a separate Groq, OpenAI, or xAI account to transcribe voice messages.
OpenRouter offers an OpenAI-compatible /api/v1/audio/transcriptions endpoint that routes to Whisper-family models (openai/whisper-1, openai/gpt-4o-mini-transcribe, etc.), billed through the same OpenRouter credit pool. Adding openrouter as an STT provider means users with an OpenRouter key get voice transcription with zero extra accounts, keys, or billing setup.
Proposed Solution
Add openrouter as a native STT provider alongside the existing ones. The configuration follows the same pattern as other providers:
stt:
  enabled: true
  provider: openrouter
  openrouter:
    model: openai/whisper-1          # OpenRouter model ID
    api_key: ${OPENROUTER_API_KEY}   # auto-detected from env var
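To make the intended config semantics concrete, here is a minimal Python sketch of how the openrouter block could be resolved into a model and an API key. The helper names (expand_env, resolve_openrouter_stt) and the dict shape are hypothetical and are not taken from the Hermes codebase. The fallback to OPENROUTER_API_KEY when api_key is omitted reflects the "auto-detected from env var" comment in the config.

```python
import os
import re

# Matches ${VAR_NAME} placeholders such as ${OPENROUTER_API_KEY}.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Replace ${VAR} references with values from env; unset vars become ''."""
    env = os.environ if env is None else env
    return _ENV_REF.sub(lambda m: env.get(m.group(1), ""), value)


def resolve_openrouter_stt(config, env=None):
    """Hypothetical helper: return the model and API key for the openrouter provider."""
    env = os.environ if env is None else env
    section = config.get("stt", {}).get("openrouter", {}) or {}
    # Prefer the configured key (after placeholder expansion), else the env var.
    api_key = expand_env(section.get("api_key", ""), env) or env.get("OPENROUTER_API_KEY", "")
    return {
        "model": section.get("model", "openai/whisper-1"),  # proposed default
        "api_key": api_key,
    }
```

With this shape, a config that omits the openrouter block entirely still resolves to the default model and the key from the environment.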
The implementation would:
Provider detection — Add openrouter to _get_provider() in tools/transcription_tools.py. Accept stt.provider: openrouter in config or auto-detect when OPENROUTER_API_KEY is set (similar to how xai uses XAI_API_KEY).
Client routing — _transcribe_openai() already does OpenAI-compatible transcription. The OpenRouter provider reuses this path with:
base_url: https://openrouter.ai/api/v1
api_key: $OPENROUTER_API_KEY
No extra HTTP headers needed (OpenRouter's attribution headers are optional for STT)
Model selection — Let users pick any Whisper-compatible model available on OpenRouter. Default to openai/whisper-1. Common options:
openai/whisper-1
openai/gpt-4o-mini-transcribe
openai/gpt-4o-transcribe
Auth — Use the existing OPENROUTER_API_KEY env var, no new env vars needed.
Auto-detection priority — openrouter should be checked after openai and xai in the auto-detect chain, since it's a catch-all that many users already have configured.
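The detection and priority rules above can be sketched as follows. This is an illustration only: detect_stt_provider and AUTO_DETECT_CHAIN are hypothetical names rather than the actual _get_provider() implementation, the other existing providers (local, Groq, Mistral) are left out, and the relative order of openai and xai is an assumption.

```python
import os

# (provider, env var) pairs in the order they are checked. Only the three
# providers discussed here appear; the openai-before-xai order is assumed.
AUTO_DETECT_CHAIN = [
    ("openai", "VOICE_TOOLS_OPENAI_KEY"),
    ("xai", "XAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),  # checked last: catch-all key
]


def detect_stt_provider(stt_config, env=None):
    """Return the explicitly configured provider, else the first one with a key set."""
    env = os.environ if env is None else env
    explicit = (stt_config or {}).get("provider")
    if explicit:
        return explicit  # stt.provider in config always wins
    for provider, var in AUTO_DETECT_CHAIN:
        if env.get(var):
            return provider
    return None  # nothing configured and no key found
```

Placing openrouter last means a user who has both XAI_API_KEY and OPENROUTER_API_KEY set keeps getting xai unless they opt in with stt.provider: openrouter.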
No new Python SDK dependency required — the existing openai SDK with a custom base_url handles OpenRouter's API natively.
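As a rough sketch of the client routing, the openai SDK can be pointed at OpenRouter by passing base_url, assuming the transcription endpoint behaves as described above. The function names here are hypothetical and stand in for whatever _transcribe_openai() does internally; only OpenAI(api_key=..., base_url=...) and client.audio.transcriptions.create(...) are real SDK calls.

```python
import os

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"
DEFAULT_MODEL = "openai/whisper-1"  # proposed default model


def openrouter_client_kwargs(env=None):
    """Build the keyword arguments for an OpenAI client aimed at OpenRouter."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return {"api_key": api_key, "base_url": OPENROUTER_BASE_URL}


def transcribe_via_openrouter(audio_path, model=DEFAULT_MODEL, env=None):
    """Hypothetical wrapper: send an audio file to OpenRouter and return the text."""
    # Imported lazily so the helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(**openrouter_client_kwargs(env))
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text
```

A call such as transcribe_via_openrouter("voice_message.ogg", model="openai/gpt-4o-mini-transcribe") would then go through the same OpenRouter credit pool as LLM requests; the file name is only an example.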
Alternatives Considered
Use OpenRouter as a custom OpenAI endpoint: Set stt.provider: openai with stt.openai.base_url: https://openrouter.ai/api/v1 and stt.openai.api_key: $OPENROUTER_API_KEY. This would work in theory but is blocked by existing bug #4102, "bug(tools): stt.openai.api_key and base_url from config are ignored - STT provider always MISSING for local endpoints": config values for stt.openai.api_key and stt.openai.base_url are ignored and only env vars are read. Users can't do this today without setting VOICE_TOOLS_OPENAI_KEY to their OpenRouter key, which creates confusing env var semantics.
Environment variable workaround: Users could set VOICE_TOOLS_OPENAI_KEY=$OPENROUTER_API_KEY and STT_OPENAI_BASE_URL=https://openrouter.ai/api/v1, but this is undocumented and confusing, and it loses the model routing flexibility OpenRouter provides.
Wait for the #4102 fix: Fixing the config plumbing in bug #4102 would unblock the custom-endpoint workaround, but a first-class openrouter provider is cleaner: dedicated model config, documented model options, and a consistent pattern with the existing OpenRouter TTS provider ([Feature]: Add OpenRouter as a TTS provider #15726).
Related Issues
stt.openai.api_key and base_url from config are ignored: this feature depends on the same config plumbing being fixed for OpenRouter auth.
Feature Type: New tool (STT provider)
Scope: Medium (few files, < 300 lines; primarily tools/transcription_tools.py changes + config schema)