feat(stt): add OpenRouter speech-to-text provider by RemyFevry · Pull Request #25721 · NousResearch/hermes-agent

RemyFevry · 2026-05-14T13:35:06Z

Summary

Adds openrouter as an STT provider via the OpenRouter /api/v1/audio/transcriptions endpoint.

Benchmark (68KB OGG voice message)

| Model | Time | Cost | Quality |
|---|---|---|---|
| `whisper-large-v3-turbo` | 0.49s | $0.0004 | Good |
| `gpt-4o-mini-transcribe` | 1.81s | ~$0 | Best |
| `google/chirp-3` | 1.81s | $0.0088 | Poor |
| `whisper-1` | 2.45s | ~$0 | Poor |
| local (base) | 3.64s | $0 (CPU) | Slowest |

`whisper-large-v3-turbo` is roughly 7x faster than the local base model (0.49s vs 3.64s), with better accuracy.
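For reference, the speedup figure can be re-derived from the timings in the benchmark table. This is only a small arithmetic sketch; the dictionary and the `speedup_over` helper are illustrative names and not part of the PR.

```python
# Timings (seconds) copied from the benchmark table in this PR description.
BENCHMARK_SECONDS = {
    "whisper-large-v3-turbo": 0.49,
    "gpt-4o-mini-transcribe": 1.81,
    "google/chirp-3": 1.81,
    "whisper-1": 2.45,
    "local (base)": 3.64,
}


def speedup_over(baseline: str, timings: dict[str, float]) -> dict[str, float]:
    """Return how many times faster each model is than the baseline model."""
    baseline_time = timings[baseline]
    return {model: baseline_time / seconds for model, seconds in timings.items()}


speedups = speedup_over("local (base)", BENCHMARK_SECONDS)
for model, factor in sorted(speedups.items(), key=lambda item: -item[1]):
    print(f"{model}: {factor:.2f}x")
```

Note that this is a single 68KB sample, so the ratios are indicative rather than a stable measurement.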

Changes

tools/transcription_tools.py — _transcribe_openrouter(), constants, dispatch
hermes_cli/config.py — openrouter config section
hermes_cli/web_server.py — "openrouter" option

Testing

Live transcription
5-model benchmark
Auto-detection
Error handling

Adds openrouter as a seventh STT provider to transcription_tools.py. Uses the OpenRouter /api/v1/audio/transcriptions endpoint with base64-encoded JSON payloads. Supports all whisper models routed through OpenRouter (openai/whisper-1, groq/whisper-large-v3-turbo, etc.) with unified billing through existing OPENROUTER_API_KEY.

Changes:
- tools/transcription_tools.py: add _transcribe_openrouter(), constants, provider selection, and auto-detect fallback
- hermes_cli/config.py: add openrouter to default config template
- hermes_cli/web_server.py: add openrouter to stt.provider options

alt-glitch · 2026-05-14T13:55:45Z

Duplicate of #21799 which adds OpenRouter as both transcription + speech provider with 19 tests. Also competes with #24703. Feature requested in #24415.

16 tests covering _transcribe_openrouter, _get_provider, and transcribe_audio dispatch. Follows existing xAI test patterns. Tests: key handling, success, whitespace stripping, API errors, empty transcripts, permission errors, network errors, JSON body verification, custom base URL, auto-detect priority, and model override passthrough.

liuhao1024

Bug: _transcribe_openrouter uses JSON body instead of multipart/form-data

The function sends audio as JSON with base64-encoded input_audio, but OpenRouter's /audio/transcriptions endpoint follows the OpenAI Whisper API format, which expects multipart/form-data with a file upload. This will likely return a 400/422 at runtime.

Every other provider in this file (OpenAI, Groq, xAI) uses multipart/form-data. Compare with the xAI implementation ~40 lines above:

# xAI (correct — multipart/form-data)
with open(file_path, "rb") as audio_file:
    response = requests.post(
        f"{base_url}/stt",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": (Path(file_path).name, audio_file)},
        data={"model": model_name},
        timeout=120,
    )

The OpenRouter implementation should use the same pattern instead of json={"input_audio": {"data": b64, ...}}:

with open(file_path, "rb") as audio_file:
    response = requests.post(
        f"{base_url}/audio/transcriptions",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"file": (Path(file_path).name, audio_file)},
        data={"model": model_name},
        timeout=120,
    )
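One way to check the request shape without hitting the network is to prepare the request and inspect it. The sketch below assumes the `requests` library already used in this file; `build_transcription_request`, the placeholder base URL, and the fake API key are hypothetical and exist only for illustration.

```python
import io

import requests


def build_transcription_request(base_url, api_key, model_name, filename, audio_bytes):
    """Prepare (but do not send) a multipart/form-data transcription request."""
    request = requests.Request(
        "POST",
        f"{base_url}/audio/transcriptions",
        headers={"Authorization": f"Bearer {api_key}"},
        # Passing `files` makes requests encode the body as multipart/form-data.
        files={"file": (filename, io.BytesIO(audio_bytes))},
        data={"model": model_name},
    )
    return request.prepare()


# Placeholder values: no request is sent, so the host is never contacted.
prepared = build_transcription_request(
    base_url="https://example.invalid/api/v1",
    api_key="sk-test",
    model_name="openai/whisper-1",
    filename="voice.ogg",
    audio_bytes=b"OggS-fake-audio",
)
content_type = prepared.headers["Content-Type"]
print(content_type)
```

A unit test could assert on `content_type` and on the form fields in `prepared.body`, which would catch a regression back to a JSON body.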

Minor: OPENROUTER_STT_MODELS is defined but never used

The OPENROUTER_STT_MODELS set (~line 97-101) is never referenced for model validation. Either remove it or add a guard in _transcribe_openrouter that warns when the model isn't in the set (like the OpenAI provider does with GROQ_MODELS).
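A guard along these lines might look like the sketch below. The contents of the set and the helper name `is_known_openrouter_model` are assumptions for illustration; the actual set in the PR is not shown in this thread.

```python
import logging

logger = logging.getLogger(__name__)

# Assumed contents: the PR defines this set, but its members are not shown here.
OPENROUTER_STT_MODELS = {
    "openai/whisper-1",
    "groq/whisper-large-v3-turbo",
}


def is_known_openrouter_model(model_name: str, known_models=OPENROUTER_STT_MODELS) -> bool:
    """Return True for a listed model; log a warning and return False otherwise."""
    if model_name in known_models:
        return True
    logger.warning(
        "Model %r is not in OPENROUTER_STT_MODELS; sending it to OpenRouter anyway.",
        model_name,
    )
    return False
```

Warning rather than rejecting keeps newly routed models usable without a code change, at the cost of surfacing typos only at request time.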

RemyFevry mentioned this pull request May 14, 2026

Feature Request: Add OpenRouter as STT provider #25722

Open

alt-glitch added the type/feature (New feature or request), P3 (Low: cosmetic, nice to have), tool/tts (Text-to-speech and transcription), provider/openrouter (OpenRouter aggregator), comp/cli (CLI entry point, hermes_cli/, setup wizard), and duplicate (This issue or pull request already exists) labels May 14, 2026

liuhao1024 reviewed May 14, 2026

alt-glitch mentioned this pull request May 19, 2026

feat: add OpenRouter as STT transcription provider #28848

Open

kshitijk4poor mentioned this pull request May 22, 2026

feat(stt): add register_transcription_provider() plugin hook #30493

Closed

7 tasks

teknium1 mentioned this pull request May 25, 2026

feat(stt): add register_transcription_provider() hook + stt.providers command-provider registry (salvage of #30493) #31907

Merged

RemyFevry wants to merge 2 commits into NousResearch:main from RemyFevry:feat/openrouter-stt-provider

3 participants
