feat(stt): add SenseAudio STT provider #9380

Open
Fl0rencess720 wants to merge 2 commits into NousResearch:main from Fl0rencess720:feat/stt-senseaudio-provider
@Fl0rencess720

Why

Hermes already has a flexible multi-provider speech-to-text pipeline, but it does not yet support SenseAudio as an STT backend.

This branch adds SenseAudio as an additional provider within the existing transcription_tools.py dispatcher so it can participate in the same provider selection, config, and fallback flow as the current local, Groq, OpenAI, and Mistral implementations.

Summary

1) Add SenseAudio as a first-class STT provider

  • Extends tools/transcription_tools.py with a new senseaudio provider path
  • Adds SenseAudio-specific defaults and constants:
    • DEFAULT_SENSEAUDIO_STT_MODEL
    • SENSEAUDIO_BASE_URL
    • SENSEAUDIO_MODELS
  • Uses the current SenseAudio STT API through the existing OpenAI-compatible client path
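
As a rough illustration of how these constants might fit together with the stt.senseaudio config section, here is a minimal sketch. The default model name comes from this PR's commit message. The base URL is a placeholder, because this description does not state the real endpoint. The helper senseaudio_settings and the config keys base_url and model are hypothetical and may not match the actual diff.

```python
from typing import Any, Dict, Mapping

# Model name as given in this PR's commit message.
DEFAULT_SENSEAUDIO_STT_MODEL = "senseaudio-asr-1.5-260319"
# Placeholder only: the real endpoint is not stated in the PR description.
SENSEAUDIO_BASE_URL = "https://senseaudio.example/v1"
# Only the default model is known from the PR text; the real set may be larger.
SENSEAUDIO_MODELS = {DEFAULT_SENSEAUDIO_STT_MODEL}


def senseaudio_settings(stt_config: Mapping[str, Any],
                        env: Mapping[str, str]) -> Dict[str, Any]:
    """Resolve SenseAudio settings (hypothetical helper).

    Values in the `senseaudio` sub-section of the STT config override
    the module-level defaults; the API key is read from the environment.
    """
    section = stt_config.get("senseaudio") or {}
    return {
        "api_key": env.get("SENSEAUDIO_API_KEY"),
        "base_url": section.get("base_url", SENSEAUDIO_BASE_URL),
        "model": section.get("model", DEFAULT_SENSEAUDIO_STT_MODEL),
    }
```

Passing the environment as a plain mapping (for example os.environ) keeps the helper easy to exercise in tests without touching real credentials.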

2) Wire SenseAudio into provider selection

  • Updates _get_provider() so an explicitly configured stt.provider: senseaudio is respected
  • Adds validation for SenseAudio availability:
    • openai package must be installed
    • SENSEAUDIO_API_KEY must be set
  • Extends auto-detection so SenseAudio can be selected when higher-priority STT providers are unavailable
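
The selection rules above can be sketched roughly as follows. This is not the code from the diff: the function names, the "none" sentinel, and the behavior when an explicitly configured provider is unavailable are assumptions. The only ordering taken from this PR is that SenseAudio comes after Mistral in auto-detection.

```python
from typing import Mapping, Optional


def senseaudio_available(env: Mapping[str, str], has_openai: bool) -> bool:
    """SenseAudio is usable only if the openai package is installed
    and SENSEAUDIO_API_KEY is set to a non-empty value."""
    return has_openai and bool(env.get("SENSEAUDIO_API_KEY"))


def get_provider(configured: Optional[str],
                 availability: Mapping[str, bool]) -> str:
    """Pick an STT provider (hypothetical sketch).

    `availability` maps provider name to whether it can be used, ordered
    from highest to lowest auto-detect priority.
    """
    if configured:
        # Assumption: an explicit choice is honored only if it is usable.
        return configured if availability.get(configured, False) else "none"
    for name, usable in availability.items():
        if usable:
            return name
    return "none"


# SenseAudio sits last, after mistral, in the auto-detect chain.
availability = {
    "local": False,
    "groq": False,
    "openai": False,
    "mistral": False,
    "senseaudio": senseaudio_available({"SENSEAUDIO_API_KEY": "key"}, True),
}
selected = get_provider(None, availability)
```

With every higher-priority provider unavailable, selected resolves to "senseaudio"; once any earlier entry becomes usable, that entry wins instead.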

3) Add provider-specific transcription implementation

  • Introduces _transcribe_senseaudio(file_path, model_name)
  • Loads provider-specific config from the existing STT config structure
  • Creates a provider-scoped OpenAI-compatible client with SenseAudio base URL and API key
  • Returns the same normalized result shape used by the other STT providers
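
A minimal sketch of what such a provider function could look like, assuming the openai Python SDK (version 1 or later) and its audio.transcriptions.create call. The base URL is a placeholder, the optional client parameter is added here only so the sketch can be exercised without network access, and the result keys (success, transcript, provider, error) are an assumed shape rather than the confirmed return contract.

```python
import os
from typing import Any, Dict, Optional

DEFAULT_SENSEAUDIO_STT_MODEL = "senseaudio-asr-1.5-260319"
# Placeholder only: the real endpoint is not stated in the PR description.
SENSEAUDIO_BASE_URL = "https://senseaudio.example/v1"


def transcribe_senseaudio(file_path: str,
                          model_name: str = DEFAULT_SENSEAUDIO_STT_MODEL,
                          client: Optional[Any] = None) -> Dict[str, Any]:
    """Transcribe one audio file through an OpenAI-compatible endpoint."""
    try:
        if client is None:
            # Imported lazily so the module loads without the SDK installed.
            from openai import OpenAI
            client = OpenAI(api_key=os.environ["SENSEAUDIO_API_KEY"],
                            base_url=SENSEAUDIO_BASE_URL)
        with open(file_path, "rb") as audio:
            response = client.audio.transcriptions.create(
                model=model_name, file=audio)
        return {"success": True,
                "transcript": response.text,
                "provider": "senseaudio"}
    except Exception as exc:
        # Errors are reported in the result rather than raised, so the
        # caller always receives a dict of the same general shape.
        return {"success": False,
                "transcript": "",
                "error": f"SenseAudio transcription failed: {exc}"}
```

Creating the client inside the function, rather than at import time, keeps the credentials and base URL scoped to this provider and avoids affecting the clients used for the other backends.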

4) Keep the existing STT architecture and UX consistent

  • Integrates SenseAudio into transcribe_audio() using the same dispatcher pattern as the other providers
  • Auto-corrects unsupported model names to the default SenseAudio STT model
  • Updates the “no provider available” error message to include SenseAudio as another supported option
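
The model auto-correction and dispatch described above might look roughly like this. The helper names, the handlers table, and the error wording are hypothetical; only the default model name and the idea of falling back to it for unsupported names come from this PR.

```python
from typing import Any, Callable, Dict, Mapping, Optional

DEFAULT_SENSEAUDIO_STT_MODEL = "senseaudio-asr-1.5-260319"
# Only the default model is known from the PR text; the real set may be larger.
SENSEAUDIO_MODELS = {DEFAULT_SENSEAUDIO_STT_MODEL}

Handler = Callable[[str, str], Dict[str, Any]]


def resolve_senseaudio_model(requested: Optional[str]) -> str:
    """Fall back to the default when the requested model is unsupported,
    for example when a model name meant for another provider leaks in."""
    if requested in SENSEAUDIO_MODELS:
        return requested
    return DEFAULT_SENSEAUDIO_STT_MODEL


def transcribe_audio(file_path: str, provider: str, model: Optional[str],
                     handlers: Mapping[str, Handler]) -> Dict[str, Any]:
    """Dispatch to the selected provider (hypothetical sketch)."""
    if provider == "senseaudio":
        model = resolve_senseaudio_model(model)
    handler = handlers.get(provider)
    if handler is None:
        return {"success": False, "transcript": "",
                "error": f"No STT provider available (got {provider!r})"}
    return handler(file_path, model or "")
```

In this sketch a request for "whisper-1" with the senseaudio provider is silently rewritten to the default SenseAudio model before the handler is called.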

Safety and Regression Notes

  • No new tool surface was introduced; this change extends the existing STT tool path only.
  • The implementation stays within the current single-file multi-provider STT architecture.
  • SenseAudio is added as another provider option rather than changing the behavior of existing providers.
  • Regression risk is limited because the dispatcher structure and return contract remain unchanged.

Files Changed

Updated:

  • tools/transcription_tools.py

Test Evidence

Automated tests

Ran the existing STT tool test suite against the branch:

  • tests/tools/test_transcription_tools.py

Result: 71 passed in 13.38s

Manual smoke tests

[Screenshot: manual smoke test, 2026-04-14 12:23:42 PM]

- Add SenseAudio as a new OpenAI-compatible STT provider
- Uses openai SDK with SENSEAUDIO_BASE_URL as base_url
- Supports SENSEAUDIO_API_KEY env var and stt.senseaudio config section
- Default model: senseaudio-asr-1.5-260319
- Includes SENSEAUDIO_MODELS set for auto-correction
- Adds auto-detect fallback after mistral in provider chain

Labels

  • P3: Low (cosmetic, nice to have)
  • tool/tts: Text-to-speech and transcription
  • type/feature: New feature or request

2 participants