Follow-up: Pluggable STT providers and auto-disable when no API key #1166

@AxDSan

Background

This is a follow-up to #1100 and PR #1110. The initial fix made the gateway honor stt.enabled: false in config, but the STT (Speech-to-Text) configuration experience still needs further improvements to be robust.

Current State

PR #1110 added:

  • Basic support for stt.enabled: false to skip transcription entirely
  • Default behavior unchanged (STT enabled by default)

Proposed Improvements

1. Pluggable STT Providers

Currently, the gateway hardcodes OpenAI Whisper as the STT provider. Users should be able to choose their preferred provider:

Desired providers:

  • OpenAI Whisper (current default)
  • Deepgram
  • Local Whisper (run locally without external API calls)

Proposed config format:

stt:
  enabled: true
  provider: deepgram  # Options: openai, deepgram, local
  openai:
    api_key: ${OPENAI_API_KEY}
    model: whisper-1
  deepgram:
    api_key: ${DEEPGRAM_API_KEY}
    model: nova-2
  local:
    model_path: /path/to/whisper/model
    device: cuda  # or cpu
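A minimal sketch of how the gateway might read this config, assuming it arrives as an already-parsed dictionary. The names `SttSettings`, `load_stt_settings`, and `expand_env` are hypothetical and do not come from the existing codebase; the `${VAR}` expansion rule (unset variables become an empty string) is also an assumption.

```python
import os
import re
from dataclasses import dataclass, field

# Hypothetical helper names; nothing here is taken from the gateway's real code.
KNOWN_PROVIDERS = ("openai", "deepgram", "local")
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


@dataclass
class SttSettings:
    enabled: bool
    provider: str
    options: dict = field(default_factory=dict)


def expand_env(value, env=None):
    """Replace ${VAR} references; unset variables expand to an empty string."""
    env = os.environ if env is None else env
    if not isinstance(value, str):
        return value
    return _ENV_REF.sub(lambda m: env.get(m.group(1), ""), value)


def load_stt_settings(config, env=None):
    """Build settings from a parsed config dict.

    Existing configs without `provider` fall back to "openai", and a missing
    `enabled` key keeps STT on, matching the current default behavior.
    """
    stt = config.get("stt") or {}
    provider = stt.get("provider", "openai")
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown stt.provider: {provider!r}")
    raw_options = stt.get(provider) or {}
    options = {key: expand_env(val, env) for key, val in raw_options.items()}
    return SttSettings(
        enabled=bool(stt.get("enabled", True)),
        provider=provider,
        options=options,
    )
```

Only the block for the selected provider is read, so unused provider sections can stay in the file without needing their environment variables to be set.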

2. Auto-disable When No API Key is Present

The gateway should gracefully handle missing API keys by:

  • Detecting when STT is enabled but no valid API key is configured for the selected provider
  • Logging a warning message explaining the situation
  • Auto-disabling STT for the current session (or until the configuration is fixed)
  • Continuing to operate normally for text-based messages

This prevents the 401 errors described in #1100 when users don't have API keys configured.
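The startup check described above could look roughly like the following. This is a sketch under stated assumptions: `effective_stt_enabled`, the logger name, and the rule that only `openai` and `deepgram` need a key are all hypothetical. It detects a missing key only; a key that the provider rejects would still have to be caught at the first failed request.

```python
import logging

# Hypothetical logger name and helper; not taken from the gateway's real code.
logger = logging.getLogger("gateway.stt")

# Assumption: the local provider runs without any API key.
PROVIDERS_REQUIRING_KEY = {"openai", "deepgram"}


def effective_stt_enabled(enabled, provider, options):
    """Return whether STT should actually run for this session.

    STT is switched off, with a warning, when the selected provider needs an
    API key and none is configured (empty, whitespace, or an unexpanded
    ${VAR} placeholder).
    """
    if not enabled:
        return False
    if provider not in PROVIDERS_REQUIRING_KEY:
        return True
    api_key = (options.get("api_key") or "").strip()
    if api_key and not api_key.startswith("${"):
        return True
    logger.warning(
        "STT is enabled but no API key is configured for provider %r; "
        "disabling STT for this session. Text messages are unaffected.",
        provider,
    )
    return False
```

Returning a plain boolean keeps the decision separate from the transcription path, so text-based messages never touch provider code when STT has been disabled.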

3. Runtime Provider Selection (Optional Future)

Consider allowing per-message provider selection via a command, e.g.:
/stt use deepgram
/stt use local
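One possible way to recognize these commands, shown only as an illustration: `parse_stt_command` is a hypothetical helper, and the exact command grammar (three whitespace-separated tokens, case-insensitive provider name) is an assumption rather than a settled design.

```python
# Hypothetical helper; the command grammar is assumed, not specified.
KNOWN_PROVIDERS = ("openai", "deepgram", "local")


def parse_stt_command(text):
    """Return the requested provider name, or None if text is not an /stt command.

    Raises ValueError when the command is well formed but names an unknown
    provider, so the caller can reply with a helpful message.
    """
    parts = text.strip().split()
    if len(parts) != 3 or parts[0] != "/stt" or parts[1] != "use":
        return None
    provider = parts[2].lower()
    if provider not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown STT provider: {provider!r}")
    return provider
```

Ordinary messages return `None` and pass through untouched, so the command check can sit in front of normal message handling.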

Acceptance Criteria

  • Config supports stt.provider field with multiple provider options
  • Deepgram STT integration implemented
  • Local Whisper integration implemented
  • Auto-disable logic when the API key is missing or invalid
  • Clear warning logs when STT is auto-disabled
  • Backward compatibility maintained (existing configs still work)
  • Tests covering all new provider paths and auto-disable scenarios
