Add Deepgram transcription support #712

Closed
safzanpirani wants to merge 1 commit into openclaw:main from safzanpirani:feat/deepgram-transcription

@safzanpirani

Contributor

Adds Deepgram as a native transcription provider for voice messages.

Usage:

{
  "tools": {
    "audio": {
      "transcription": {
        "provider": "deepgram",
        "deepgram": {
          "model": "nova-3",
          "language": "en"
        }
      }
    }
  }
}

The API key can also be supplied through the DEEPGRAM_API_KEY environment variable.
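A minimal sketch of how the key lookup might work, assuming an explicit key in the config takes precedence and DEEPGRAM_API_KEY is the fallback. The `apiKey` field, the interface shapes, and `resolveDeepgramApiKey` are hypothetical names for illustration; the PR's actual types may differ.

```typescript
// Hypothetical config shapes; only the provider names, "model", and
// "language" appear in the PR description.
type TranscriptionProvider = "command" | "deepgram" | "openai";

interface DeepgramTranscriptionConfig {
  apiKey?: string; // assumed field, not confirmed by the PR
  model?: string;
  language?: string;
}

interface TranscriptionConfig {
  provider?: TranscriptionProvider;
  deepgram?: DeepgramTranscriptionConfig;
}

// The environment is passed in explicitly so the lookup is easy to test.
type Env = Record<string, string | undefined>;

// Prefer a key set in the config; otherwise fall back to DEEPGRAM_API_KEY.
function resolveDeepgramApiKey(
  config: TranscriptionConfig,
  env: Env,
): string | undefined {
  const fromConfig = config.deepgram?.apiKey?.trim();
  if (fromConfig) return fromConfig;
  const fromEnv = env["DEEPGRAM_API_KEY"]?.trim();
  return fromEnv ? fromEnv : undefined;
}
```

In a Node process this would be called as `resolveDeepgramApiKey(config, process.env)`; an `undefined` result means no key is available and the caller should report a configuration error.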

Tested with Telegram voice messages.

@safzanpirani safzanpirani force-pushed the feat/deepgram-transcription branch 2 times, most recently from 692df88 to c399444 Compare January 15, 2026 14:35
Adds Deepgram as a first-class transcription provider alongside the existing
command-based transcription.

Changes:
- Add @deepgram/sdk dependency
- Add TranscriptionProvider type (command | deepgram | openai)
- Add DeepgramTranscriptionConfig type and zod schema
- Create transcription-deepgram.ts with Deepgram SDK integration
- Update transcription.ts to route by provider
- Support DEEPGRAM_API_KEY env var as fallback

Config example:
  tools.audio.transcription.provider: "deepgram"
  tools.audio.transcription.deepgram.model: "nova-3"
  tools.audio.transcription.deepgram.language: "multi"
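The "route by provider" change can be pictured with a small dispatch sketch. This is an illustration only: `Transcriber`, `selectTranscriber`, and the choice of "command" as the default when no provider is set are assumptions, not code from this branch.

```typescript
// Provider names are taken from the commit message above.
type TranscriptionProvider = "command" | "deepgram" | "openai";

// Hypothetical signature: audio bytes in, transcript text out.
type Transcriber = (audio: Uint8Array) => Promise<string>;

// Pick the implementation for the configured provider. Falling back to
// "command" when the provider is unset is an assumption, chosen so that
// existing command-based setups would keep working unchanged.
function selectTranscriber(
  provider: TranscriptionProvider | undefined,
  registry: Record<TranscriptionProvider, Transcriber>,
): Transcriber {
  return registry[provider ?? "command"];
}
```

Typing the registry as `Record<TranscriptionProvider, Transcriber>` makes the compiler flag any provider that is added to the union without a matching implementation.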
@safzanpirani safzanpirani force-pushed the feat/deepgram-transcription branch from c399444 to 27fd863 Compare January 17, 2026 07:14
Comment thread package.json
"dependencies": {
"@buape/carbon": "0.0.0-beta-20260110172854",
"@clack/prompts": "^0.11.0",
"@deepgram/sdk": "^4.11.3",

Contributor

This is 1MB again - I'll change this to a simple HTTP request.

@steipete

Contributor

Landed on main via: e637bbd, ae67925, 77f7310e0.

Summary:

  • Added Deepgram audio transcription via REST provider (no SDK).
  • Added providerOptions/baseUrl/headers + Deepgram option mapping, plus runner coverage test.
  • Added Deepgram docs + live test guidance.
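For readers wondering what an SDK-free provider involves, here is a rough sketch of preparing a request to Deepgram's pre-recorded `/v1/listen` endpoint and reading the transcript out of the response. `buildDeepgramRequest`, `extractTranscript`, and the option names are hypothetical and are not the code that landed on main; `baseUrl` and `headers` only mirror the option names mentioned in the summary.

```typescript
// Hypothetical options for a single transcription request.
interface DeepgramRequestOptions {
  apiKey: string;
  mimeType: string; // e.g. "audio/ogg" for Telegram voice messages
  model?: string;
  language?: string;
  baseUrl?: string; // override for proxies or self-hosted endpoints
  headers?: Record<string, string>; // extra headers, merged first
}

interface PreparedRequest {
  url: string;
  headers: Record<string, string>;
}

// Build the URL and headers for POST /v1/listen with raw audio as the body.
function buildDeepgramRequest(opts: DeepgramRequestOptions): PreparedRequest {
  const base = (opts.baseUrl ?? "https://api.deepgram.com/v1").replace(/\/+$/, "");
  const query: string[] = [];
  if (opts.model) query.push(`model=${encodeURIComponent(opts.model)}`);
  if (opts.language) query.push(`language=${encodeURIComponent(opts.language)}`);
  const url = query.length > 0 ? `${base}/listen?${query.join("&")}` : `${base}/listen`;
  return {
    url,
    headers: {
      ...opts.headers,
      // Deepgram expects the "Token" scheme for API keys.
      Authorization: `Token ${opts.apiKey}`,
      "Content-Type": opts.mimeType,
    },
  };
}

// Only the part of the response body that this sketch reads.
interface DeepgramListenResponse {
  results?: {
    channels?: Array<{ alternatives?: Array<{ transcript?: string }> }>;
  };
}

// Return the first alternative of the first channel, or "" if absent.
function extractTranscript(body: DeepgramListenResponse): string {
  return body.results?.channels?.[0]?.alternatives?.[0]?.transcript ?? "";
}
```

The prepared URL and headers would then be passed to `fetch` with `method: "POST"` and the audio bytes as the body, and the parsed JSON response handed to `extractTranscript`.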

Tests:

  • pnpm test src/media-understanding/providers/deepgram/audio.test.ts
  • pnpm test src/media-understanding/runner.deepgram.test.ts src/media-understanding/providers/deepgram/audio.test.ts
  • DEEPGRAM_LIVE_TEST=1 pnpm test:live src/media-understanding/providers/deepgram/audio.live.test.ts
  • pnpm docs:list

@steipete steipete closed this Jan 17, 2026
@steipete

Contributor

Correction: commit 77f7310e0 (refactor: unify media provider options) is currently local-only (main is ahead by 1). Happy to push if you want it on origin/main.
