Add runtime.stt.transcribeAudioFile for plugin STT access by benthecarman · Pull Request #22402 · openclaw/openclaw

benthecarman · 2026-02-21T03:53:21Z

Summary

Add runtime.stt.transcribeAudioFile() to PluginRuntime so external plugins can use openclaw's media-understanding provider framework for speech-to-text
New src/media-understanding/transcribe-audio.ts wraps runCapability({capability: "audio"}), the same pattern as the Discord VC implementation in "Discord: VC Support" (#18774)
Reads provider/model/apiKey from tools.media.audio in the config, with automatic provider fallback
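The "automatic provider fallback" mentioned above can be pictured roughly as follows. This is a minimal sketch and not the code from this PR: the `AudioProvider` type, the `transcribeWithFallback` function, and the result shape are hypothetical names chosen for illustration, and the real logic lives inside runCapability, which is not shown here.

```typescript
// Hypothetical provider shape; not taken from the openclaw source.
type AudioProvider = {
  id: string;
  transcribe: (filePath: string) => Promise<string>;
};

// Hypothetical result shape: text is undefined when no provider succeeded.
type FallbackResult = { text: string | undefined; provider?: string };

// Try each configured provider in order and return the first non-empty transcript.
async function transcribeWithFallback(
  filePath: string,
  providers: AudioProvider[],
): Promise<FallbackResult> {
  for (const provider of providers) {
    try {
      const text = (await provider.transcribe(filePath)).trim();
      if (text) {
        return { text, provider: provider.id };
      }
    } catch {
      // This provider failed; fall through to the next one.
    }
  }
  return { text: undefined };
}
```

In this sketch the order of the `providers` array is the fallback order, and an error or an empty transcript are both treated as a reason to move on to the next provider.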

Motivation

The marmot plugin needs to transcribe call audio chunks but can't import internal media-understanding modules (ERR_PACKAGE_PATH_NOT_EXPORTED). This mirrors how runtime.tts.textToSpeechTelephony already exposes TTS to plugins.

Usage (from a plugin)

const result = await runtime.stt.transcribeAudioFile({
  filePath: "/tmp/audio-chunk.wav",
  cfg: runtime.config.loadConfig(),
});
if (result.text) {
  // dispatch transcript to agent
}
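For a plugin that handles a stream of call audio chunks, the call above might be wrapped in a small helper. The sketch below is an assumption-heavy illustration: `SttRuntimeLike` only models the `filePath`, `cfg`, and `text` fields visible in the usage snippet, and `transcribeChunks` is a hypothetical helper, not part of openclaw or the marmot plugin.

```typescript
// Assumed shape, inferred from the usage snippet; the real PluginRuntime types may differ.
interface SttRuntimeLike {
  stt: {
    transcribeAudioFile(params: {
      filePath: string;
      cfg: unknown;
    }): Promise<{ text?: string }>;
  };
}

// Transcribe chunk files one at a time and join the non-empty transcripts in order.
async function transcribeChunks(
  runtime: SttRuntimeLike,
  cfg: unknown,
  chunkPaths: string[],
): Promise<string> {
  const parts: string[] = [];
  for (const filePath of chunkPaths) {
    const result = await runtime.stt.transcribeAudioFile({ filePath, cfg });
    const text = result.text?.trim();
    if (text) {
      parts.push(text); // skip silent or empty chunks
    }
  }
  return parts.join(" ");
}
```

Processing chunks sequentially keeps the transcript in call order; a plugin that cares more about latency could run the calls concurrently and reorder afterwards.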

Test plan

TypeScript compiles
Existing media-understanding tests still pass
Marmot plugin can call runtime.stt.transcribeAudioFile() after openclaw is rebuilt

🤖 Generated with Claude Code

Greptile Summary

Adds runtime.stt.transcribeAudioFile() to expose speech-to-text functionality to external plugins. The implementation follows the same pattern as the Discord voice manager's transcribeAudio() function, wrapping runCapability({capability: "audio"}) from the media-understanding framework.

Key changes:

New src/media-understanding/transcribe-audio.ts provides a standalone wrapper function
Function exported via PluginRuntime.stt.transcribeAudioFile
Uses same provider/model/apiKey resolution from tools.media.audio config
Properly handles cleanup via cache.cleanup() in finally block

The implementation is clean and matches established patterns in the codebase.
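The cleanup-in-finally point from the list above can be illustrated with a short sketch. Only the `cache.cleanup()` call is named in the review; the `CleanableCache` interface and `runWithCleanup` helper below are hypothetical stand-ins, not the actual contents of transcribe-audio.ts.

```typescript
// Hypothetical stand-in for the cache object; only cleanup() is named in the review.
interface CleanableCache {
  cleanup(): Promise<void>;
}

// Run some work and always release the cache afterwards, even if the work throws.
async function runWithCleanup<T>(
  cache: CleanableCache,
  work: () => Promise<T>,
): Promise<T> {
  try {
    return await work();
  } finally {
    // Executes on both the success path and the error path.
    await cache.cleanup();
  }
}
```

Putting the cleanup in `finally` means temporary resources are released whether or not transcription succeeds, while the original error still propagates to the caller.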

Confidence Score: 5/5

Safe to merge - straightforward implementation following existing patterns
The implementation directly mirrors the proven Discord voice manager pattern, properly handles resource cleanup, and uses the existing media-understanding provider framework without introducing new dependencies or risks. The only minor suggestion is around MIME type flexibility.
No files require special attention

Last reviewed commit: 70009ce

greptile-apps

3 files reviewed, 1 comment

src/media-understanding/transcribe-audio.ts

benthecarman · 2026-02-21T04:03:16Z

CI check failure is pre-existing on main — the memory/manager.async-search.test.ts and memory/qmd-manager.test.ts type errors are not from this PR. All recent main CI runs show the same failures.

openclaw-barnacle · 2026-02-28T04:04:00Z

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

Expose audio transcription through the PluginRuntime so external plugins (e.g. marmot) can use openclaw's media-understanding provider framework without importing unexported internal modules. The new transcribeAudioFile() wraps runCapability({capability: "audio"}) and reads provider/model/apiKey from tools.media.audio in the config, matching the pattern used by the Discord VC implementation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

steipete · 2026-03-02T21:43:04Z

Landed via temp rebase onto main.

Gate: pnpm vitest src/media-understanding/transcribe-audio.test.ts src/plugins/runtime/index.test.ts extensions/bluebubbles/src/monitor.test.ts
Land commit: 98e4cb342e92f3ddbcb6e3d2663ea53a179e2208
Merge commit: faa4ffe

Thanks @benthecarman!

openclaw-barnacle bot added the size: S label Feb 21, 2026

benthecarman force-pushed the add-runtime-stt branch from 09ddf92 to 70009ce on February 21, 2026 03:55

greptile-apps bot reviewed Feb 21, 2026

src/media-understanding/transcribe-audio.ts (outdated)

benthecarman force-pushed the add-runtime-stt branch from 70009ce to a30e756 on February 21, 2026 04:00

openclaw-barnacle bot added the channel: bluebubbles label Feb 21, 2026

benthecarman force-pushed the add-runtime-stt branch from a30e756 to 334a8e8 on February 21, 2026 04:35

openclaw-barnacle bot added the stale label Feb 28, 2026

benthecarman force-pushed the add-runtime-stt branch from 334a8e8 to df34b16 on February 28, 2026 21:06

openclaw-barnacle bot removed the stale label Mar 1, 2026

steipete merged commit faa4ffe into openclaw:main Mar 2, 2026
25 of 26 checks passed

benthecarman deleted the add-runtime-stt branch March 3, 2026 00:25

steipete merged 1 commit into openclaw:main from benthecarman:add-runtime-stt

2 participants
