Add runtime.stt.transcribeAudioFile for plugin STT access #22402
Merged
steipete merged 1 commit into openclaw:main on Mar 2, 2026
This pull request has been automatically marked as stale due to inactivity.
Expose audio transcription through the PluginRuntime so external
plugins (e.g. marmot) can use openclaw's media-understanding provider
framework without importing unexported internal modules.
The new transcribeAudioFile() wraps runCapability({capability: "audio"})
and reads provider/model/apiKey from tools.media.audio in the config,
matching the pattern used by the Discord VC implementation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Landed via temp rebase onto main.
Thanks @benthecarman!
Summary
- Adds runtime.stt.transcribeAudioFile() to PluginRuntime so external plugins can use openclaw's media-understanding provider framework for speech-to-text
- New src/media-understanding/transcribe-audio.ts wraps runCapability({capability: "audio"}), the same pattern as the Discord VC implementation in Discord: VC Support #18774
- Reads provider/model/apiKey from tools.media.audio in the config, with automatic provider fallback
Motivation
The marmot plugin needs to transcribe call audio chunks but can't import internal media-understanding modules (ERR_PACKAGE_PATH_NOT_EXPORTED). This mirrors how runtime.tts.textToSpeechTelephony already exposes TTS to plugins.
Usage (from a plugin)
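A minimal sketch of what a plugin-side call might look like. The parameter and result shapes (filePath, cfg, text), the SttRuntime interface, the transcribeChunk helper, and the stub are all assumptions made for illustration; the PR text confirms only that runtime.stt.transcribeAudioFile() exists and reads its provider settings from tools.media.audio.

```typescript
// Hypothetical shapes: the real PluginRuntime types are not shown here.
interface TranscribeAudioFileParams {
  filePath: string; // path to an audio file on disk (assumed parameter name)
  cfg: unknown; // openclaw config; tools.media.audio selects provider/model/apiKey
}

interface TranscribeAudioFileResult {
  text?: string; // assumed: undefined when no provider produced a transcript
}

interface SttRuntime {
  transcribeAudioFile(
    params: TranscribeAudioFileParams,
  ): Promise<TranscribeAudioFileResult>;
}

// Plugin-side helper (hypothetical): transcribe one call-audio chunk and
// return an empty string instead of throwing when transcription fails.
async function transcribeChunk(
  stt: SttRuntime,
  cfg: unknown,
  filePath: string,
): Promise<string> {
  try {
    const { text } = await stt.transcribeAudioFile({ filePath, cfg });
    return text?.trim() ?? "";
  } catch (err) {
    console.warn(`transcription failed for ${filePath}:`, err);
    return "";
  }
}

// Stub standing in for runtime.stt so the helper can be exercised without openclaw.
const stubStt: SttRuntime = {
  async transcribeAudioFile({ filePath }) {
    return { text: ` transcript of ${filePath} ` };
  },
};
```

In a real plugin the stt argument would be runtime.stt from the PluginRuntime passed to the plugin; the stub only keeps the sketch self-contained.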
Test plan
- Verify runtime.stt.transcribeAudioFile() after openclaw is rebuilt
🤖 Generated with Claude Code
Greptile Summary
Adds runtime.stt.transcribeAudioFile() to expose speech-to-text functionality to external plugins. The implementation follows the same pattern as the Discord voice manager's transcribeAudio() function, wrapping runCapability({capability: "audio"}) from the media-understanding framework.
Key changes:
- src/media-understanding/transcribe-audio.ts provides a standalone wrapper function
- Exposed to plugins as PluginRuntime.stt.transcribeAudioFile
- Reads provider settings from the tools.media.audio config
- Calls cache.cleanup() in a finally block
The implementation is clean and matches established patterns in the codebase.
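The two behaviours described here, provider fallback and cleanup in a finally block, can be sketched generically. This is not the code from transcribe-audio.ts: AttachmentCache, AudioProvider, and transcribeWithFallback are hypothetical stand-ins, and the real wrapper delegates to runCapability rather than looping over providers itself.

```typescript
// Hypothetical stand-ins for illustration only.
interface AttachmentCache {
  cleanup(): Promise<void>; // releases any temp files held by the cache
}

type AudioProvider = (filePath: string) => Promise<string>;

async function transcribeWithFallback(
  filePath: string,
  providers: AudioProvider[],
  cache: AttachmentCache,
): Promise<{ text: string | undefined }> {
  try {
    // Try each configured provider in order; the first non-empty transcript wins.
    for (const provider of providers) {
      try {
        const text = (await provider(filePath)).trim();
        if (text) return { text };
      } catch {
        // This provider failed; fall through to the next one.
      }
    }
    return { text: undefined };
  } finally {
    // Runs on success, on exhaustion of providers, and on unexpected errors.
    await cache.cleanup();
  }
}
```

Putting the cleanup in finally means temporary resources are released even when every provider fails or an unexpected error escapes the loop.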
Confidence Score: 5/5
Last reviewed commit: 70009ce