feat(audio): auto-echo transcription to chat before agent processing#32150
steipete merged 2 commits into openclaw:main
When echoTranscript is enabled in tools.media.audio config, the
transcription text is sent back to the originating chat immediately
after successful audio transcription — before the agent processes it.
This lets users verify what was heard from their voice note.
Changes:
- config/types.tools.ts: add echoTranscript (bool) and echoFormat
(string template) to MediaUnderstandingConfig
- media-understanding/apply.ts: sendTranscriptEcho() helper that
resolves channel/to from ctx, guards on isDeliverableMessageChannel,
and calls deliverOutboundPayloads best-effort
- config/schema.help.ts: help text for both new fields
- config/schema.labels.ts: labels for both new fields
- media-understanding/apply.echo-transcript.test.ts: 10 vitest cases
covering disabled/enabled/custom-format/no-audio/failed-transcription/
non-deliverable-channel/missing-from/OriginatingTo/delivery-failure
Default echoFormat: '📝 "{transcript}"'
Closes openclaw#32102
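For reference, the two new fields and their defaults might be resolved roughly as follows. This is a minimal sketch, not the PR's code: `ConfigSketch`, `resolveAudioEcho`, and `formatEcho` are hypothetical names, and only the field names and default values come from the description above.

```typescript
// Hypothetical, trimmed-down config shape; the real MediaUnderstandingConfig has more fields.
type AudioEchoOptions = {
  echoTranscript?: boolean;
  echoFormat?: string;
};

type ConfigSketch = {
  tools?: { media?: { audio?: AudioEchoOptions } };
};

// Default format as stated in the commit message.
const DEFAULT_ECHO_FORMAT = '📝 "{transcript}"';

// Hypothetical helper: fills in the defaults when the keys are absent.
function resolveAudioEcho(cfg: ConfigSketch): Required<AudioEchoOptions> {
  const audio = cfg.tools?.media?.audio;
  return {
    echoTranscript: audio?.echoTranscript ?? false,
    echoFormat: audio?.echoFormat ?? DEFAULT_ECHO_FORMAT,
  };
}

// Substitutes every {transcript} placeholder. split/join is used so that "$"
// sequences in the transcript are never treated as replacement patterns.
function formatEcho(format: string, transcript: string): string {
  return format.split("{transcript}").join(transcript);
}
```

With an empty config this yields `echoTranscript: false`, so nothing is echoed unless the user opts in.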
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ad765098d9
    echoTranscript?: boolean;
    /**
     * Format string for the echoed transcript. Use `{transcript}` as placeholder.
     * Default: '📝 "{transcript}"'
     */
Add audio echo options to validated config schema
The new tools.media.audio.echoTranscript / echoFormat settings are declared and documented here, but they were not added to the runtime Zod schema (src/config/zod-schema.core.ts, ToolsMediaUnderstandingSchema at lines 674-685), which is .strict(). As a result, any config that follows this commit’s docs and sets these keys will fail validation as unrecognized fields, so the feature cannot be enabled through normal config loading.
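To illustrate the failure mode: a strict object schema rejects any key it does not declare. The sketch below is dependency-free and only mimics that behavior; `findUnknownKeys` and the `enabled` key are hypothetical stand-ins, not the contents of ToolsMediaUnderstandingSchema. In Zod itself the fix would presumably amount to declaring both keys as optional fields on the strict object.

```typescript
// Hypothetical stand-in for strict-object validation: any key that is not
// in the allow-list is reported as unrecognized.
function findUnknownKeys(
  input: Record<string, unknown>,
  allowed: readonly string[],
): string[] {
  return Object.keys(input).filter((key) => allowed.indexOf(key) === -1);
}

// A user config that follows the new docs.
const userAudioConfig = {
  enabled: true, // illustrative pre-existing key, assumed for this sketch
  echoTranscript: true,
  echoFormat: "Heard: {transcript}",
};

// Before the fix: the new keys are not declared, so validation would fail.
const keysBeforeFix = ["enabled"];
const rejectedBefore = findUnknownKeys(userAudioConfig, keysBeforeFix);

// After the fix: both keys are declared and the same config passes.
const keysAfterFix = [...keysBeforeFix, "echoTranscript", "echoFormat"];
const rejectedAfter = findUnknownKeys(userAudioConfig, keysAfterFix);
```

Here `rejectedBefore` contains both new keys while `rejectedAfter` is empty, which is the difference between a config that fails to load and one that enables the feature.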
Greptile Summary
This PR adds a new optional feature that echoes audio transcriptions back to the chat before agent processing, so users can verify what was heard from their voice messages.
Implementation quality: The implementation is clean, well-tested, and handles edge cases appropriately. No issues found.
Confidence Score: 5/5
Last reviewed commit: ad76509
Landed via temp rebase onto main.
Thanks @AytuncYildizli!
Summary
When a user sends a voice message, the transcription is now echoed back to the chat before the agent processes it — so users can immediately verify what was heard.
What changed
Config (tools.media.audio)
Two new optional fields on MediaUnderstandingConfig:

    {
      "tools": {
        "media": {
          "audio": {
            "echoTranscript": true,                  // default: false
            "echoFormat": "🎙️ Heard: {transcript}"   // default: '📝 "{transcript}"'
          }
        }
      }
    }

Echo logic (src/media-understanding/apply.ts)
After transcription succeeds and ctx.Transcript is set, if echoTranscript: true:
- Resolves the channel and recipient from ctx.Provider/ctx.Surface and ctx.OriginatingTo/ctx.From
- Guards on isDeliverableMessageChannel, silently skipping internal/non-deliverable channels
- Calls deliverOutboundPayloads with bestEffort: true, so delivery failure never blocks transcription
- Formats the message with the echoFormat template ({transcript} placeholder)
Schema docs
- src/config/schema.help.ts: help text for both new fields
- src/config/schema.labels.ts: UI labels for both new fields
Tests (src/media-understanding/apply.echo-transcript.test.ts)
10 test cases, including:
- echoTranscript: false (explicit)
- echoTranscript absent (default)
- echoTranscript: true
- custom echoFormat
- missing From/OriginatingTo
- OriginatingTo present, From absent
Checklist: any; bun test; .js extension
Closes #32102
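The echo flow described in this summary can be sketched as below. This is an illustration under stated assumptions, not the code in apply.ts: `EchoCtx`, `planTranscriptEcho`, `sendTranscriptEchoSketch`, and the `deliver` callback are hypothetical, the real deliverOutboundPayloads signature is not shown in this PR text, and the precedence between ctx.Provider/ctx.Surface and between ctx.OriginatingTo/ctx.From is assumed.

```typescript
// Hypothetical, reduced context shape; field names follow the PR description.
type EchoCtx = {
  Provider?: string;
  Surface?: string;
  OriginatingTo?: string;
  From?: string;
  Transcript?: string;
};

type EchoConfig = { echoTranscript?: boolean; echoFormat?: string };
type EchoPlan = { channel: string; to: string; text: string };

const FALLBACK_ECHO_FORMAT = '📝 "{transcript}"';

// Pure step: decide whether to echo and what to send. Returns null to skip.
function planTranscriptEcho(
  ctx: EchoCtx,
  cfg: EchoConfig,
  isDeliverable: (channel: string) => boolean,
): EchoPlan | null {
  if (!cfg.echoTranscript || !ctx.Transcript) return null;

  // Assumption: Provider is tried before Surface, OriginatingTo before From.
  const channel = ctx.Provider ?? ctx.Surface;
  const to = ctx.OriginatingTo ?? ctx.From;
  if (!channel || !to || !isDeliverable(channel)) return null;

  const format = cfg.echoFormat ?? FALLBACK_ECHO_FORMAT;
  return { channel, to, text: format.split("{transcript}").join(ctx.Transcript) };
}

// Effectful step: best-effort delivery. Errors are swallowed so that a failed
// echo never blocks the transcription pipeline.
async function sendTranscriptEchoSketch(
  ctx: EchoCtx,
  cfg: EchoConfig,
  isDeliverable: (channel: string) => boolean,
  deliver: (plan: EchoPlan) => Promise<void>, // hypothetical stand-in for the real delivery call
): Promise<boolean> {
  const plan = planTranscriptEcho(ctx, cfg, isDeliverable);
  if (!plan) return false;
  try {
    await deliver(plan);
    return true;
  } catch {
    return false;
  }
}
```

Splitting the decision from the delivery keeps the skip conditions (disabled, no transcript, missing recipient, non-deliverable channel) testable without mocking any network call.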