Open
Description
Bug
Voice messages (OGG/Opus) sent via Telegram arrive as raw audio attachments and are NOT auto-transcribed, even though tools.media.audio is enabled and configured with an OpenAI model.
Config
{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          { "provider": "openai", "model": "gpt-4o-mini-transcribe" }
        ]
      }
    }
  }
}
Verified via openclaw config get tools.media: the config is loaded correctly.
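For anyone trying to reproduce this, the shape of the config above can be sanity-checked with a minimal Python sketch. It parses the JSON from this issue and confirms that audio is enabled with at least one model listed. The helper name audio_transcription_enabled is hypothetical and not part of OpenClaw. The sketch only inspects the keys shown here and assumes nothing about how the gateway itself validates its config.

```python
import json

# The config exactly as given in this issue.
CONFIG_TEXT = """
{
  "tools": {
    "media": {
      "audio": {
        "enabled": true,
        "models": [
          { "provider": "openai", "model": "gpt-4o-mini-transcribe" }
        ]
      }
    }
  }
}
"""


def audio_transcription_enabled(config: dict) -> bool:
    """Hypothetical helper: True if tools.media.audio is enabled and lists a model."""
    audio = config.get("tools", {}).get("media", {}).get("audio", {})
    models = audio.get("models") or []
    return audio.get("enabled") is True and len(models) > 0


config = json.loads(CONFIG_TEXT)
print(audio_transcription_enabled(config))  # True
```

A True result only shows that the file has the expected shape. It says nothing about whether the Telegram channel actually invokes transcription.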
Environment
- OS: Windows 10 (10.0.19045) x64
- OpenClaw: 2026.2.19-2
- Node: v24.13.0
- Channel: Telegram (bot, working correctly for text messages)
- OpenAI API key: Valid and working (manual Whisper API transcription succeeds with the same key)
Steps to Reproduce
- Set tools.media.audio.enabled: true with an OpenAI provider model
- Restart the gateway (openclaw gateway restart)
- Send a voice message via Telegram to the bot
- Voice message arrives as <media:audio> with a raw OGG file attachment
- No transcription occurs: Body is not replaced with the transcript
Expected Behavior
Voice message should be auto-transcribed and the transcript should replace the message body, per the docs at https://docs.openclaw.ai/nodes/audio
Workaround
Manual transcription via ffmpeg + OpenAI Whisper API works fine:
ffmpeg -y -i input.ogg output.wav
# Then POST to /v1/audio/transcriptions with whisper-1
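The workaround above can be sketched in Python using only the standard library. This is a minimal illustration, not what OpenClaw does internally: the function names convert_to_wav, build_transcription_request, and transcribe are made up for this example, and it assumes ffmpeg is on PATH and that the API key is passed in by the caller. The request is a multipart/form-data POST with a model field and a file field, and the transcript is read from the text field of the JSON response.

```python
import json
import os
import subprocess
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def convert_to_wav(src: str, dst: str) -> None:
    """Run the same conversion as the ffmpeg command above."""
    subprocess.run(["ffmpeg", "-y", "-i", src, dst], check=True)


def build_transcription_request(audio: bytes, filename: str, api_key: str,
                                model: str = "whisper-1") -> urllib.request.Request:
    """Build (but do not send) the multipart POST for the transcription endpoint."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        API_URL,
        data=head + audio + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(wav_path: str, api_key: str) -> str:
    """Send the WAV file and return the transcript text from the JSON response."""
    with open(wav_path, "rb") as f:
        req = build_transcription_request(f.read(), os.path.basename(wav_path), api_key)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]
```

To use it, call convert_to_wav("input.ogg", "output.wav") and then transcribe("output.wav", api_key) with the same key the gateway uses. If this succeeds while the bot still delivers the raw OGG, the key and the API are not the cause.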
Related
Possibly related to #7899 (Telegram voice messages not transcribed - applyMediaUnderstanding not called)