Closed
kevinWangSheng/openclaw
#30
Description
Discord voice messages generated from a 24kHz TTS source play back at roughly 0.5x speed (noticeably slow). The same MP3 file sent as a regular attachment plays at normal speed.
Environment
- OpenClaw latest (npm)
- TTS provider: OpenAI-compatible (mlx-audio server)
- TTS model: mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit
- Source audio: MP3, 24000 Hz, mono, 128 kbps
- Channel: Discord
Steps to Reproduce
- Configure TTS with an OpenAI-compatible server that outputs 24kHz MP3 (e.g. mlx-audio)
- Use the tts tool to generate a voice message
- Listen to the Discord voice message; it plays at ~0.5x speed
Expected
Voice message plays at normal speed (1x).
Actual
Voice message plays at ~0.5x speed. The audio sounds slowed down.
Analysis
The TTS server outputs MP3 at 24000 Hz sample rate. When OpenClaw converts this to a Discord voice message (opus encoding), it appears to assume 48kHz input, causing the 0.5x playback speed. Sending the same MP3 as a regular file attachment plays correctly.
ffprobe output of TTS source:
Stream #0:0: Audio: mp3 (mp3float), 24000 Hz, mono, fltp, 128 kb/s
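As a rough illustration of checking the source rate before conversion, the sketch below pulls the sample rate out of an ffprobe stream line like the one above and compares it with 48 kHz, the rate the analysis suspects the Opus step assumes. The function names and the ASSUMED_OPUS_RATE constant are hypothetical and not taken from OpenClaw.

```python
import re
from typing import Optional

# Hypothetical constant: the input rate the analysis suspects the encoder assumes.
ASSUMED_OPUS_RATE = 48_000


def parse_sample_rate(stream_line: str) -> Optional[int]:
    """Return the sample rate in Hz from an ffprobe stream description, or None."""
    match = re.search(r"(\d+)\s*Hz", stream_line)
    return int(match.group(1)) if match else None


def needs_resample(stream_line: str, target_rate: int = ASSUMED_OPUS_RATE) -> bool:
    """True when a sample rate is present and differs from the target rate."""
    rate = parse_sample_rate(stream_line)
    return rate is not None and rate != target_rate


line = "Stream #0:0: Audio: mp3 (mp3float), 24000 Hz, mono, fltp, 128 kb/s"
print(parse_sample_rate(line))
print(needs_resample(line))
```

A check like this could gate an internal resampling step, so that sources already at 48 kHz are passed through untouched.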
Workaround
Resampling the audio to 48kHz before passing to OpenClaw would likely fix it, but this should be handled internally.
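A minimal sketch of that workaround, assuming ffmpeg is installed and on PATH. The helper names and file paths are illustrative and not part of OpenClaw. It builds an ffmpeg command that resamples the TTS output to 48 kHz using the standard -ar option.

```python
import subprocess
from typing import List


def build_resample_command(src: str, dst: str, rate: int = 48_000) -> List[str]:
    """Build an ffmpeg argument list that resamples src to `rate` Hz."""
    # -y overwrites dst if it exists; -ar sets the output audio sample rate.
    return ["ffmpeg", "-y", "-i", src, "-ar", str(rate), dst]


def resample(src: str, dst: str, rate: int = 48_000) -> None:
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits non-zero."""
    subprocess.run(build_resample_command(src, dst, rate), check=True)


print(build_resample_command("tts_24k.mp3", "tts_48k.mp3"))
```

Calling resample("tts_24k.mp3", "tts_48k.mp3") would write a 48 kHz copy that could then be sent as the voice message source. Whether this removes the slowdown has not been verified here.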