Bug Description
When Hermes uses Edge TTS, audio is generated natively as MP3 but is then unconditionally converted to OGG/Opus after generation. On macOS CLI this causes broken playback via afplay — the returned file is .ogg instead of .mp3, and OGG/Opus output sounds cut off or distorted locally.
Key facts:
- Edge TTS already outputs MP3 directly (await communicate.save(output_path)).
- Even when passing an explicit .mp3 output path, Hermes returns an .ogg file.
- The OGG conversion is only needed for Telegram voice-message delivery, but it fires for all platforms.
- WAV/MP3 playback works reliably on macOS; OGG does not.
Steps to Reproduce
- Set tts.provider: edge in config.yaml.
- Call text_to_speech in CLI mode with any text (e.g., "Hola, cómo estás").
- Optionally pass an explicit .mp3 output path.
- Observe that the returned file_path ends in .ogg, not .mp3.
- Play it with afplay (afplay /path/to/file.ogg): the audio is cut off or broken.
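To confirm that the returned file really is an Ogg container and not just a renamed MP3, the first bytes of the file can be inspected. The helper below is a minimal sketch and not part of Hermes; the names sniff_audio_container and sniff_file are hypothetical. It relies only on well-known signatures: Ogg pages begin with "OggS", while MP3 files begin either with an "ID3" tag or with an MPEG frame sync (11 set bits).

```python
def sniff_audio_container(header: bytes) -> str:
    """Guess the container from the first bytes of an audio file."""
    if header.startswith(b"OggS"):
        # Ogg capture pattern (used by OGG/Opus files)
        return "ogg"
    if header.startswith(b"ID3"):
        # MP3 file with a leading ID3v2 tag
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        # Bare MPEG audio frame sync: 11 consecutive set bits
        return "mp3"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first few bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_audio_container(handle.read(4))
```

Calling sniff_file on the file_path returned by text_to_speech should report "ogg" if the conversion described in this issue has run.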
Expected Behavior
Hermes should preserve the native MP3 output for CLI/local mode and only convert to OGG when delivering to Telegram (or another platform requiring Opus).
Actual Behavior
MP3→OGG conversion happens unconditionally after generation, breaking local playback.
Affected Component
CLI (interactive chat), Other
Messaging Platform (if gateway-related)
N/A (CLI only)
Debug Report
Report https://paste.rs/83g8S
agent.log https://paste.rs/jFcWk
gateway.log https://paste.rs/MLCE3
Operating System
macOS 15.7.7
Python Version
3.14.5
Hermes Version
Hermes Agent v0.13.0 (2026.5.7)
Additional Logs / Traceback (optional)
Root Cause Analysis (optional)
In tools/tts_tool.py, the post-generation conversion block (~line 1843) converts Edge/NeuTTS/MiniMax/xAI/KittenTTS/Piper output to OGG regardless of whether the platform needs Opus:
elif provider in {"edge", "neutts", "minimax", "xai", "kittentts", "piper"} and not file_str.endswith(".ogg"):
    opus_path = _convert_to_opus(file_str)
    if opus_path:
        file_str = opus_path
        voice_compatible = True
This should be gated by want_opus (which is already computed from platform == "telegram" earlier in the function) so conversion only happens when Telegram delivery is active.
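The gating described above could look roughly like the following. This is a standalone sketch, not the actual code in tools/tts_tool.py: the function name finalize_audio, the convert parameter (standing in for _convert_to_opus), and the default of voice_compatible = False are assumptions made so the example runs on its own.

```python
from typing import Callable, Optional, Tuple

# Providers named in the existing conversion branch.
OPUS_CONVERT_PROVIDERS = {"edge", "neutts", "minimax", "xai", "kittentts", "piper"}


def finalize_audio(
    file_str: str,
    provider: str,
    want_opus: bool,
    convert: Callable[[str], Optional[str]],
) -> Tuple[str, bool]:
    """Return (file path, voice_compatible), converting only when Opus is wanted."""
    voice_compatible = False  # assumption: the real function may set this earlier
    if (
        want_opus  # the new gate: True only for Telegram delivery
        and provider in OPUS_CONVERT_PROVIDERS
        and not file_str.endswith(".ogg")
    ):
        opus_path = convert(file_str)
        if opus_path:  # keep the original file if conversion fails
            file_str = opus_path
            voice_compatible = True
    return file_str, voice_compatible
```

With want_opus set to False (CLI/local mode), an Edge TTS .mp3 path passes through unchanged, so afplay receives the native MP3.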
Proposed Fix (optional)
No response
Are you willing to submit a PR for this?