When the tts tool runs, it returns a tool result containing [[audio_as_voice]]\nMEDIA:/path/to/audio.mp3. The delivery pipeline dispatches this correctly as media.
The problem is that the model then generates a follow-up assistant message that re-echoes the same MEDIA: path and adds text on top of it. The delivery pipeline processes both the tool result's media dispatch and the follow-up message, so the audio is sent twice, with unwanted text alongside it.
Example from a session log
- Model calls tts tool with text
- Tool result: [[audio_as_voice]]\nMEDIA:/tmp/tts-.../voice-123.mp3 (dispatched as media ✓)
- Model follow-up: [[tts:rate=+5%,pitch=+2%]]\nGoodnight, my love...\n\nMEDIA:/tmp/tts-.../voice-123.mp3 (dispatched again as text + duplicate media)
Root cause
Two things contribute:
- The TTS tool description says "Copy the MEDIA line exactly". This tells the model to re-output the path in its reply, but the delivery pipeline already sends the tool result's media independently.
- filterMessagingToolDuplicates only deduplicates text content, not MEDIA: URLs. So even though the same file path appears in both the tool result and the follow-up, it's sent twice.
Suggested fixes
- Remove "Copy the MEDIA line exactly" from the TTS tool description, or change it to instruct the model to reply with NO_REPLY after a successful TTS call
- Extend filterMessagingToolDuplicates to also track and deduplicate MEDIA: paths across tool results and follow-up messages
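A rough sketch of what the second fix could look like. I haven't looked at how filterMessagingToolDuplicates is actually structured, so the helper names below (extractMediaPaths, stripAlreadySentMedia) are hypothetical and the real integration point may differ. The idea is just to record MEDIA: paths already dispatched from tool results and drop matching MEDIA: lines from the follow-up message.

```typescript
// Hypothetical helpers: these names are illustrative and are NOT taken
// from the actual codebase.

const MEDIA_PREFIX = "MEDIA:";

/** Collect every path that appears on a MEDIA: line in a message body. */
function extractMediaPaths(text: string): string[] {
  const paths: string[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.startsWith(MEDIA_PREFIX)) {
      paths.push(trimmed.slice(MEDIA_PREFIX.length).trim());
    }
  }
  return paths;
}

/**
 * Drop MEDIA: lines from a follow-up message when the same path was
 * already dispatched from a tool result. All other lines are kept.
 */
function stripAlreadySentMedia(followUp: string, sentPaths: Set<string>): string {
  return followUp
    .split("\n")
    .filter((line) => {
      const trimmed = line.trim();
      if (!trimmed.startsWith(MEDIA_PREFIX)) return true;
      const path = trimmed.slice(MEDIA_PREFIX.length).trim();
      return !sentPaths.has(path);
    })
    .join("\n")
    .trimEnd();
}

// Usage, with the (elided) path from the session log kept as-is.
const toolResult = "[[audio_as_voice]]\nMEDIA:/tmp/tts-.../voice-123.mp3";
const followUp = "Goodnight, my love...\n\nMEDIA:/tmp/tts-.../voice-123.mp3";

const sentPaths = new Set(extractMediaPaths(toolResult));
const cleaned = stripAlreadySentMedia(followUp, sentPaths);
console.log(JSON.stringify(cleaned));
```

This only removes the duplicate audio; the extra text in the follow-up would still be delivered, which is why the tool description change (first fix) is probably still needed as well.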
Workaround
I added explicit instructions in TOOLS.md telling the model not to re-output MEDIA lines from TTS results. This helps but relies on model compliance; a runtime dedup would be more reliable.
Environment
- v2026.2.14 (c1feda1)
- kimi-coding/k2p5 model (but likely affects any model)
- Discord channel, Edge TTS