Description
When using Talk Mode on the macOS companion app with a non-ElevenLabs TTS provider (e.g. system), two problems occur:
- Double playback: Every assistant reply is spoken twice back-to-back
- Premature cutoff for CJK languages: The watchdog timer uses a flat 0.08s/char estimate that is too short for Korean, Chinese, and Japanese text, causing speech to be killed mid-sentence
Environment
- OpenClaw: 2026.3.13
- macOS: Darwin 25.2.0 (Mac mini)
- Talk provider: system (also reproduced with openai)
Steps to Reproduce
Double playback
- Set talk.provider to system (or any non-ElevenLabs provider)
- Enable Talk Mode in the Mac app
- Speak a message and wait for the assistant reply
- Result: Audio plays twice back-to-back
CJK cutoff
- Set Talk Mode language to Korean (ko-KR)
- Ask a question that produces a 50+ character response
- Result: Speech stops mid-sentence after ~4 seconds (50 × 0.08 = 4s watchdog)
Root Cause
Double playback
In TalkModeRuntime.playAssistant(), the error handling always falls back to playSystemVoice() regardless of which TTS provider failed:
    do {
        if apiKey != nil && voiceId != nil {
            try await self.playElevenLabs(...)
        } else {
            try await self.playSystemVoice(...) // 1st play
        }
    } catch {
        // Always falls back to system voice, even when system voice itself failed
        try await self.playSystemVoice(...) // 2nd play (duplicate!)
    }
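The control flow above can be modeled in a few lines. This is a toy sketch, not the app's code: it is synchronous for brevity, every name in it is a hypothetical stand-in, and it assumes the first system-voice call throws only after it has already produced audio, as the "1st play" / "2nd play" comments suggest.

```swift
// Toy model of the buggy control flow; all names are hypothetical stand-ins.
struct PlaybackError: Error {}

final class PlaybackLog {
    private(set) var plays: [String] = []
    func record(_ source: String) { plays.append(source) }
}

/// Stand-in for playSystemVoice(): the audio is heard first, then the call
/// optionally throws (assumption: the failure surfaces after speaking).
func playSystemVoice(log: PlaybackLog, failAfterSpeaking: Bool) throws {
    log.record("system")
    if failAfterSpeaking { throw PlaybackError() }
}

/// Mirrors the structure shown above: the catch block replays with the
/// system voice no matter which branch threw.
func playAssistantBuggy(hasElevenLabsConfig: Bool, log: PlaybackLog) {
    do {
        if hasElevenLabsConfig {
            log.record("elevenlabs")
        } else {
            try playSystemVoice(log: log, failAfterSpeaking: true)
        }
    } catch {
        try? playSystemVoice(log: log, failAfterSpeaking: false)
    }
}

let log = PlaybackLog()
playAssistantBuggy(hasElevenLabsConfig: false, log: log)
print(log.plays) // ["system", "system"]
```

With no ElevenLabs configuration, the log ends up with two "system" entries, which matches the double playback reported here.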
CJK cutoff
The watchdog uses 0.08s/char for all languages, but a CJK character typically represents a full syllable, so each character takes longer to speak than an English letter:
| Language | Syllables/sec (research) | Chars/syllable | Actual time/char |
|---|---|---|---|
| English | 6.19 SPS | ~5 | ~0.08s |
| Korean | 5.96 SPS | 1 | ~0.25s |
| Chinese | 5.18 SPS | 1 | ~0.28s |
| Japanese | 7.84 SPS | ~1.5 (mixed) | ~0.20s |
Source: Pellegrino et al., Science Advances (2019)
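To make the gap concrete, the sketch below compares the flat watchdog against the table's per-character values for a 50-character reply. The function names and the two-letter language keys are assumptions made for illustration; only the numbers come from the table.

```swift
// Per-character speaking time in seconds, copied from the "Actual time/char"
// column of the table. The two-letter keys are an assumption of this sketch.
let tableSecondsPerChar: [String: Double] = [
    "en": 0.08, "ko": 0.25, "zh": 0.28, "ja": 0.20,
]

/// The current flat watchdog: 0.08 s per character for every language.
func flatWatchdogSeconds(charCount: Int) -> Double {
    return Double(charCount) * 0.08
}

/// Expected speaking time using the table's per-character value;
/// unknown languages fall back to the flat 0.08 s.
func expectedSpeechSeconds(charCount: Int, language: String) -> Double {
    return Double(charCount) * (tableSecondsPerChar[language] ?? 0.08)
}

let chars = 50
for language in ["en", "ko", "zh", "ja"] {
    let expected = expectedSpeechSeconds(charCount: chars, language: language)
    let watchdog = flatWatchdogSeconds(charCount: chars)
    // "cut off" is true when speech is expected to outlast the watchdog.
    print(language, "expected:", expected, "watchdog:", watchdog, "cut off:", expected > watchdog)
}
```

For a 50-character Korean reply this gives an expected 12.5 s of speech against a 4 s watchdog, which is consistent with the cutoff after about 4 seconds described in the reproduction steps.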
Fix
- Only fall back to system voice when ElevenLabs fails — not when system voice itself fails
- Use language-specific per-character estimates with 3x safety margin for the watchdog
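Both fixes could look roughly like the sketch below. It is not the actual patch: the real TalkModeRuntime methods are async and take arguments this issue elides, so the sketch uses synchronous closures, and `TTSProvider`, `playWithFallback`, and `watchdogTimeout` are hypothetical names. The per-character values are taken from the table in the Root Cause section as one plausible choice.

```swift
// Hypothetical names throughout; synchronous closures stand in for the
// async playElevenLabs / playSystemVoice methods.
enum TTSProvider { case elevenLabs, system }

/// Fix 1: fall back to the system voice only when ElevenLabs was the
/// provider that failed.
func playWithFallback(
    provider: TTSProvider,
    playElevenLabs: () throws -> Void,
    playSystemVoice: () throws -> Void
) throws {
    switch provider {
    case .elevenLabs:
        do {
            try playElevenLabs()
        } catch {
            try playSystemVoice()
        }
    case .system:
        // No catch here: a system-voice failure propagates to the caller
        // instead of replaying the same reply.
        try playSystemVoice()
    }
}

/// Fix 2: language-specific per-character estimate with a 3x safety margin.
/// Returns the watchdog timeout in seconds.
func watchdogTimeout(for text: String, languageCode: String, safetyMargin: Double = 3.0) -> Double {
    let perChar: Double
    switch languageCode.prefix(2).lowercased() {
    case "ko": perChar = 0.25
    case "zh": perChar = 0.28
    case "ja": perChar = 0.20
    default: perChar = 0.08 // the existing flat estimate
    }
    return Double(text.count) * perChar * safetyMargin
}
```

Under these assumptions a 50-character ko-KR reply gets a 37.5 s watchdog (50 × 0.25 × 3) instead of 4 s, and a failing system voice is attempted once rather than twice.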
Related Issues