Closed
Bug Description
The voice-call plugin's embedded agent runs include the gateway's built-in `tts` tool in the available toolset. The LLM consistently calls `tts` instead of returning plain text, even when the `responseSystemPrompt` explicitly says "Never call the tts tool" and "Always answer with plain spoken text."
This results in completely silent voice calls: the caller hears the initial greeting (static TTS) but never gets an AI response. The embedded run completes successfully (`isError=false`), but the voice-call plugin has no plain text to route through ElevenLabs telephony TTS.
Reproduction
- Configure the voice-call plugin with `responseModel` (Sonnet or Haiku)
- Set `agents.defaults.thinkingDefault: "high"` (common for coding agents)
- Call the Twilio number
- Speak after the greeting
- Wait: there is no response, and the call eventually times out
Logs
Every call shows the same pattern:
embedded run start: provider=anthropic model=claude-sonnet-4-6 thinking=high messageChannel=voice
embedded run tool start: tool=tts ← model calls tts instead of text
embedded run tool end: tool=tts ← completes in ~200ms
embedded run agent end: isError=false ← "succeeds" with no spoken output
No `[voice-call] AI response:` line is ever logged because the response is a tool call, not text.
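The failure mode in these logs (a run that ends with `isError=false` but carries no text) could be surfaced explicitly instead of failing silently. The following is a minimal sketch only: `EmbeddedRunResult`, `VoiceOutcome`, and `classifyVoiceRun` are hypothetical names invented for illustration, and the real result shape of an embedded run is not shown in this issue.

```typescript
// Hypothetical result shape; the real embedded-run result type is an assumption.
interface EmbeddedRunResult {
  isError: boolean;
  text?: string;
  toolCalls: string[]; // names of tools invoked during the run
}

type VoiceOutcome =
  | { kind: "speak"; text: string }
  | { kind: "silent"; reason: string };

// Decide whether a finished run produced anything that can be spoken.
function classifyVoiceRun(result: EmbeddedRunResult): VoiceOutcome {
  if (result.isError) {
    return { kind: "silent", reason: "run failed" };
  }
  const text = (result.text ?? "").trim();
  if (text.length > 0) {
    return { kind: "speak", text };
  }
  // "Successful" run with no plain text: the case described in this issue.
  const tools = result.toolCalls.length > 0 ? result.toolCalls.join(", ") : "none";
  return { kind: "silent", reason: `no plain text returned (tools called: ${tools})` };
}

// The pattern from the logs above: tts was called, no text came back.
const outcome = classifyVoiceRun({ isError: false, toolCalls: ["tts"] });
if (outcome.kind === "silent") {
  console.warn(`[voice-call] nothing to speak: ${outcome.reason}`);
}
```

A check like this would not fix the bug, but it would make the silent call visible in the logs rather than leaving only the absence of an `AI response:` line.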
What doesn't work
- `tools.deny: ["tts"]` on the voice agent: accepted by the schema and dynamically reloaded, but the voice-call plugin's embedded runs bypass the agent's tool config
- Switching models (Haiku, Sonnet): both call `tts`
- Resetting/purging the voice session: the problem happens on fresh sessions too
- Changing `thinkingDefault`: the model calls `tts` at every thinking level
- The system prompt already says "Never call the `tts` tool": the model ignores it
Expected Behavior
The voice-call plugin should either:
- Strip the `tts` tool from embedded run toolsets (preferred, since voice calls should never use gateway TTS)
- Intercept `tts` tool calls and route them through the telephony TTS pipeline (ElevenLabs/etc)
- Respect per-agent `tools.deny` in embedded runs
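To make the first and third options concrete, here is a minimal sketch of filtering the toolset before an embedded voice run starts. It is an illustration under assumptions, not OpenClaw's actual code: `ToolDef`, `EmbeddedRunToolOptions`, and `filterVoiceTools` are hypothetical names, and the tool names other than `tts` are placeholders.

```typescript
// Hypothetical shapes; OpenClaw's real tool and config types are not shown in this issue.
interface ToolDef {
  name: string;
}

interface EmbeddedRunToolOptions {
  // Per-agent deny list, mirroring the `tools.deny` setting described above.
  agentDeny?: string[];
  // Tools the voice channel should never expose (assumed default: ["tts"]).
  channelDeny?: string[];
}

// Remove every tool named in either deny list before the run sees the toolset.
function filterVoiceTools<T extends ToolDef>(
  tools: T[],
  opts: EmbeddedRunToolOptions = {},
): T[] {
  const denied = new Set<string>([
    ...(opts.channelDeny ?? ["tts"]),
    ...(opts.agentDeny ?? []),
  ]);
  return tools.filter((tool) => !denied.has(tool.name));
}

// Placeholder toolset: "web_search" and "read" are illustrative names only.
const available: ToolDef[] = [{ name: "tts" }, { name: "web_search" }, { name: "read" }];
const voiceTools = filterVoiceTools(available, { agentDeny: ["read"] });
console.log(voiceTools.map((t) => t.name)); // only "web_search" remains
```

Filtering at the toolset level means the model never sees `tts` at all, which is more dependable than a prompt instruction that the logs show being ignored.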
Environment
- OpenClaw v2026.2.24 (stable)
- macOS, LaunchAgent-managed gateway
- Voice pipeline: Twilio → OpenAI Realtime STT → LLM → ElevenLabs TTS
- Tested with both `anthropic/claude-sonnet-4-6` and `anthropic/claude-haiku-4-5`
Related
- feat(voice-call): per-agent thinkingDefault override for voice calls #27020 (per-agent thinkingDefault override)
- Discussion Reducing /v1/chat/completions latency for real-time voice agents #10588 (voice agent latency)