WhatsApp adapter missing send_voice() override: voice messages sent as text path #9236

@kaan-escober

Problem

The WhatsApp platform adapter does not have a `send_voice()` override, so voice/audio messages fall back to the base class implementation, which sends them as plain text (e.g., `🔊 Audio: /path/to/file.mp3`) instead of native media attachments.

Root Cause

`gateway/platforms/whatsapp.py` has overrides for `send_image`, `send_image_file`, `send_video`, and `send_document`, but no `send_voice`. When code in the base class (`gateway/platforms/base.py`) calls `self.send_voice()`, the call resolves to the default implementation at line ~1113, which just returns a text string.

This affects two separate paths:

  1. Auto-TTS path: user sends a voice message → `base.py` calls `self.play_tts()` → calls `self.send_voice()` → no override → text fallback
  2. Explicit TTS tool path: model calls `text_to_speech` tool → `run.py` extracts `MEDIA:` tag from tool result → `extract_media()` routes to `self.send_voice()` → same fallback

Every other platform adapter (Telegram, Discord, Signal, Slack, Matrix, BlueBubbles, WeCom, Weixin, Mattermost, Feishu) has this override. WhatsApp is the only one missing it.
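
The fallback described above is ordinary method resolution, and a minimal sketch can make it concrete. All class names and return values below are hypothetical stand-ins, not the real code in `gateway/platforms/base.py`; the only thing taken from the issue is that the default `send_voice()` produces text while an override produces native media.

```python
import asyncio
from typing import Optional


class BasePlatformAdapter:
    """Hypothetical stand-in for the base class in gateway/platforms/base.py."""

    async def send_voice(self, chat_id: str, audio_path: str,
                         caption: Optional[str] = None, **kwargs) -> str:
        # Default behaviour (assumed shape): describe the file as plain text.
        return f"text:🔊 Audio: {audio_path}"


class TelegramLikeAdapter(BasePlatformAdapter):
    """Overrides send_voice, so callers get a native attachment."""

    async def send_voice(self, chat_id: str, audio_path: str,
                         caption: Optional[str] = None, **kwargs) -> str:
        return f"media:audio:{audio_path}"


class WhatsAppLikeAdapter(BasePlatformAdapter):
    """No send_voice override, so the inherited text fallback is used."""


async def deliver(adapter: BasePlatformAdapter, audio_path: str) -> str:
    # Both the auto-TTS path and the MEDIA: tag path end in this call.
    return await adapter.send_voice("chat-1", audio_path)


telegram_result = asyncio.run(deliver(TelegramLikeAdapter(), "/tmp/reply.mp3"))
whatsapp_result = asyncio.run(deliver(WhatsAppLikeAdapter(), "/tmp/reply.mp3"))
```

Because the caller only ever invokes `send_voice()` on the adapter, nothing fails loudly when the override is missing; the adapter without it silently returns the text form.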

Fix

Add a `send_voice()` method to `gateway/platforms/whatsapp.py`:

```python
async def send_voice(
    self,
    chat_id: str,
    audio_path: str,
    caption: Optional[str] = None,
    reply_to: Optional[str] = None,
    **kwargs,
) -> SendResult:
    """Send audio file as native media via bridge."""
    return await self._send_media_to_bridge(chat_id, audio_path, "audio", caption)
```

This routes through the existing `_send_media_to_bridge()` method with `media_type="audio"`. The bridge (`bridge.js`, lines ~468-471) already handles the `"audio"` type: it reads the file and sends `sock.sendMessage({ audio: buffer, mimetype: "audio/mpeg" })`.
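
One way to check the forwarding is a small stub that records what reaches the bridge helper. This is a sketch under assumptions: `FakeWhatsAppAdapter` and the `SendResult` dataclass are hypothetical stand-ins, and the parameter names of `_send_media_to_bridge()` are guessed from the positional call in the proposed method rather than taken from the real adapter.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SendResult:
    """Hypothetical stand-in for the gateway's SendResult type."""
    success: bool


class FakeWhatsAppAdapter:
    """Stub adapter: records bridge calls instead of contacting bridge.js."""

    def __init__(self) -> None:
        self.bridge_calls: List[Tuple[str, str, str, Optional[str]]] = []

    async def _send_media_to_bridge(self, chat_id, file_path, media_type, caption=None):
        # Assumed positional order, mirroring the call in the proposed fix.
        self.bridge_calls.append((chat_id, file_path, media_type, caption))
        return SendResult(success=True)

    async def send_voice(self, chat_id: str, audio_path: str,
                         caption: Optional[str] = None,
                         reply_to: Optional[str] = None, **kwargs) -> SendResult:
        # Same body as the proposed override.
        return await self._send_media_to_bridge(chat_id, audio_path, "audio", caption)


adapter = FakeWhatsAppAdapter()
result = asyncio.run(adapter.send_voice("chat-1", "/tmp/reply.mp3", reply_to="msg-9"))
```

The recorded call shows that `reply_to` is accepted but not forwarded, exactly as in the proposed method, so the audio would not be sent as a quoted reply unless the bridge helper is extended.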

Note

The bridge sends `.mp3` files as regular audio attachments (music player style). Voice note bubbles (PTT) are only sent for `.ogg` or `.opus` extensions. This is correct WhatsApp behavior, not a bug. If voice bubbles are desired for TTS, the adapter would need to either convert the file to `.ogg` or add `ptt: true` to the bridge audio payload.
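
If the conversion route were chosen, the adapter side could look roughly like the sketch below. Both helper names are hypothetical, the extension set simply restates the rule above, and the command assumes an `ffmpeg` build with the `libopus` encoder is installed; the 32k bitrate is an arbitrary illustrative value. The sketch only builds the command and does not run it.

```python
from pathlib import Path
from typing import List

# Extensions the bridge is described as sending as voice note bubbles.
VOICE_NOTE_EXTENSIONS = {".ogg", ".opus"}


def is_voice_note_candidate(audio_path: str) -> bool:
    """Return True if the file extension would produce a PTT bubble."""
    return Path(audio_path).suffix.lower() in VOICE_NOTE_EXTENSIONS


def build_opus_convert_command(audio_path: str) -> List[str]:
    """Build (but do not run) an ffmpeg command converting the file to Ogg/Opus."""
    target = Path(audio_path).with_suffix(".ogg")
    return [
        "ffmpeg",
        "-y",                # overwrite the target if it already exists
        "-i", audio_path,    # input file, e.g. the TTS .mp3
        "-c:a", "libopus",   # encode audio with the Opus codec
        "-b:a", "32k",       # illustrative bitrate, not taken from the issue
        str(target),
    ]


mp3_is_voice_note = is_voice_note_candidate("/tmp/reply.mp3")
ogg_is_voice_note = is_voice_note_candidate("/tmp/reply.OGG")
command = build_opus_convert_command("/tmp/reply.mp3")
```

A caller would run the command (for example with `asyncio.create_subprocess_exec`) only when `is_voice_note_candidate()` returns False, then pass the resulting `.ogg` path to `send_voice()`.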

Metadata

    Labels

    P1 (High: major feature broken, no workaround), comp/gateway (Gateway runner, session dispatch, delivery), platform/whatsapp (WhatsApp Business adapter), sweeper:implemented-on-main (Sweeper: behavior already present on current main), tool/tts (Text-to-speech and transcription), type/bug (Something isn't working)