Bug: WhatsApp outbound media (MEDIA: token) fails silently — workaround: openclaw message send --media #54131

@felpsanchez

Summary

When an agent includes a MEDIA: path in its reply text, the file is never delivered to WhatsApp. The outbound log always shows mediaUrl=null, mediaKind=null. No error is logged.

Environment

  • OpenClaw: 2026.3.23-2
  • macOS: Darwin 25.3.0 arm64 (Mac mini M4)
  • Gateway mode: LaunchDaemon (boot without login)
  • Channel: WhatsApp Web (Baileys)

Steps to Reproduce

  1. Agent generates an audio file (OGG Opus) in /workspace/media/nota.ogg
  2. Agent replies with MEDIA:media/nota.ogg or MEDIA:/absolute/path/nota.ogg
  3. WhatsApp receives a text message or nothing — no audio attachment

What I Tried

  • MEDIA:media/nota.ogg (relative) → fails silently
  • MEDIA:./media/nota.ogg → fails silently
  • MEDIA:/absolute/path.ogg → blocked by security policy (expected)
  • MEDIA:~/path.ogg → fails silently
  • MEDIA:http://127.0.0.1:PORT/nota.ogg → fails silently
  • MP3 format → fails silently
  • OGG Opus format → fails silently

Investigation

Traced through the source:

  1. splitMediaFromOutput correctly parses the MEDIA: token
  2. getAgentScopedMediaLocalRoots correctly includes the workspace path
  3. assertLocalMediaAllowed passes (workspace is in allowed roots)
  4. loadWebMedia appears to fail silently before populating mediaBuffer
  5. Result: hasMedia: false in outbound log, message sent as text-only

The gateway log shows "mediaUrl":null,"mediaSizeBytes":null,"mediaKind":null on every auto-reply send, even when the agent output contained a valid MEDIA: token with an existing, accessible file.
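A quick way to confirm the symptom is to count the outbound log entries that carry a null media URL. This is a minimal sketch, assuming the gateway log is a text file with one entry per line; the function name count_null_media is hypothetical and the log path is a placeholder you need to supply.

```shell
#!/bin/sh
# count_null_media LOGFILE
# Hypothetical helper: prints how many lines in LOGFILE contain "mediaUrl":null.
count_null_media() {
    # grep -c prints the count but exits 1 when the count is 0;
    # '|| true' keeps that case from being treated as a failure.
    grep -c '"mediaUrl":null' "$1" || true
}
```

Usage: count_null_media /path/to/gateway.log. If the count rises by one for every auto-reply that contained a MEDIA: token, the media is being dropped before the send step.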

Workaround (Confirmed Working)

Use the CLI directly instead of the MEDIA: token:

# Generate audio (ElevenLabs + sag)
sag --voice VOICE_ID --model-id eleven_multilingual_v2 -o /tmp/tts.mp3 "text"

# Convert to OGG Opus (required format for WhatsApp voice notes)
ffmpeg -y -i /tmp/tts.mp3 -c:a libopus -b:a 64k -ar 48000 -ac 1 -application voip /workspace/media/nota.ogg

# Send via CLI — THIS WORKS
openclaw message send --channel whatsapp --target "+57XXXXXXXXXX" --media "/workspace/media/nota.ogg"

The openclaw message send --media command delivers the file correctly as a WhatsApp voice note (PTT) with ptt: true.
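The convert and send steps can be wrapped in one function so an agent only has to call a single command. This is a sketch, not part of OpenClaw: the function name send_voice_note and its argument order are hypothetical, and it reuses only the ffmpeg and openclaw invocations shown above.

```shell
#!/bin/sh
# send_voice_note INPUT_AUDIO TARGET [OUT_OGG]
# Hypothetical helper: converts INPUT_AUDIO to OGG Opus, then sends it
# with the same CLI command used in the workaround.
send_voice_note() {
    in_file=$1
    target=$2
    # Default output path is the one used in this report; override as needed.
    out_file=${3:-/workspace/media/nota.ogg}

    # Fail loudly instead of silently if the input is missing or unreadable.
    if [ ! -r "$in_file" ]; then
        echo "input not readable: $in_file" >&2
        return 1
    fi

    # Same encoder settings as the workaround (mono, 48 kHz, 64k, voip profile).
    ffmpeg -y -i "$in_file" -c:a libopus -b:a 64k -ar 48000 -ac 1 \
        -application voip "$out_file" || return 1

    # Same send command as the workaround.
    openclaw message send --channel whatsapp --target "$target" --media "$out_file"
}
```

Usage: send_voice_note /tmp/tts.mp3 "+57XXXXXXXXXX". The function returns non-zero if the input is unreadable or the conversion fails, so a failure is visible to the caller.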

Notes

  • The MEDIA: token may work on other channels (I have not tested this, but the code path looks correct)
  • This may be specific to the WhatsApp auto-reply pipeline vs the direct send pipeline
  • The bug took ~3 hours to investigate before I found the workaround in a community PDF
  • Running as LaunchDaemon (not LaunchAgent) — unsure if this affects file system access

Thanks for the amazing tool 🦞
