Summary
WhatsApp media delivery can fail or behave inconsistently even when Hermes successfully receives or generates the media file.
Observed workflows:
- Sending a video/photo to Hermes over WhatsApp.
- Asking Hermes to generate an image.
- Asking Hermes to generate media output such as an ASCII-video MP4.
The failures appear across several related paths:
- Inbound WhatsApp videos can be surfaced to the model as placeholders instead of real cached file paths.
- image_generate returns JSON with an image path, but the gateway previously only auto-delivered tool results that already contained MEDIA:/path.
- Prior image-generation JSON results in session history can be rediscovered and re-sent, causing WhatsApp to send all previously generated images instead of only the latest.
- send_message supports MEDIA:/path native delivery for several platforms, but not WhatsApp.
- The WhatsApp bridge can send generated images in a way that appears as a file/document instead of a native image bubble.
Observed Behavior
- A WhatsApp video/photo can arrive as [video received] or similar without a usable local path.
- image_generate saves a PNG successfully, but WhatsApp delivery fails or only returns the path.
- Subsequent image generations may send multiple old generated images from conversation history.
- A tool-created MP4, such as an ASCII-video output, may be described in prose instead of attached as native media.
- Generated images may arrive as file/document attachments instead of inline image messages.
Expected Behavior
- Inbound WhatsApp video/photo attachments should expose their real local cached paths to the agent.
- Successful generated media outputs should be delivered as native WhatsApp attachments when Hermes has a real local path.
- Only media generated in the current turn should be delivered, not older media paths from history.
- send_message should support MEDIA:/path for WhatsApp using the live WhatsApp bridge adapter.
- PNG/JPEG/WebP/GIF outputs should appear as native WhatsApp images, not documents.
- MP4 outputs, including ASCII-video outputs, should route through WhatsApp as native video attachments when the file exists.
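The routing expectations above could be captured in a small classifier. This is only a sketch: the function name `whatsapp_payload_kind`, the `NATIVE_KINDS` table, and the fallback to a document attachment for unknown extensions are illustrative assumptions, not the actual Hermes or bridge code.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table, following the expected behavior listed above:
# common image formats go out as images, MP4 goes out as video.
NATIVE_KINDS = {
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".webp": "image",
    ".gif": "image",
    ".mp4": "video",
}


def whatsapp_payload_kind(path: str) -> Optional[str]:
    """Pick a payload kind for a local file, or None if the file is missing."""
    candidate = Path(path)
    if not candidate.is_file():
        # Only deliver when Hermes has a real local file.
        return None
    # Assumption: anything not in the table falls back to a document attachment.
    return NATIVE_KINDS.get(candidate.suffix.lower(), "document")
```

Returning None for a missing file lets the caller fall back to a text reply instead of attempting an upload that cannot succeed.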
Fix Direction
- Normalize successful image-generation tool JSON results into MEDIA:/path at the gateway layer.
- Extract historical media paths from both literal MEDIA:/path tags and image-generation JSON payloads to prevent re-sending old images.
- Preserve existing MEDIA:/path behavior for tools that already emit it, including generated MP4 outputs.
- Route WhatsApp send_message media through the live WhatsApp adapter.
- Use native WhatsApp image/video/audio payloads in the bridge.
- Include inbound video local paths in model-facing message text.
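The first three items above could look roughly like the following. It is a sketch under stated assumptions: the helper names are hypothetical, the JSON shape (an `image` key holding the path, plus an optional `success` flag) is assumed from the description of image_generate, and tool results are modeled as plain strings rather than the gateway's real message objects.

```python
import json
import re
from typing import Iterable, List, Optional, Set

# Matches literal MEDIA:/path tags; stops at whitespace or quotes.
MEDIA_TAG = re.compile(r"MEDIA:([^\s\"']+)")


def image_path_from_json(text: str) -> Optional[str]:
    """Return the image path from an image-generation JSON result, if any."""
    try:
        payload = json.loads(text)
    except (TypeError, ValueError):
        return None
    if not isinstance(payload, dict):
        return None
    # Assumed shape: {"success": true, "image": "/path/to/file.png"}
    if payload.get("success") is False:
        return None
    path = payload.get("image")
    return path if isinstance(path, str) and path else None


def media_paths(text: str) -> List[str]:
    """Collect paths from both literal MEDIA: tags and image JSON payloads."""
    paths = MEDIA_TAG.findall(text)
    json_path = image_path_from_json(text)
    if json_path and json_path not in paths:
        paths.append(json_path)
    return paths


def normalize_tool_result(text: str) -> str:
    """Rewrite a successful image JSON result as MEDIA:/path; else pass through."""
    path = image_path_from_json(text)
    return f"MEDIA:{path}" if path else text


def paths_to_deliver(history: Iterable[str], current_turn: Iterable[str]) -> List[str]:
    """Return current-turn media paths that never appeared in earlier history."""
    seen: Set[str] = set()
    for entry in history:
        seen.update(media_paths(entry))
    fresh: List[str] = []
    for entry in current_turn:
        for path in media_paths(entry):
            if path not in seen:
                seen.add(path)
                fresh.append(path)
    return fresh
```

Scanning history with the same extractor used for the current turn is what keeps older image_generate JSON results from being rediscovered and re-sent, while tools that already emit MEDIA:/path pass through unchanged.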