WhatsApp media delivery fails for generated images, videos, and tool-created media outputs #19105

@hedirman

Summary

WhatsApp media delivery can fail or behave inconsistently even when Hermes successfully receives or generates the media file.

Observed workflows:

  1. Sending a video/photo to Hermes over WhatsApp.
  2. Asking Hermes to generate an image.
  3. Asking Hermes to generate media output such as an ASCII-video MP4.

The failures appear across several related paths:

  • Inbound WhatsApp videos can be surfaced to the model as placeholders instead of real cached file paths.
  • image_generate returns JSON with an image path, but the gateway previously only auto-delivered tool results that already contained MEDIA:/path.
  • Prior image-generation JSON results in session history can be rediscovered and re-sent, causing WhatsApp to send all previously generated images instead of only the latest.
  • send_message supports MEDIA:/path native delivery for several platforms, but not WhatsApp.
  • The WhatsApp bridge can send generated images in a way that appears as a file/document instead of a native image bubble.
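The second path above (JSON results that never become MEDIA:/path) could be handled roughly as follows. This is only a sketch: the function name normalize_tool_result, the "success" flag, and the candidate key names in PATH_KEYS are assumptions for illustration, since the actual image_generate JSON schema is not shown in this issue.

```python
import json
import os

# Hypothetical key names; the real image_generate payload may use others.
PATH_KEYS = ("image_path", "path", "file_path")


def normalize_tool_result(result: str) -> str:
    """Return 'MEDIA:/path' for a successful JSON result with a local path.

    Results that already carry a MEDIA: tag, or that are not recognizable
    JSON payloads, are returned unchanged.
    """
    if "MEDIA:" in result:
        return result  # preserve existing MEDIA:/path behavior
    try:
        payload = json.loads(result)
    except ValueError:
        return result  # not JSON, leave as-is
    if not isinstance(payload, dict) or payload.get("success") is False:
        return result  # not an object, or the tool reported failure
    for key in PATH_KEYS:
        value = payload.get(key)
        if isinstance(value, str) and os.path.isabs(value):
            return f"MEDIA:{value}"
    return result
```

A real gateway would probably also confirm that the file exists before emitting the tag; that check is left out here to keep the sketch short.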

Observed Behavior

  • A WhatsApp video/photo can arrive as [video received] or similar without a usable local path.
  • image_generate saves a PNG successfully, but WhatsApp delivery either fails or the reply contains only the file path as text.
  • Subsequent image generations may send multiple old generated images from conversation history.
  • A tool-created MP4, such as an ASCII-video output, may be described in prose instead of attached as native media.
  • Generated images may arrive as file/document attachments instead of inline image messages.
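To illustrate the first bullet, the difference between a bare placeholder and a model-facing message that carries a usable path might look like the sketch below. The function name describe_inbound_media and the exact bracketed text format are hypothetical; only the [video received] placeholder comes from the observed behavior.

```python
from typing import Optional


def describe_inbound_media(kind: str, cached_path: Optional[str]) -> str:
    """Build the model-facing text for an inbound attachment.

    When a local cached path is known it is included, so the agent can
    pass the file to tools; otherwise only the bare placeholder remains.
    """
    if cached_path:
        return f"[{kind} received: {cached_path}]"
    return f"[{kind} received]"
```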

Expected Behavior

  • Inbound WhatsApp video/photo attachments should expose their real local cached paths to the agent.
  • Successful generated media outputs should be delivered as native WhatsApp attachments when Hermes has a real local path.
  • Only media generated in the current turn should be delivered, not older media paths from history.
  • send_message should support MEDIA:/path for WhatsApp using the live WhatsApp bridge adapter.
  • PNG/JPEG/WebP/GIF outputs should appear as native WhatsApp images, not documents.
  • MP4 outputs, including ASCII-video outputs, should route through WhatsApp as native video attachments when the file exists.
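The image and video expectations above amount to choosing a WhatsApp payload kind from the file type. A minimal sketch of that choice, assuming extension-based detection and a hypothetical helper named whatsapp_media_kind (the bridge's real selection logic is not shown in this issue):

```python
from pathlib import Path

# Extensions taken from the expected behavior listed above.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp", ".gif"}
VIDEO_EXTS = {".mp4"}


def whatsapp_media_kind(path: str) -> str:
    """Classify a local file as 'image', 'video', or 'document'."""
    ext = Path(path).suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    return "document"  # fallback for anything not listed above
```

Falling back to "document" keeps unknown file types deliverable while reserving native bubbles for the formats named in this issue.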

Fix Direction

  • Normalize successful image-generation tool JSON results into MEDIA:/path at the gateway layer.
  • Extract historical media paths from both literal MEDIA:/path tags and image-generation JSON payloads to prevent re-sending old images.
  • Preserve existing MEDIA:/path behavior for tools that already emit it, including generated MP4 outputs.
  • Route WhatsApp send_message media through the live WhatsApp adapter.
  • Use native WhatsApp image/video/audio payloads in the bridge.
  • Include inbound video local paths in model-facing message text.
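The second fix direction (using historical paths to avoid re-sending old images) could be sketched as below. The helper names extract_media_paths and media_to_deliver, the JSON key names, and the representation of history as a list of strings are all assumptions made for illustration, not the gateway's actual data model.

```python
import json
import re

# Matches literal MEDIA:/path tags, stopping at whitespace or quotes.
MEDIA_TAG = re.compile(r"MEDIA:([^\s\"']+)")
# Hypothetical key names for image-generation JSON payloads.
PATH_KEYS = ("image_path", "path", "file_path")


def extract_media_paths(text: str) -> list[str]:
    """Collect paths from MEDIA: tags and from a JSON payload, if any."""
    paths = MEDIA_TAG.findall(text)
    try:
        payload = json.loads(text)
    except ValueError:
        payload = None
    if isinstance(payload, dict):
        for key in PATH_KEYS:
            value = payload.get(key)
            if isinstance(value, str):
                paths.append(value)
    return paths


def media_to_deliver(history: list[str], current_turn: list[str]) -> list[str]:
    """Return current-turn media paths that never appeared in history."""
    seen = {p for msg in history for p in extract_media_paths(msg)}
    fresh = []
    for msg in current_turn:
        for path in extract_media_paths(msg):
            if path not in seen:
                seen.add(path)  # also de-duplicates within the turn
                fresh.append(path)
    return fresh
```

For example, with a history containing an earlier image JSON result and an earlier MEDIA: tag, only the newly generated path is returned for delivery:

```python
history = ['{"success": true, "image_path": "/tmp/old.png"}', "MEDIA:/tmp/old.mp4"]
current = ['{"success": true, "image_path": "/tmp/new.png"}']
print(media_to_deliver(history, current))
```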

Labels

  • P1: High (major feature broken, no workaround)
  • comp/gateway: Gateway runner, session dispatch, delivery
  • platform/whatsapp: WhatsApp Business adapter
  • tool/vision: Vision analysis and image generation
  • type/bug: Something isn't working
