Fix WhatsApp native delivery for generated media#19106

Closed
hedirman wants to merge 2 commits into
NousResearch:mainfrom
hedirman:fix/whatsapp-generated-media-delivery
hedirman commented May 3, 2026

Summary

Fixes #19105.

Fixes several WhatsApp media delivery paths:

  • Converts successful image_generate JSON results into MEDIA:/path attachments at the gateway layer.
  • Prevents old generated images from being re-sent by extracting history media paths from both MEDIA: tags and JSON tool payloads.
  • Adds native WhatsApp MEDIA: support to send_message through the live WhatsApp bridge adapter.
  • Sends WhatsApp images/videos/audio through native Baileys file URL payloads instead of treating them like document-style uploads.
  • Adds inbound WhatsApp video paths to the model-facing message so tools can process the real cached file.
  • Adds regression coverage for image JSON extraction, history dedupe, WhatsApp media routing, and inbound video path visibility.
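
As a rough illustration of the first bullet, the sketch below turns a successful image JSON result into a MEDIA: tag. It is not the code from this PR: the function name `media_tag_from_tool_result` is hypothetical, and the `success` and `error` keys are assumptions about what a failed result might carry. Only the `image` field and the MEDIA:/path form come from the description.

```python
import json
from typing import Optional


def media_tag_from_tool_result(raw_result: str) -> Optional[str]:
    """Return a MEDIA:<path> tag for a successful image result, else None.

    Hypothetical helper; the "success" and "error" keys are assumed, not
    taken from the real tool schema.
    """
    try:
        payload = json.loads(raw_result)
    except (TypeError, ValueError):
        return None  # not JSON, so there is nothing to convert
    if not isinstance(payload, dict):
        return None
    if payload.get("success") is False or payload.get("error"):
        return None  # treat flagged failures as non-deliverable
    image = payload.get("image")
    if not isinstance(image, str) or not image.startswith("/"):
        return None  # only absolute local paths become attachments
    return f"MEDIA:{image}"
```

Doing the conversion at the gateway means delivery no longer relies on the model copying the path into its reply verbatim.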

Why

Generated media could succeed locally but fail or degrade at WhatsApp delivery.

For image generation, providers return JSON with an image field rather than a MEDIA:/path tag, so the gateway depended on the model restating the exact path correctly. Once JSON extraction was added, old image JSON in history also had to be treated as already delivered; otherwise Hermes could re-send every previously generated image.
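
The dedupe idea can be sketched as follows, assuming history is a list of dicts with a string `content` field. Both function names are hypothetical and this is not the `gateway/run.py` implementation; it only shows paths being collected from MEDIA: tags and from JSON `image` fields, then excluded from the current turn.

```python
import json
import re
from typing import Iterable, List, Set

# Matches MEDIA:<path> up to the next whitespace character.
MEDIA_TAG = re.compile(r"MEDIA:(\S+)")


def collect_history_media_paths(messages: Iterable[dict]) -> Set[str]:
    """Gather media paths that earlier turns already referenced."""
    seen: Set[str] = set()
    for message in messages:
        content = message.get("content")
        if not isinstance(content, str):
            continue
        # Source 1: explicit MEDIA: tags in assistant or tool text.
        seen.update(MEDIA_TAG.findall(content))
        # Source 2: JSON tool payloads that carry an "image" field.
        try:
            payload = json.loads(content)
        except ValueError:
            continue
        if isinstance(payload, dict) and isinstance(payload.get("image"), str):
            seen.add(payload["image"])
    return seen


def new_media_only(candidates: List[str], history: Iterable[dict]) -> List[str]:
    """Keep only candidate paths that no earlier turn referenced."""
    already_sent = collect_history_media_paths(history)
    return [path for path in candidates if path not in already_sent]
```

With a history containing `{"image": "/tmp/old.png"}` as a tool result, passing `["/tmp/old.png", "/tmp/new.png"]` as candidates leaves only `/tmp/new.png` to deliver.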

For generated MP4 outputs, such as ASCII-video renders, delivery depends on MEDIA:/path routing correctly through WhatsApp.
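
One way to picture the routing step is a classifier that maps a local path to a media kind before choosing a native send path. This is a sketch under assumptions: `media_kind` is a hypothetical name, and guessing from the file extension with the standard `mimetypes` module is a choice made here for illustration, not something the PR states.

```python
import mimetypes


def media_kind(path: str) -> str:
    """Classify a file as "image", "video", "audio", or "document".

    Hypothetical helper: the kind is guessed from the file extension only,
    so a mislabeled file would be routed by its name, not its contents.
    """
    mime, _encoding = mimetypes.guess_type(path)
    if mime is None:
        return "document"  # unknown extension, fall back to a plain upload
    top_level = mime.split("/", 1)[0]
    if top_level in {"image", "video", "audio"}:
        return top_level
    return "document"
```

Under this sketch an `.mp4` render classifies as `video`, so it would take the native video path rather than a document-style upload.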

For inbound WhatsApp videos, the agent needs the real cached path, not just a placeholder.

Tests

PYTHONDONTWRITEBYTECODE=1 python -m pytest \
  tests/gateway/test_media_extraction.py \
  tests/gateway/test_native_image_buffer_isolation.py \
  tests/tools/test_send_message_tool.py::TestSendToPlatformWhatsapp -q

Result:

12 passed

Also ran:

PYTHONDONTWRITEBYTECODE=1 python -m pytest tests/tools/test_send_message_tool.py -q

Result:

96 passed

Also checked syntax:

node --check scripts/whatsapp-bridge/bridge.js
python -c "import ast, pathlib; ast.parse(pathlib.Path('gateway/run.py').read_text()); ast.parse(pathlib.Path('tools/send_message_tool.py').read_text())"

Notes

This fixes the gateway and WhatsApp delivery side. Tools still need to create real output files and emit or return usable local paths. For ASCII-video specifically, an MP4 output should now be sent successfully when the skill/tool produces an actual file and returns it as MEDIA:/path/to/output.mp4.
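
On the tool side, a minimal guard might look like the sketch below, which emits a MEDIA: line only when a non-empty file really exists. The function name `media_line_for_output` is hypothetical, and treating a zero-byte file as a failure is an assumption made for this example.

```python
from pathlib import Path


def media_line_for_output(output_path: str) -> str:
    """Return "MEDIA:<absolute path>" if the output file is usable.

    Hypothetical helper: raises FileNotFoundError when the file is missing
    or empty, so a tool cannot report a path that was never written.
    """
    path = Path(output_path).expanduser().resolve()
    if not path.is_file() or path.stat().st_size == 0:
        raise FileNotFoundError(f"no usable output file at {path}")
    return f"MEDIA:{path}"
```

Resolving to an absolute path keeps the tag valid even if the gateway runs with a different working directory than the tool.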

alt-glitch added the type/bug, platform/whatsapp, comp/gateway, tool/vision, and P1 labels May 3, 2026
@alt-glitch

Collaborator

Related to #19105 — this PR fixes the issue.

hedirman force-pushed the fix/whatsapp-generated-media-delivery branch from b262bdd to a099900 May 3, 2026 23:01
teknium1 commented Jun 9, 2026

Contributor

Superseded by independent work on main. Since this PR was filed (May 3), the WhatsApp native-media subsystem was rebuilt over the intervening commits — current main already has:

  • WhatsApp send_message MEDIA: routing (whatsapp is in _PHONE_PLATFORMS in tools/send_message_tool.py)
  • Native image/video/audio sends via the bridge /send-media endpoint + adapter send_image/send_video/send_audio (gateway/platforms/whatsapp.py)
  • Inbound video/photo caching that surfaces real local paths to the agent
  • Turn-scoped history dedupe so old generated media isn't re-sent (_history_media_paths exclusion in gateway/run.py)

The one path your PR raised that is genuinely still open is deterministic auto-delivery of image_generate output (it returns JSON {"image": ...} with no MEDIA: tag, so delivery currently depends on the model restating the path). We're tracking that as a focused follow-up.

Thanks for the thorough write-up in #19105 — the issue's path-by-path breakdown is what made this easy to verify. Closing in favor of the work already on main; your problem report drove it.

Development

Successfully merging this pull request may close these issues.

WhatsApp media delivery fails for generated images, videos, and tool-created media outputs

3 participants