Bug Description
This issue was reported by Hermes Agent on behalf of @oxngon.
When sending `/btw <prompt>` (or `/background <prompt>`) with image attachments on Discord (and likely other platforms), the images are cached to disk by the adapter but never forwarded to the spawned background agent. The agent receives only the text prompt and has no access to the attached images.
Steps to Reproduce
- Send a message on Discord: `/btw Which one should I buy?` with 1-2 images attached
- The background agent responds without any knowledge of the images
- The images were cached (visible in logs) but never passed to the agent
Expected Behavior
The background agent should receive and process image attachments, just like a regular message does.
Actual Behavior
The background agent only receives the text prompt. Images are silently dropped.
Affected Component
Gateway (Telegram/Discord/Slack/WhatsApp)
Messaging Platform (if gateway-related)
Discord
Debug Report
Debug report: https://paste.rs/g9Idk
agent.log: https://paste.rs/w4GOl
gateway.log: https://paste.rs/qTEIe
Operating System
Ubuntu 24.04
Python Version
3.11.15
Hermes Version
v0.13.0 (2026.5.7)
Additional Logs / Traceback (optional)
### Notes
- The `/btw` command is an alias for `/background` — both are affected
- Regular (non-command) text+image messages work correctly — the image processing in `handle_message` (line ~6731) handles them properly
- The fix is backward-compatible: background tasks without images continue to work identically
Root Cause Analysis (optional)
Root Cause
In `gateway/run.py`, `_handle_background_command` (line ~10164) extracts only the text via `event.get_command_args().strip()` and passes it to `_run_background_task`. The `event.media_urls` and `event.media_types` fields are never forwarded.
The `_run_background_task` method then calls `agent.run_conversation(user_message=prompt)` with the bare text prompt, without any image context.
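The drop can be modelled in a few lines. This is a simplified sketch, not the actual code in `gateway/run.py`: `FakeEvent`, the `captured` dict, the cache paths, and the function bodies are stand-ins invented for illustration. Only the names `get_command_args`, `media_urls`, `media_types`, `_handle_background_command`, and `_run_background_task` come from the analysis above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEvent:
    """Hypothetical stand-in for the gateway's message event."""
    text: str
    media_urls: list[str] = field(default_factory=list)
    media_types: list[str] = field(default_factory=list)

    def get_command_args(self) -> str:
        # Everything after the command name: "/btw <prompt>" -> "<prompt>"
        parts = self.text.split(maxsplit=1)
        return parts[1] if len(parts) > 1 else ""


captured = {}  # records what the background task actually receives


def _run_background_task(prompt: str) -> None:
    # Simplified: the real method goes on to call
    # agent.run_conversation(user_message=prompt)
    captured["prompt"] = prompt


def _handle_background_command(event: FakeEvent) -> None:
    prompt = event.get_command_args().strip()
    # event.media_urls and event.media_types are never read here,
    # so the attachments are lost at this point.
    _run_background_task(prompt)


event = FakeEvent(
    text="/btw Which one should I buy?",
    media_urls=["/tmp/cache/a.png", "/tmp/cache/b.png"],  # made-up cache paths
    media_types=["image/png", "image/png"],
)
_handle_background_command(event)
print(captured)
```

The event still carries both cached images when the handler runs, but nothing downstream of `_handle_background_command` ever sees them.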
Proposed Fix (optional)
Proposed Fix
- Forward `event.media_urls` and `event.media_types` from `_handle_background_command` to `_run_background_task`
- In `_run_background_task`, before calling `agent.run_conversation()`, enrich the prompt with vision descriptions using the existing `_enrich_message_with_vision()` method (the same path used for regular messages)
This reuses the existing vision infrastructure and requires minimal code changes.
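A rough sketch of the two steps, under several assumptions: `GatewaySketch`, `FakeEvent`, and `FakeAgent` are hypothetical stand-ins, the `(text, image_paths)` signature given to `_enrich_message_with_vision` is a guess rather than the real one, `media_urls` and `media_types` are assumed to be parallel lists, and everything is synchronous for brevity even though the real methods may be coroutines.

```python
from dataclasses import dataclass, field


@dataclass
class FakeEvent:
    """Hypothetical stand-in for the gateway's message event."""
    text: str
    media_urls: list[str] = field(default_factory=list)
    media_types: list[str] = field(default_factory=list)

    def get_command_args(self) -> str:
        parts = self.text.split(maxsplit=1)
        return parts[1] if len(parts) > 1 else ""


class FakeAgent:
    """Hypothetical agent that records the messages it is given."""
    def __init__(self) -> None:
        self.seen: list[str] = []

    def run_conversation(self, user_message: str) -> None:
        self.seen.append(user_message)


class GatewaySketch:
    def __init__(self, agent: FakeAgent) -> None:
        self.agent = agent

    def _enrich_message_with_vision(self, text: str, image_paths: list[str]) -> str:
        # Stub with an ASSUMED signature; the real method presumably
        # produces actual vision descriptions of each image.
        notes = "\n".join(f"[image attached: {p}]" for p in image_paths)
        return f"{notes}\n\n{text}"

    def _run_background_task(self, prompt, media_urls=None, media_types=None):
        # Step 2: enrich the prompt only when image attachments are present,
        # so background tasks without images behave exactly as before.
        pairs = zip(media_urls or [], media_types or [])
        image_paths = [url for url, mtype in pairs if mtype.startswith("image/")]
        if image_paths:
            prompt = self._enrich_message_with_vision(prompt, image_paths)
        self.agent.run_conversation(user_message=prompt)

    def _handle_background_command(self, event: FakeEvent) -> None:
        prompt = event.get_command_args().strip()
        # Step 1: forward the media fields instead of dropping them.
        self._run_background_task(
            prompt,
            media_urls=event.media_urls,
            media_types=event.media_types,
        )


agent = FakeAgent()
gateway = GatewaySketch(agent)
gateway._handle_background_command(
    FakeEvent("/btw Which one should I buy?", ["/tmp/cache/a.png"], ["image/png"])
)
gateway._handle_background_command(FakeEvent("/background summarize the thread"))
print(agent.seen)
```

Defaulting the new parameters to `None` keeps any existing callers of `_run_background_task` working without modification.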