Problem
When a user sends an image or voice message on WhatsApp/Telegram and the configured runtime doesn't support that media type (e.g., Codex/OpenCode for any media, Claude for audio/video), the system should inform the user rather than silently dropping or degrading the media.
Currently, unsupported media is either:
- Silently converted to a text description via applyMediaUnderstanding() (if API keys are configured)
- Silently dropped (if no fallback is configured)
Neither path tells the user what happened.
Proposed behavior
When runtime handles media natively
No user-visible change — media flows through to the runtime.
When middleware fallback handles media (STT, vision API)
Transparent to user — media is processed, result is included in prompt. Optionally note the conversion in a status/debug channel.
When no handler is available for the media type
Inform the user with a clear message. Examples:
- Image with no support: "⚠️ Your image was received but the current runtime (codex) doesn't support image input. The image content was not included in the conversation."
- Voice message with no STT: "⚠️ Voice message received but speech-to-text is not configured. Please send your message as text."
- Video with no support: "⚠️ Video attachments are not supported by the current runtime (claude). Only the text caption was included."
Where to communicate
The notification should be:
- Appended to the ReplyPayload as a prefix/suffix warning (visible in the agent's reply)
- OR sent as a separate message before the agent reply
- Configurable: agents.defaults.mediaFallbackNotice: "inline" | "separate" | "silent" (default: "inline")
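A minimal sketch of how the three modes could be applied, for illustration only. The `ReplyPayload` shape here is a hypothetical one-field stand-in (the real type likely carries more), and `applyMediaNotice` is an assumed helper name, not an existing function:

```typescript
type MediaFallbackNotice = "inline" | "separate" | "silent";

// Hypothetical minimal reply shape; the real ReplyPayload is assumed to have more fields.
interface ReplyPayload {
  text: string;
}

// Returns the messages to send, in order, for the given notice mode.
function applyMediaNotice(
  reply: ReplyPayload,
  notices: string[],
  mode: MediaFallbackNotice = "inline",
): ReplyPayload[] {
  // Nothing to report, or notices are suppressed: send the reply unchanged.
  if (mode === "silent" || notices.length === 0) return [reply];

  const warning = notices.join("\n");

  // "separate": the warning goes out as its own message before the agent reply.
  if (mode === "separate") return [{ text: warning }, reply];

  // "inline": the warning is prefixed to the agent reply text.
  return [{ ...reply, text: `${warning}\n\n${reply.text}` }];
}
```

This sketch prefixes the warning in inline mode; a suffix would work the same way with the two strings swapped.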
Implementation
ChannelBridge media routing decision point
After checking runtime.mediaCapabilities:
    // buckets for the three outcomes described above
    const nativeMedia = [];
    const fallbackMedia = [];
    const unsupportedMedia = [];
    for (const attachment of message.media) {
      // a missing capability list is treated as "not supported"
      const supported = runtime.mediaCapabilities?.acceptsInbound?.some(
        prefix => attachment.mimeType.startsWith(prefix)
      ) ?? false;
      if (supported) {
        // pass through to runtime
        nativeMedia.push(attachment);
      } else if (hasFallback(attachment.mimeType)) {
        // middleware handles (STT, vision API)
        fallbackMedia.push(attachment);
      } else {
        // no handler: notify user
        unsupportedMedia.push(attachment);
      }
    }
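The routing above relies on hasFallback(), which this issue does not define. One possible shape, assuming fallbacks are limited to speech-to-text for audio and a vision API for images, is sketched below. `FallbackConfig`, its fields, and `makeHasFallback` are hypothetical names introduced only for illustration:

```typescript
// Hypothetical config describing which middleware fallbacks are available.
interface FallbackConfig {
  speechToText: boolean; // assumed: true when an STT provider is configured
  vision: boolean;       // assumed: true when a vision API key is configured
}

// Builds a hasFallback(mimeType) predicate from the config.
function makeHasFallback(config: FallbackConfig): (mimeType: string) => boolean {
  return (mimeType: string): boolean =>
    (config.speechToText && mimeType.startsWith("audio/")) ||
    (config.vision && mimeType.startsWith("image/"));
}

// Usage: only STT is configured, so audio has a fallback and images do not.
const hasFallback = makeHasFallback({ speechToText: true, vision: false });
console.log(hasFallback("audio/ogg"), hasFallback("image/jpeg"));
```

Under these assumptions video never has a fallback, which matches the "only the text caption was included" example earlier in this issue.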
Notification format
Keep notifications concise and actionable:
- State what was received
- State why it couldn't be processed
- Suggest an alternative if available
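The three points above could be assembled by a small builder like the one below. This is a sketch under assumptions: `buildUnsupportedNotice`, `kindOf`, the label table, and the alternatives table are all hypothetical, and the wording only approximates the example messages given earlier:

```typescript
type MediaKind = "image" | "audio" | "video" | "other";

// Map a MIME type to a coarse media kind by its top-level type.
function kindOf(mimeType: string): MediaKind {
  if (mimeType.startsWith("image/")) return "image";
  if (mimeType.startsWith("audio/")) return "audio";
  if (mimeType.startsWith("video/")) return "video";
  return "other";
}

// User-facing names for each kind (assumed wording).
const LABELS: Record<MediaKind, string> = {
  image: "image",
  audio: "voice message",
  video: "video",
  other: "attachment",
};

// Suggested alternatives, only where one exists (assumed wording).
const ALTERNATIVES: Partial<Record<MediaKind, string>> = {
  audio: "Please send your message as text.",
};

// Builds a notice: what was received, why it was not processed, and an alternative if any.
function buildUnsupportedNotice(mimeType: string, runtimeName: string): string {
  const kind = kindOf(mimeType);
  const what = `⚠️ Your ${LABELS[kind]} was received`;
  const why = `but the current runtime (${runtimeName}) doesn't support ${
    kind === "other" ? mimeType : kind
  } input.`;
  const alternative = ALTERNATIVES[kind];
  return alternative ? `${what} ${why} ${alternative}` : `${what} ${why}`;
}

console.log(buildUnsupportedNotice("image/png", "codex"));
```

Keeping the labels and alternatives in tables makes it straightforward to localize the wording or add media kinds later.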
Depends on
Related