Bug Description
/v1/chat/completions accepts OpenAI-style multimodal content with image_url parts. When the active main model does not support vision (e.g. deepseek-v4-pro), AIAgent._prepare_messages_for_non_vision_model (run_agent.py:8971) is supposed to replace each image part with a cached vision_analyze text description so the request doesn't fail at the provider.
That fallback runs on the legacy (unregistered-provider) branch of _build_chat_kwargs but not on the provider-profile branch — which is the path every registered provider takes (opencode-zen, opencode-go, deepseek, kimi, openrouter, gemini, anthropic, etc.). The profile branch forwards image_url parts unchanged to the upstream provider, which fails with HTTP 400 on text-only models.
The codex_responses path (run_agent.py:9267) does still call the fallback, and the gateway-adapter path (gateway/run.py → _decide_image_input_mode → _enrich_message_with_vision) also handles it correctly — so the inconsistency is specifically on the chat_completions profile branch.
Steps to Reproduce
- Configure Hermes with a non-vision main model on a registered provider, e.g. provider=opencode-go, model=deepseek-v4-pro.
- POST /v1/chat/completions with a user message in the OpenAI multimodal array form:
{
  "model": "hermes-agent",
  "messages": [
    {"role": "user", "content": [
      {"type": "text", "text": "what's in this image?"},
      {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
    ]}
  ]
}
- The agent reaches run_conversation → _build_chat_kwargs → profile branch (run_agent.py:9354), then ships the image_url part to DeepSeek via opencode-go.
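The request in the steps above can be built with a short script. This is a minimal sketch using only the standard library; BASE_URL and the helper names build_repro_request and send are assumptions for illustration, not part of Hermes, so adjust the address to wherever your API server listens.

```python
import base64
import json
import urllib.error
import urllib.request

# Assumption: address of the Hermes API server; change it for your deployment.
BASE_URL = "http://127.0.0.1:8000"


def build_repro_request(image_bytes, base_url=BASE_URL):
    """Build the multimodal POST from the steps above without sending it."""
    data_url = "data:image/jpeg;base64," + base64.b64encode(image_bytes).decode("ascii")
    payload = {
        "model": "hermes-agent",
        "messages": [
            {"role": "user", "content": [
                {"type": "text", "text": "what's in this image?"},
                {"type": "image_url", "image_url": {"url": data_url}},
            ]}
        ],
    }
    return urllib.request.Request(
        base_url + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req):
    """Send the request and return (status, body), including error statuses."""
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; surface the status instead of a traceback.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling send(build_repro_request(open("photo.jpg", "rb").read())) against an affected configuration should surface the 502 described under Actual Behavior.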
Expected Behavior
Same as the legacy branch and the gateway-adapter path: image parts are replaced with vision_analyze text inline, the provider sees text-only content, and the turn succeeds. Whether a provider is registered should not change the user-visible behavior of /v1/chat/completions.
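One way to pin this expectation down in a regression test is to assert that the messages handed to the provider contain no image_url parts. The helper below is a hypothetical sketch, not existing Hermes code; it only assumes the OpenAI-style message shape shown in the reproduction steps.

```python
def find_image_parts(messages):
    """Return (message_index, part_index) for every image_url part."""
    hits = []
    for mi, message in enumerate(messages):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content cannot carry image parts
        for pi, part in enumerate(content):
            if isinstance(part, dict) and part.get("type") == "image_url":
                hits.append((mi, pi))
    return hits
```

A test for a non-vision model could then check that find_image_parts(kwargs["messages"]) is empty for the kwargs produced by either branch.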
Actual Behavior
Provider returns HTTP 400:
run_agent: conversation turn: ... model=deepseek-v4-pro provider=opencode-go platform=api_server history=1
msg='[1 image] <recent_messages> ... </recent_messages> <cur...'
run_agent: Streaming failed before delivery: Error code: 400 -
{'error': {'message': "Error from provider (DeepSeek): Failed to deserialize the
JSON body into the target type: messages[2]: unknown variant `image_url`,
expected `text` at line 1 column 7586", ...}}
aiohttp.access: ... "POST /v1/chat/completions HTTP/1.1" 502 802 ...
The retry loop classifies the 400 as a non-retryable BadRequestError, so the API server answers the request with HTTP 502.
Root Cause Analysis
_prepare_messages_for_non_vision_model (run_agent.py:8971, introduced by PR #16506) is called in _build_chat_kwargs at run_agent.py:9389, inside the legacy branch:
# Legacy flag path — reached only when get_provider_profile() returns None.
...
# Strip image parts for non-vision models (no-op when vision-capable).
_msgs_for_chat = self._prepare_messages_for_non_vision_model(api_messages)
return _ct.build_kwargs(model=..., messages=_msgs_for_chat, ...)
The profile branch above it (run_agent.py:9354), added by the later provider-modules refactor ("transport single-path"), has no equivalent call:
if _profile:
    ...
    return _ct.build_kwargs(
        model=self.model,
        messages=api_messages,  # ← image_url parts pass through unchanged
        ...
        provider_profile=_profile,
        ...
    )
Downstream, agent/transports/chat_completions.py::_build_kwargs_from_profile (line 393) calls profile.prepare_messages(sanitized) (line 402), but the default ProviderProfile.prepare_messages (providers/base.py:80) is pass-through and no bundled profile overrides it for vision handling. So there is no substitute path.
Because get_provider_profile() now returns a profile for every major provider, the non-vision fallback is effectively unreachable for real users on /v1/chat/completions.
Proposed Fix
Add the same call into the profile branch:
  if _profile:
      _ephemeral_out = getattr(self, "_ephemeral_max_output_tokens", None)
      if _ephemeral_out is not None:
          self._ephemeral_max_output_tokens = None
+     # Strip image parts for non-vision models (no-op when vision-capable).
+     api_messages = self._prepare_messages_for_non_vision_model(api_messages)
+
      return _ct.build_kwargs(
          model=self.model,
          messages=api_messages,
          ...
          provider_profile=_profile,
          ...
      )
_prepare_messages_for_non_vision_model short-circuits when no message contains image parts and again when _model_supports_vision() returns True, so the change is free for vision-capable / no-image cases.
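The short-circuit behavior described above can be illustrated with a simplified stand-in. This is not the code in run_agent.py: the function name, the supports_vision flag, and the describe callback (standing in for the cached vision_analyze lookup) are all assumptions made for the sketch.

```python
def has_image_parts(messages):
    """True if any message carries an OpenAI-style image_url content part."""
    return any(
        isinstance(m.get("content"), list)
        and any(isinstance(p, dict) and p.get("type") == "image_url" for p in m["content"])
        for m in messages
    )


def prepare_messages_for_non_vision(messages, supports_vision, describe):
    """Hypothetical stand-in for the real fallback.

    describe(url) stands in for the cached vision_analyze description.
    """
    if not has_image_parts(messages):
        return messages  # first short-circuit: nothing to replace
    if supports_vision:
        return messages  # second short-circuit: model accepts images
    out = []
    for m in messages:
        content = m.get("content")
        if not isinstance(content, list):
            out.append(m)
            continue
        parts = []
        for p in content:
            if isinstance(p, dict) and p.get("type") == "image_url":
                # Swap the image part for a text part holding its description.
                parts.append({"type": "text", "text": describe(p["image_url"]["url"])})
            else:
                parts.append(p)
        out.append({**m, "content": parts})
    return out
```

In both short-circuit cases the input list is returned as-is, which is why adding the call to the profile branch should cost nothing for vision-capable models or image-free conversations.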
Environment
- Verified on origin/main commit 64145a199 and v0.13.0 (2026-05-07). Profile-branch refactor and _prepare_messages_for_non_vision_model both landed before v0.13.0; no fix in the 94 commits between v0.13.0 and current main.
- OS: Debian GNU/Linux 13 (trixie)