fix(agent): keep image tool results from poisoning text-only sessions #25925

Merged
teknium1 merged 1 commit into main from hermes/hermes-61c456c4
May 14, 2026

Conversation

@teknium1

Contributor

Salvage of #25903 onto current main, original commit by @helix4u preserved.

Summary

When a multimodal tool result (computer_use, vision_analyze, browser_vision) comes back on a text-only provider/model, store a text-only fallback into canonical history instead of the raw image_url content. For computer_use specifically, write a clear "switch to a vision-capable model" error JSON; other multimodal tools fall back to the result's text_summary.
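The write-site fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from this PR: the helper names `has_image_parts` and `history_content_for`, the result shape (a dict with a `content` list of parts and a `text_summary` string), the error wording, and the final placeholder string are all hypothetical.

```python
import json
from typing import Any

# Hypothetical wording; the PR only says the error tells the user to
# switch to a vision-capable model.
COMPUTER_USE_ERROR = (
    "computer_use returned an image, but the current model is text-only. "
    "Switch to a vision-capable model."
)


def has_image_parts(result: Any) -> bool:
    """Return True if an (assumed) result dict carries image_url content parts."""
    if not isinstance(result, dict):
        return False
    content = result.get("content")
    if not isinstance(content, list):
        return False
    return any(isinstance(p, dict) and p.get("type") == "image_url" for p in content)


def history_content_for(tool_name: str, result: Any, supports_vision: bool) -> Any:
    """Pick what gets written into canonical history for one tool result."""
    if supports_vision or not has_image_parts(result):
        return result  # nothing to sanitize
    if tool_name == "computer_use":
        # computer_use is unusable without vision, so store an explicit error.
        return json.dumps({"error": COMPUTER_USE_ERROR})
    # Other multimodal tools: fall back to the result's text summary.
    summary = result.get("text_summary")
    if isinstance(summary, str) and summary:
        return summary
    # Assumed last resort when no summary exists (not described in the PR).
    return "[image result omitted for text-only model]"
```

Because the substitution happens before the result is stored, later operations that replay history see only text and need no per-request filtering.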

Also adds DeepSeek's exact 400 wording (unknown variant `image_url`, expected `text`) to the existing adaptive image-rejection recovery list so an already-poisoned session can self-heal on the next retry.
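As a rough sketch of how such a recovery list might be consulted, assuming a simple substring match on the provider's error text: the names `IMAGE_REJECTION_PHRASES` and `is_image_rejection` are hypothetical, and only the DeepSeek phrase is taken from this PR. The real list contains other entries that are not shown here.

```python
# Only the DeepSeek phrase is from the PR description; the tuple name and
# the matching strategy are assumptions for illustration.
IMAGE_REJECTION_PHRASES = (
    "unknown variant `image_url`, expected `text`",
)


def is_image_rejection(status_code: int, error_message: str) -> bool:
    """Return True if a 400 response looks like a provider rejecting image input."""
    if status_code != 400:
        return False
    lowered = error_message.lower()
    return any(phrase in lowered for phrase in IMAGE_REJECTION_PHRASES)
```

A caller could use a positive match as the signal to strip image content from the outgoing messages and retry once, which is what lets an already-poisoned session recover.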

Root cause

_prepare_messages_for_non_vision_model runs on the legacy/codex_responses branches of _build_chat_kwargs but not on the provider-profile branch (registered providers like DeepSeek). Tracked broadly as #23733; this PR fixes it at the tool-result write site, which keeps history clean across /compress, /resume, /model.
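For contrast, here is a minimal sketch of the kind of request-time stripping that the name `_prepare_messages_for_non_vision_model` suggests. The real function's behavior is not shown in this PR, so `strip_images_for_request` and the message shape below are assumptions. The point is that a filter like this protects only the code paths that remember to call it, whereas sanitizing at the write site protects every path.

```python
from typing import Any


def strip_images_for_request(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of messages with image parts dropped (hypothetical helper).

    List-style content is collapsed to its text parts; plain-string content
    passes through unchanged. The input list and its dicts are not mutated.
    """
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            texts = [
                part.get("text", "")
                for part in content
                if isinstance(part, dict) and part.get("type") == "text"
            ]
            joined = "\n".join(t for t in texts if t)
            # Assumed placeholder when a message held nothing but images.
            msg = {**msg, "content": joined or "[image omitted]"}
        cleaned.append(msg)
    return cleaned
```

If one branch of the request builder skips this step, the raw `image_url` parts stored in history reach a text-only provider unchanged, which matches the failure described above.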

Validation

  • scripts/run_tests.sh tests/tools/test_computer_use.py::TestRunAgentMultimodalHelpers tests/run_agent/test_vision_aware_preprocessing.py — 19/19 pass

Credit

Closes #25903. Original commit ae3e79637 by @helix4u cherry-picked onto current main with authorship preserved.

Related: #23733, #23743, #23750, #24070.

@github-actions

Contributor

🔎 Lint report: hermes/hermes-61c456c4 vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8338 on HEAD, 8340 on base (✅ -2)

🆕 New issues (3):

| Rule | Count |
| --- | --- |
| invalid-argument-type | 3 |

First entries:
run_agent.py:13751: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
run_agent.py:13748: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:7480: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`

✅ Fixed issues (4):

| Rule | Count |
| --- | --- |
| invalid-argument-type | 4 |

First entries:
run_agent.py:7480: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:13711: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:11522: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> LiteralString, (key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> str]` cannot be called with key of type `Literal["content"]` on object of type `str`
run_agent.py:13714: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`

Unchanged: 4384 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

Comment thread run_agent.py Dismissed
Comment thread run_agent.py Dismissed

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P1: High; major feature broken, no workaround
  • tool/vision: Vision analysis and image generation
  • type/bug: Something isn't working

4 participants