fix(agent): keep image tool results from poisoning text-only sessions #25925
Merged
🔎 Lint report:
| Rule | Count |
|---|---|
| invalid-argument-type | 3 |
First entries
run_agent.py:13751: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown, Unknown] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
run_agent.py:13748: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
run_agent.py:7480: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown, Unknown] | Any | ... omitted 3 union elements`
✅ Fixed issues (4):
| Rule | Count |
|---|---|
| invalid-argument-type | 4 |
First entries
run_agent.py:7480: [invalid-argument-type] invalid-argument-type: Argument to function `build_anthropic_client` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:13711: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
run_agent.py:11522: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> LiteralString, (key: SupportsIndex | slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> str]` cannot be called with key of type `Literal["content"]` on object of type `str`
run_agent.py:13714: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
Unchanged: 4384 pre-existing issues carried over.
Diagnostics are surfaced as warnings — this check never fails the build.
This was referenced May 15, 2026
Salvage of #25903 onto current main, original commit by @helix4u preserved.
Summary
When a multimodal tool result (`computer_use`, `vision_analyze`, `browser_vision`) comes back on a text-only provider/model, store a text-only fallback into canonical history instead of the raw `image_url` content. For `computer_use` specifically, write a clear "switch to a vision-capable model" error JSON; other multimodal tools fall back to the result's `text_summary`.

Also adds DeepSeek's exact 400 wording (``unknown variant `image_url`, expected `text` ``) to the existing adaptive image-rejection recovery list so an already-poisoned session can self-heal on the next retry.

Root cause
`_prepare_messages_for_non_vision_model` runs on the legacy/codex_responses branches of `_build_chat_kwargs` but not on the provider-profile branch (registered providers like DeepSeek). Tracked broadly as #23733; this PR fixes it at the tool-result write site, which keeps history clean across `/compress`, `/resume`, `/model`.

Validation
`scripts/run_tests.sh tests/tools/test_computer_use.py::TestRunAgentMultimodalHelpers tests/run_agent/test_vision_aware_preprocessing.py`: 19/19 pass

Credit
Closes #25903. Original commit `ae3e79637` by @helix4u cherry-picked onto current main with authorship preserved.

Related: #23733, #23743, #23750, #24070.
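
For readers skimming the description, here is a minimal sketch of the two behaviours it describes: choosing a text-only fallback at the tool-result write site, and recognising a provider's image-rejection error. It is not the code from this PR. The function names, the result shape (a dict with a `content` list of parts and a `text_summary` string), the error JSON fields, and the placeholder string are all assumptions made for illustration; only the tool names and the DeepSeek error wording come from the description above.

```python
import json
from typing import Any

# Hypothetical constant; the tool name itself is from the PR description.
COMPUTER_USE_TOOL = "computer_use"

# Substrings of provider errors that signal rejected image content.
# Only the DeepSeek wording below is taken from the PR description.
IMAGE_REJECTION_PATTERNS = (
    "unknown variant `image_url`, expected `text`",
)


def has_image_parts(content: Any) -> bool:
    """Return True if content is a list of parts containing an image_url part."""
    if not isinstance(content, list):
        return False
    return any(
        isinstance(part, dict) and part.get("type") == "image_url"
        for part in content
    )


def text_only_tool_content(
    tool_name: str, result: dict, model_supports_vision: bool
) -> Any:
    """Pick what to store in history for a tool result (assumed result shape)."""
    content = result.get("content")
    # Vision-capable models, and results without images, are stored unchanged.
    if model_supports_vision or not has_image_parts(content):
        return content
    # computer_use cannot work without screenshots, so store an explicit error.
    if tool_name == COMPUTER_USE_TOOL:
        return json.dumps(
            {"error": "computer_use requires a vision-capable model; "
                      "switch to a vision-capable model and retry."}
        )
    # Other multimodal tools fall back to their text summary (placeholder if absent).
    return result.get("text_summary") or "[image omitted: text-only model]"


def is_image_rejection(error_message: str) -> bool:
    """Return True if a provider error message matches a known rejection pattern."""
    lowered = error_message.lower()
    return any(pattern.lower() in lowered for pattern in IMAGE_REJECTION_PATTERNS)
```

Under these assumptions, a caller would run `text_only_tool_content` once before appending the tool message to history, so the stored message never carries `image_url` parts on a text-only model. `is_image_rejection` would be consulted only when a request fails, to decide whether to strip images from an already-poisoned history and retry.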