fix(agent): keep image tool results from poisoning text-only sessions by helix4u · Pull Request #25903 · NousResearch/hermes-agent

helix4u · 2026-05-14T19:23:12Z

What does this PR do?

Prevents computer_use screenshot results from poisoning text-only model sessions.

When computer_use returns a screenshot result while the active model/provider does not support image input, Hermes now stores a clean tool error instead of appending raw image_url content to the canonical conversation history. That avoids the repeated 400 loop where every later user turn resends the rejected image message before the agent can recover.

This is intentionally narrower than the existing provider-profile image fallback PRs. It does not try to make a text-only model operate the desktop from an auxiliary vision description; it fails cleanly and tells the model/user to switch to a vision-capable model for desktop computer use.

Related Issue

Related: #23733

Related but not duplicate:

fix(api-server): apply non-vision image fallback #23743 / fix(agent): strip image parts for non-vision models on provider profile path #23750 cover the registered provider-profile image preprocessing path.
Fix/computer use aux vision routing #24070 routes computer_use captures through auxiliary.vision; this PR takes the safer text-only behavior for desktop control by returning a tool error instead of giving the main text-only model a generated screenshot description to act on.

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
🔒 Security fix
📝 Documentation update
✅ Tests (adding or improving test coverage)
♻️ Refactor (no behavior change)
🎯 New skill (bundled or hub)

Changes Made

run_agent.py: added _tool_result_content_for_active_model() so multimodal tool results are adapted before they enter session history.
run_agent.py: converts computer_use screenshot results into a clear tool error for text-only active models/providers.
run_agent.py: preserves screenshot/image tool results for vision-capable active models.
run_agent.py: recognizes DeepSeek's exact text-only image rejection wording in adaptive recovery.
tests/tools/test_computer_use.py: covers text-only and vision-capable handling for computer_use multimodal results.

How to Test

Use computer_use with a text-only OpenAI-compatible provider such as direct DeepSeek.
Confirm Hermes records a tool error instead of a raw image_url tool result.
Continue the same session and confirm later turns do not repeat the same provider-side image deserialization 400.

Targeted tests run locally:

pytest -q tests/tools/test_computer_use.py::TestRunAgentMultimodalHelpers tests/run_agent/test_vision_aware_preprocessing.py

Result: 19 passed in 2.89s

Full suite run locally:

scripts/run_tests.sh

Result: failed in 0:11:36 with 62 failed, 22918 passed, 61 skipped, 244 warnings, 19 errors.

The full-suite failures/errors are broad existing/global areas outside this PR's two touched files. Examples include tests/cli/test_cli_save_config_value.py, tests/agent/lsp/test_client_e2e.py, tests/gateway/test_approve_deny_commands.py, tests/run_agent/test_compression_feasibility.py, process/terminal timeout cleanup tests, and tests/tools/test_browser_supervisor.py teardown errors from the live-system guard blocking os.kill(...) on spawned Chrome PIDs.

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix/feature (no unrelated commits)
I've run pytest tests/ -q and all tests pass
I've added tests for my changes (required for bug fixes, strongly encouraged for features)
I've tested on my platform: WSL2 / Linux targeted unit tests

Documentation & Housekeeping

I've updated relevant documentation (README, docs/, docstrings) — or N/A
I've updated cli-config.yaml.example if I added/changed config keys — or N/A
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
I've updated tool descriptions/schemas if I changed tool behavior — or N/A

For New Skills

N/A

Screenshots / Logs

Support repro showed direct DeepSeek rejecting the post-computer_use request with:

messages[10]: unknown variant image_url, expected text

teknium1 · 2026-05-14T21:52:29Z

Merged via #25925 — your commit ae3e79637 was cherry-picked onto current main with authorship preserved in git log. Thanks for the fix! The poisoned-history class is now blocked at the tool-result write site, and DeepSeek's exact 400 wording is in the adaptive recovery list so any already-poisoned session can self-heal.

fix(agent): keep image tool results from poisoning text-only sessions

ae3e796

helix4u force-pushed the fix/text-only-image-tool-results branch from ec5ae8b to ae3e796 Compare May 14, 2026 19:28

alt-glitch added type/bug Something isn't working comp/agent Core agent loop, run_agent.py, prompt builder tool/vision Vision analysis and image generation P2 Medium — degraded but workaround exists labels May 14, 2026

helix4u marked this pull request as ready for review May 14, 2026 19:48

teknium1 mentioned this pull request May 14, 2026

fix(agent): keep image tool results from poisoning text-only sessions #25925

Merged

teknium1 closed this in #25925 May 14, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(agent): keep image tool results from poisoning text-only sessions#25903

fix(agent): keep image tool results from poisoning text-only sessions#25903
helix4u wants to merge 1 commit into
NousResearch:mainfrom
helix4u:fix/text-only-image-tool-results

helix4u commented May 14, 2026 •

edited

Loading

Uh oh!

teknium1 commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

helix4u commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Related Issue

Type of Change

Changes Made

How to Test

Checklist

Code

Documentation & Housekeeping

For New Skills

Screenshots / Logs

Uh oh!

teknium1 commented May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

helix4u commented May 14, 2026 •

edited

Loading