[Feature] computer_use: route screenshots through auxiliary.vision when main model lacks vision

**Problem:** The `computer_use` tool captures screenshots correctly but cannot describe their visual content when the main model (e.g., MiniMax/M2) lacks vision capability. Screenshots are returned as base64 images in tool results, but `_tool_result_content_for_active_model()` in `run_agent.py:3327` checks `_model_supports_vision()` on the **main model only** — it does not route to `auxiliary.vision`.

**Current flow:**
```
computer_use captures screenshot → returns as _multimodal tool result →
fed back to main model (MiniMax/M2) → _model_supports_vision() returns False →
error: \"computer_use returned screenshot/image content, but the active model/provider does not support image input\"
```

**auxiliary.vision only applies to the `vision_analyze` tool**, not `computer_use`. The `computer_use` tool results are always processed by the main model, regardless of auxiliary.vision config.

**Reproduction:**
1. Set `model.default = MiniMax/M2`, `model.provider = minimax`
2. Configure `auxiliary.vision.provider = openrouter`, `auxiliary.vision.model = nvidia/nemotron-nano-12b-v2-vl:free`
3. Use computer_use with action=capture — screenshot captured successfully
4. Error returned: main model does not support image input

**Proposed fix:**
Patch `_tool_result_content_for_active_model()` (or add a routing check in `run_agent.py`) so that when:
- Tool name is `computer_use`
- Result has image content (`_content_has_image_parts()` returns True)
- Main model does NOT support vision (`_model_supports_vision()` returns False)
- auxiliary.vision is configured

Then route the screenshot base64 through `resolve_vision_provider_client()` instead of returning an error.

**Alternative workaround for users:** Use `browser_vision` instead, which correctly routes through `auxiliary.vision`. Or manually use computer_use capture + send base64 to OpenRouter VL model separately.

**Affected area:** `run_agent.py` — tool result handling for multimodal results from non-vision main models.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature] computer_use: route screenshots through auxiliary.vision when main model lacks vision #29407

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[Feature] computer_use: route screenshots through auxiliary.vision when main model lacks vision #29407

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions