cua-driver ignores auxiliary.vision config, uses main session model for image analysis #24015

@RootMePLS

Bug Description

When the computer_use tool (backed by cua-driver) is used, image analysis in SOM (Set-of-Mark) mode ignores the auxiliary.vision configuration and uses the main session model instead. The request fails with an HTTP 404 error when the main model doesn't support image input.

Steps to Reproduce

  1. Configure Hermes Agent with:
    • Main model: tencent/hy3-preview (no image support on OpenRouter)
    • auxiliary.vision.provider: openrouter
    • auxiliary.vision.model: google/gemini-2.5-flash
  2. Enable the computer_use toolset
  3. Call computer_use with action='capture', mode='som'

Expected Behavior

cua-driver should route the image analysis request to the model specified in auxiliary.vision (google/gemini-2.5-flash).

Actual Behavior

cua-driver attempts to use the main session model (tencent/hy3-preview) for image analysis, resulting in:

🔌 Provider: openrouter  Model: tencent/hy3-preview
📝 Error: HTTP 404: No endpoints found that support image input

Configuration

model:
  default: tencent/hy3-preview
  provider: openrouter
auxiliary:
  vision:
    provider: openrouter
    model: google/gemini-2.5-flash

Log Evidence

2026-05-11 ... ⚠️ API call failed (attempt1/3): NotFoundError [HTTP 404]
   🔌 Provider: openrouter  Model: tencent/hy3-preview
   📝 Error: HTTP 404: No endpoints found that support image input

Suggested Fix

Update the cua-driver / computer_use tool to check for an auxiliary.vision configuration and, when one is present, use that provider and model for image analysis instead of the main session model.
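
The lookup described above could be sketched roughly as follows. This is only an illustration, not the actual Hermes Agent or cua-driver code: `resolve_vision_target` and `ModelTarget` are hypothetical names, and the config is assumed to already be loaded into a dict with the same keys as the Configuration section. Falling back to the main model when auxiliary.vision is absent is also an assumption about the desired behavior.

```python
from typing import Any, Mapping, NamedTuple


class ModelTarget(NamedTuple):
    """Hypothetical container for the provider/model a request is sent to."""
    provider: str
    model: str


def resolve_vision_target(config: Mapping[str, Any]) -> ModelTarget:
    """Pick the provider/model for image analysis.

    Prefers auxiliary.vision; any field missing there falls back to the
    main model settings (model.provider / model.default).
    """
    main = config.get("model") or {}
    vision = (config.get("auxiliary") or {}).get("vision") or {}

    provider = vision.get("provider") or main.get("provider")
    model = vision.get("model") or main.get("default")
    if not provider or not model:
        raise ValueError("no provider/model configured for vision requests")
    return ModelTarget(provider, model)


# Same values as the Configuration section of this report.
config = {
    "model": {"default": "tencent/hy3-preview", "provider": "openrouter"},
    "auxiliary": {
        "vision": {"provider": "openrouter", "model": "google/gemini-2.5-flash"}
    },
}

target = resolve_vision_target(config)
print(target.provider, target.model)  # openrouter google/gemini-2.5-flash
```

With a resolver like this, a SOM capture would be sent to google/gemini-2.5-flash, while setups without an auxiliary.vision block would keep today's behavior.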

Metadata

Labels

  • P2: Medium, degraded but workaround exists
  • comp/tools: Tool registry, model_tools, toolsets
  • tool/vision: Vision analysis and image generation
  • type/bug: Something isn't working
