fix(vision): resolve Nous vision model correctly in auto-detect path#12683

Closed
Ifkellx wants to merge 1 commit into NousResearch:main from Ifkellx:fix/nous-vision-model

@Ifkellx Ifkellx commented Apr 19, 2026

Contributor

Problem

The vision auto-detect chain calls resolve_provider_client() with the vision model from _PROVIDER_VISION_MODELS, but resolve_provider_client() always calls _try_nous() without vision=True. As a result, _try_nous() returns the default text model instead of the vision-capable xiaomi/mimo-v2-omni, and the Nous inference API responds with 404 when images are sent.

Additionally, _PROVIDER_VISION_MODELS was missing an entry for the nous provider.

Root Cause

Even with the nous entry present, the auto-detect path in resolve_vision_provider_client() still ends up on the text model:

  1. Looks up _PROVIDER_VISION_MODELS.get("nous") → returns xiaomi/mimo-v2-omni
  2. Calls resolve_provider_client("nous", model="xiaomi/mimo-v2-omni")
  3. resolve_provider_client calls _try_nous() without vision=True
  4. _try_nous() ignores the passed model, returns the default text model

The fallback path (_resolve_strict_vision_backend) worked correctly because it called _try_nous(vision=True) directly.

Fix

  1. _PROVIDER_VISION_MODELS: Added "nous": "xiaomi/mimo-v2-omni" entry so the vision auto-detect chain picks the correct multimodal model.

  2. resolve_provider_client: Now detects a vision task by checking whether the requested model matches a value in _PROVIDER_VISION_MODELS or is a known vision model name, and if so passes vision=True to _try_nous().

Verification

  • xiaomi/mimo-v2-omni returns HTTP 200 with image inputs on Nous inference API
  • google/gemini-3-flash-preview returns 404 with image inputs on Nous inference API
  • On free-tier Nous accounts only Xiaomi models are available, which makes this fix essential for those users

Impact

Fixes browser_vision and vision_analyze tools for all Hermes users on Nous (both free and paid tiers).

When Nous Research is the main provider, vision tasks fail with 404
because _PROVIDER_VISION_MODELS has no entry for 'nous'. The auto-detect
falls back to the main model (e.g. xiaomi/mimo-v2-pro) which doesn't
support images, or to google/gemini-3-flash-preview which Nous also
rejects for image inputs.

This adds 'nous': 'xiaomi/mimo-v2-omni' to the vision model map, which
is the multimodal model available on Nous inference API and confirmed
to work with image inputs (HTTP 200).

Closes vision failures for all Nous provider users.
@Ifkellx Ifkellx closed this Apr 19, 2026
@Ifkellx Ifkellx changed the title from "fix(vision): add nous provider to vision model map" to "fix(vision): resolve Nous vision model correctly in auto-detect path" Apr 19, 2026