fix(vision): resolve Nous vision model correctly in auto-detect path #12683
Closed
Ifkellx wants to merge 1 commit into
When Nous Research is the main provider, vision tasks fail with a 404 because _PROVIDER_VISION_MODELS has no entry for 'nous'. The auto-detect path falls back either to the main model (e.g. xiaomi/mimo-v2-pro), which doesn't support images, or to google/gemini-3-flash-preview, which Nous also rejects for image inputs. This PR adds 'nous': 'xiaomi/mimo-v2-omni' to the vision model map. xiaomi/mimo-v2-omni is the multimodal model available on the Nous inference API and is confirmed to work with image inputs (HTTP 200). Fixes vision failures for all Nous provider users.
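The effect of the missing map entry can be illustrated with a minimal sketch. Only the `_PROVIDER_VISION_MODELS` name, the 'nous' key, and the model names come from this PR; the `pick_vision_model` helper and its single-step fallback are hypothetical simplifications, not the project's actual code.

```python
# Hypothetical sketch: only the map name, the "nous" entry, and the model
# names are taken from the PR description. The helper below is an assumption.
_PROVIDER_VISION_MODELS = {
    "nous": "xiaomi/mimo-v2-omni",  # entry added by this PR
}


def pick_vision_model(provider: str, main_model: str) -> str:
    """Return the provider's vision model, falling back to the main model.

    Without a "nous" entry, the lookup falls through to main_model, which
    (per the PR) may be a text-only model that rejects image inputs.
    """
    return _PROVIDER_VISION_MODELS.get(provider, main_model)
```

With the entry present, `pick_vision_model("nous", "xiaomi/mimo-v2-pro")` selects the multimodal model; a provider with no entry still gets its main model back.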
Problem
The vision auto-detect chain calls resolve_provider_client() with the vision model from _PROVIDER_VISION_MODELS, but resolve_provider_client() always called _try_nous() without vision=True. This caused it to return the default text model instead of the vision-capable xiaomi/mimo-v2-omni, resulting in 404 errors from the Nous inference API when sending images.

Additionally, _PROVIDER_VISION_MODELS was missing an entry for the nous provider.

Root Cause
The auto-detect path in resolve_vision_provider_client():
1. _PROVIDER_VISION_MODELS.get("nous") → returns xiaomi/mimo-v2-omni
2. resolve_provider_client("nous", model="xiaomi/mimo-v2-omni")
3. resolve_provider_client calls _try_nous() without vision=True
4. _try_nous() ignores the passed model, returns the default text model

The fallback path (_resolve_strict_vision_backend) worked correctly because it called _try_nous(vision=True) directly.

Fix
- _PROVIDER_VISION_MODELS: Added "nous": "xiaomi/mimo-v2-omni" entry so the vision auto-detect chain picks the correct multimodal model.
- resolve_provider_client: Auto-detects vision tasks by checking if the requested model matches a value in _PROVIDER_VISION_MODELS or is a known vision model name, then passes vision=True to _try_nous().

Verification
- xiaomi/mimo-v2-omni returns HTTP 200 with image inputs on Nous inference API
- google/gemini-3-flash-preview returns 404 with image inputs on Nous inference API

Impact
Fixes browser_vision and vision_analyze tools for all Hermes users on Nous (both free and paid tiers).
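
The fix described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from the PR: the function names come from the description, but their bodies, the `_is_vision_model` helper, the `_DEFAULT_TEXT_MODEL` constant, and the choice to return a bare model name instead of a client object are all assumptions. The "known vision model name" check mentioned under Fix is left out because the description does not say how it works.

```python
from typing import Optional

_PROVIDER_VISION_MODELS = {"nous": "xiaomi/mimo-v2-omni"}

# Assumption for illustration: the default text model is the one the PR
# description gives as an example main model.
_DEFAULT_TEXT_MODEL = "xiaomi/mimo-v2-pro"


def _is_vision_model(model: Optional[str]) -> bool:
    """Hypothetical helper: treat any value in the vision map as a vision model."""
    return model is not None and model in _PROVIDER_VISION_MODELS.values()


def _try_nous(vision: bool = False) -> str:
    """Stand-in for the real function; returns only the model name it would use.

    As in the PR description, it takes no model argument and chooses
    between the vision model and the default text model via the flag.
    """
    return _PROVIDER_VISION_MODELS["nous"] if vision else _DEFAULT_TEXT_MODEL


def resolve_provider_client(provider: str, model: Optional[str] = None) -> str:
    """Sketch of the fixed path: forward vision=True when a vision model is requested."""
    if provider == "nous":
        return _try_nous(vision=_is_vision_model(model))
    raise ValueError(f"provider not covered by this sketch: {provider}")
```

In this sketch, `resolve_provider_client("nous", model="xiaomi/mimo-v2-omni")` resolves to the vision model, while a call without a vision model still gets the default text model, which mirrors the behavior the Fix section describes.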