Problem
qwen3.7-plus supports multimodal input (image + video), but the current modality detection logic treats it as text-only.
The defaultModalities() function in modalityDefaults.ts uses ordered regex matching. Since there is no explicit pattern for qwen3.7-plus, it falls through to the catch-all /^qwen/ pattern which returns {} (text-only).
Meanwhile, qwen3.7-max is correctly treated as text-only, since it only supports text input.
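The fall-through described above can be reproduced with a minimal sketch. The table and helper below are hypothetical, simplified stand-ins, not the actual contents of modalityDefaults.ts; they only assume first-match-wins iteration over ordered [RegExp, modalities] pairs, with an empty object as the fallback.

```typescript
// Hypothetical, simplified stand-in for the table in modalityDefaults.ts.
type Modalities = { image?: boolean; video?: boolean };

const PATTERNS_BEFORE_FIX: Array<[RegExp, Modalities]> = [
  // In the real file, more specific entries would precede the catch-all.
  [/^qwen/, {}], // catch-all: text-only
];

// First matching pattern wins; later entries are never consulted.
function firstMatch(
  model: string,
  patterns: Array<[RegExp, Modalities]>,
): Modalities {
  for (const [pattern, modalities] of patterns) {
    if (pattern.test(model)) {
      return modalities;
    }
  }
  return {}; // assumed fallback when nothing matches
}

// With no dedicated entry, both models hit the catch-all and look text-only.
console.log(firstMatch('qwen3.7-plus', PATTERNS_BEFORE_FIX)); // {}
console.log(firstMatch('qwen3.7-max', PATTERNS_BEFORE_FIX)); // {}
```

Because the first match wins, any dedicated entry for qwen3.7-plus has to appear before /^qwen/ to take effect.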
Expected Behavior
qwen3.7-plus → { image: true, video: true } (multimodal)
qwen3.7-max → {} (text-only, no change needed)
Root Cause
Two places need updating:
- packages/core/src/core/modalityDefaults.ts: MODALITY_PATTERNS is missing an entry for qwen3.7-plus, so the model falls through to /^qwen/ → {}.
- packages/core/src/core/openaiContentGenerator/provider/dashscope.ts: VISION_MODEL_PREFIX_PATTERNS doesn't include qwen3.7-plus, so the DashScope provider won't set vl_high_resolution_images: true for it.
Proposed Fix
- Add [/^qwen3\.7-plus/, { image: true, video: true }] to MODALITY_PATTERNS before the catch-all /^qwen/.
- Add 'qwen3.7-plus' to VISION_MODEL_PREFIX_PATTERNS in the DashScope provider.
- Add/update tests to verify the modality defaults for both qwen3.7-plus (multimodal) and qwen3.7-max (text-only).
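A rough sketch of what the fix and its checks might look like follows. All names ending in Sketch are hypothetical; the real defaultModalities() and DashScope provider may normalize model names or match prefixes differently. In particular, the startsWith check is an assumption about how VISION_MODEL_PREFIX_PATTERNS is consumed.

```typescript
type InputModalities = { image?: boolean; video?: boolean };

// Hypothetical table: the new entry sits before the catch-all.
const MODALITY_PATTERNS_SKETCH: Array<[RegExp, InputModalities]> = [
  [/^qwen3\.7-plus/, { image: true, video: true }], // proposed entry
  [/^qwen/, {}], // existing catch-all: text-only
];

// Assumed behavior: first matching pattern wins, {} when nothing matches.
function defaultModalitiesSketch(model: string): InputModalities {
  for (const [pattern, modalities] of MODALITY_PATTERNS_SKETCH) {
    if (pattern.test(model)) {
      return modalities;
    }
  }
  return {};
}

// Hypothetical prefix list for the DashScope provider; only the new
// entry is shown, and prefix matching via startsWith is an assumption.
const VISION_PREFIXES_SKETCH: string[] = ['qwen3.7-plus'];

function isVisionModelSketch(model: string): boolean {
  return VISION_PREFIXES_SKETCH.some((prefix) => model.startsWith(prefix));
}

console.log(defaultModalitiesSketch('qwen3.7-plus')); // { image: true, video: true }
console.log(defaultModalitiesSketch('qwen3.7-max')); // {}
console.log(isVisionModelSketch('qwen3.7-plus')); // true
console.log(isVisionModelSketch('qwen3.7-max')); // false
```

The qwen3.7-max checks matter as much as the qwen3.7-plus ones: they guard against a future pattern such as /^qwen3\.7/ accidentally marking the text-only model as multimodal.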
Impact
Users selecting qwen3.7-plus through any provider (token plan, custom config) will correctly get multimodal support — images and videos will be sent as inline data instead of being downgraded to text placeholders.