fix: raise context_window ceiling and add qwen vision support #442
Merged
- context_window.py: raise MAX_EFFECTIVE_CONTEXT_WINDOW from 65536 to 1000000 (safety cap, not a ceiling for reasonable configs). Add LARGE_CONTEXT_MODEL_DEFAULT=65536 to break the circular reference where default_context_window_for_model() returned MAX_EFFECTIVE_CONTEXT_WINDOW, making default=ceiling=65536. User-configured values (via WebUI or model_catalog.json) are now respected instead of being truncated by min(x, 65536). All default values remain unchanged (65536).
- capabilities.py: add supports_vision=True to MODEL_OVERRIDES['qwen'] so that qwen models are recognized as vision-capable regardless of binding (custom, dashscope, etc.), not just when binding=openai.
…XT_WINDOW ceiling raise

The test hardcoded 65536 as the expected effective context window, but after raising MAX_EFFECTIVE_CONTEXT_WINDOW to 1_000_000 the explicit context_window=128000 is no longer truncated. Update the assertion to match the new correct behavior.
Description
Fix two bugs that cause incorrect model capability detection at runtime:
Bug 1: `context_window` is hardcoded to 65536, truncating all user-configured values

`MAX_EFFECTIVE_CONTEXT_WINDOW = 65_536` serves as both the safety ceiling and the default value returned by `default_context_window_for_model()`, creating a circular reference: `default == ceiling == 65536`. Combined with `min(configured, MAX_EFFECTIVE_CONTEXT_WINDOW)`, any user-configured value (via WebUI or `model_catalog.json`) is silently truncated to 65536.

For example, when a user sets `context_window=128000` for qwen3.6-plus:

- Before the fix: `min(128000, 65536) = 65536` ❌
- After the fix: `min(128000, 1000000) = 128000` ✅

Bug 2: `supports_vision=False` for qwen models when `binding=custom` or `binding=dashscope`

`MODEL_OVERRIDES["qwen"]` only sets `has_thinking_tags: True` without `supports_vision`. When `binding=custom`, `PROVIDER_CAPABILITIES` has no `"custom"` entry, so the lookup falls back to `DEFAULT_CAPABILITIES["supports_vision"] = False`. This means qwen models (which do support vision) are incorrectly treated as non-vision-capable unless `binding=openai`.

Related Issues
Module(s) Affected
services, api, config, core, knowledge, logging, tools, utils, web (Frontend), docs (Documentation), scripts, tests, ...

Checklist

- Ran `pre-commit run --all-files` and fixed any issues.

Additional Notes
Changes summary:
| File | Change |
| --- | --- |
| `deeptutor/services/llm/context_window.py` | `MAX_EFFECTIVE_CONTEXT_WINDOW`: 65536 → 1,000,000 |
| `deeptutor/services/llm/context_window.py` | Add `LARGE_CONTEXT_MODEL_DEFAULT = 65_536` |
| `deeptutor/services/llm/context_window.py` | `default_context_window_for_model()` returns `LARGE_CONTEXT_MODEL_DEFAULT` instead of `MAX_EFFECTIVE_CONTEXT_WINDOW`, breaking the circular reference |
| `deeptutor/services/llm/capabilities.py` | Add `supports_vision: True` to `MODEL_OVERRIDES["qwen"]` |

Priority logic after fix:
- User-configured value (WebUI or `model_catalog.json`) → used directly, only capped by the 1M safety limit

All default values unchanged:
`LARGE_CONTEXT_MODEL_DEFAULT` keeps `65_536`, identical to the current behavior. The only behavioral change is that user-configured values are no longer silently truncated.

Why `MAX_EFFECTIVE_CONTEXT_WINDOW = 1_000_000` instead of removing the cap entirely?

The `min()` ceiling serves as a safety guard against typos (e.g., accidentally entering `100000000`). 1M is well above the context window of most current models while still catching unreasonable inputs. The key fix is decoupling the ceiling from the default value.

Why add `LARGE_CONTEXT_MODEL_DEFAULT` as a separate constant?

Previously `default_context_window_for_model()` returned `MAX_EFFECTIVE_CONTEXT_WINDOW` for large models, creating a circular reference where the default equaled the ceiling. By introducing a separate `LARGE_CONTEXT_MODEL_DEFAULT`, the two concerns are decoupled: the default can stay at 65536 (conservative, identical to current behavior) while the ceiling is raised to 1M (allowing user-configured values to pass through).

Why add `supports_vision: True` to `MODEL_OVERRIDES["qwen"]` instead of adding `"custom"` to `PROVIDER_CAPABILITIES`?

Vision capability is a model-level property, not a binding-level one. qwen models support vision regardless of which binding (custom, dashscope, openai) is used. Adding it to `MODEL_OVERRIDES` is the correct layer, consistent with how `gpt-4o`, `claude-3`, `gemini`, etc. are already configured.
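
For reviewers who want to see the two fixes side by side, here is a minimal sketch of the behavior described above. It is not the code from `context_window.py` or `capabilities.py`: the constant and dictionary names come from this PR, but the helper functions `effective_context_window()` and `supports_vision()`, the prefix-matching rule, and the `"openai"` provider entry are assumptions made for illustration.

```python
from typing import Optional

# Constant names are taken from the PR; everything else is an illustrative assumption.
MAX_EFFECTIVE_CONTEXT_WINDOW = 1_000_000  # safety cap against typos
LARGE_CONTEXT_MODEL_DEFAULT = 65_536      # conservative default, unchanged


def effective_context_window(configured: Optional[int]) -> int:
    """Hypothetical helper: use the configured value if present, then apply the cap."""
    if configured is None or configured <= 0:
        return LARGE_CONTEXT_MODEL_DEFAULT
    return min(configured, MAX_EFFECTIVE_CONTEXT_WINDOW)


DEFAULT_CAPABILITIES = {"supports_vision": False}
# Assumed contents: the PR only states that there is no "custom" entry.
PROVIDER_CAPABILITIES = {"openai": {"supports_vision": True}}
MODEL_OVERRIDES = {"qwen": {"has_thinking_tags": True, "supports_vision": True}}


def supports_vision(binding: str, model: str) -> bool:
    """Hypothetical lookup order: model override, then provider entry, then default."""
    for prefix, override in MODEL_OVERRIDES.items():
        # Prefix matching is an assumption about how overrides are keyed.
        if model.lower().startswith(prefix) and "supports_vision" in override:
            return override["supports_vision"]
    provider = PROVIDER_CAPABILITIES.get(binding, {})
    return provider.get("supports_vision", DEFAULT_CAPABILITIES["supports_vision"])
```

Under these assumptions, `effective_context_window(128000)` passes through unchanged, a typo such as `100000000` is capped at the 1M limit, and `supports_vision("custom", "qwen3.6-plus")` resolves at the model layer without needing a `"custom"` provider entry.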