fix: raise context_window ceiling and add qwen vision support#442

Merged
pancacake merged 2 commits into HKUDS:dev from wedone:fix/context-window-vision-upstream
May 8, 2026

Conversation

@wedone wedone commented May 4, 2026

Contributor

Description

Fix two bugs that cause incorrect model capability detection at runtime:

Bug 1: context_window is hardcoded to 65536, truncating all user-configured values

MAX_EFFECTIVE_CONTEXT_WINDOW = 65_536 serves as both the safety ceiling and the default value returned by default_context_window_for_model(), so the two are coupled: default == ceiling == 65536. Because the effective value is computed as min(configured, MAX_EFFECTIVE_CONTEXT_WINDOW), any user-configured value above 65536 (set via WebUI or model_catalog.json) is silently truncated to 65536.

For example, when a user sets context_window=128000 for qwen3.6-plus:

  • Before: min(128000, 65536) = 65536
  • After: min(128000, 1000000) = 128000
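The arithmetic above can be reproduced with a minimal sketch. The helper name clamp_context_window and the OLD_CEILING / NEW_CEILING constants are hypothetical and exist only for illustration; the project's actual function is not shown in this PR description.

```python
# Hypothetical stand-in for the clamp described above; not the project's code.
OLD_CEILING = 65_536      # MAX_EFFECTIVE_CONTEXT_WINDOW before this PR
NEW_CEILING = 1_000_000   # MAX_EFFECTIVE_CONTEXT_WINDOW after this PR


def clamp_context_window(configured: int, ceiling: int) -> int:
    """Return the configured value, capped at the given ceiling."""
    return min(configured, ceiling)


before = clamp_context_window(128_000, OLD_CEILING)
after = clamp_context_window(128_000, NEW_CEILING)
print(before, after)  # 65536 128000
```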

Bug 2: supports_vision=False for qwen models when binding=custom or binding=dashscope

MODEL_OVERRIDES["qwen"] only sets has_thinking_tags: True without supports_vision. When binding=custom, PROVIDER_CAPABILITIES has no "custom" entry, so it falls back to DEFAULT_CAPABILITIES["supports_vision"] = False. This means qwen models (which do support vision) are incorrectly treated as non-vision-capable unless binding=openai.
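The fallback chain described above might look roughly like the sketch below. The dictionary names come from the PR text, but their contents, the lookup order, the prefix matching, and the get_capability helper are assumptions made for illustration, not the actual code in capabilities.py.

```python
# Assumed shapes; only the names and the qwen/supports_vision facts come from the PR.
DEFAULT_CAPABILITIES = {"supports_vision": False, "has_thinking_tags": False}
PROVIDER_CAPABILITIES = {"openai": {"supports_vision": True}}  # no "custom" entry

MODEL_OVERRIDES_BEFORE = {"qwen": {"has_thinking_tags": True}}
MODEL_OVERRIDES_AFTER = {
    "qwen": {"has_thinking_tags": True, "supports_vision": True},
}


def get_capability(binding: str, model: str, name: str, overrides: dict) -> bool:
    """Hypothetical lookup: model override, then provider entry, then default."""
    model_lower = model.lower()
    for prefix, caps in overrides.items():
        if model_lower.startswith(prefix) and name in caps:
            return caps[name]
    provider_caps = PROVIDER_CAPABILITIES.get(binding.lower(), {})
    if name in provider_caps:
        return provider_caps[name]
    return DEFAULT_CAPABILITIES[name]


# Before the fix, binding=custom falls through to the default (False).
print(get_capability("custom", "qwen3.6-plus", "supports_vision", MODEL_OVERRIDES_BEFORE))
# After the fix, the model-level override answers regardless of binding.
print(get_capability("custom", "qwen3.6-plus", "supports_vision", MODEL_OVERRIDES_AFTER))
```

Under these assumptions, putting the flag on the model override means no binding-specific entry is needed for custom or dashscope.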

Related Issues

  • Related to context_window 65536 limitation
  • Related to qwen vision capability detection

Module(s) Affected

  • services
  • api
  • config
  • core
  • knowledge
  • logging
  • tools
  • utils
  • web (Frontend)
  • docs (Documentation)
  • scripts
  • tests
  • Other: ...

Checklist

  • I have read and followed the contribution guidelines.
  • My code follows the project's coding standards.
  • I have run pre-commit run --all-files and fixed any issues.
  • I have added relevant tests for my changes.
  • I have updated the documentation (if necessary).
  • My changes do not introduce any new security vulnerabilities.

Additional Notes

Changes summary:

| File | Change | Description |
| --- | --- | --- |
| deeptutor/services/llm/context_window.py | MAX_EFFECTIVE_CONTEXT_WINDOW 65536 → 1,000,000 | Safety cap only, not a ceiling for reasonable configs |
| deeptutor/services/llm/context_window.py | Add LARGE_CONTEXT_MODEL_DEFAULT = 65_536 | Separate constant to break the circular reference (default == ceiling) |
| deeptutor/services/llm/context_window.py | default_context_window_for_model() returns LARGE_CONTEXT_MODEL_DEFAULT | No longer returns MAX_EFFECTIVE_CONTEXT_WINDOW, breaking the circular reference |
| deeptutor/services/llm/capabilities.py | Add supports_vision: True to MODEL_OVERRIDES["qwen"] | qwen models recognized as vision-capable regardless of binding |

Priority logic after fix:

  1. User-configured value (via WebUI or model_catalog.json) → used directly, only capped by 1M safety limit
  2. Model name heuristic default (large models → 65536, small models → 16384)
  3. Fallback default (only when nothing else is available)
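The three-step priority above can be sketched as follows. MAX_EFFECTIVE_CONTEXT_WINDOW and LARGE_CONTEXT_MODEL_DEFAULT use the names and values from this PR; SMALL_CONTEXT_MODEL_DEFAULT, FALLBACK_CONTEXT_WINDOW (a placeholder value, since the PR does not state it), LARGE_MODEL_HINTS, heuristic_default, and resolve_context_window are hypothetical names chosen for illustration, and the substring heuristic is an assumption.

```python
from typing import Optional

MAX_EFFECTIVE_CONTEXT_WINDOW = 1_000_000  # safety cap (from the PR)
LARGE_CONTEXT_MODEL_DEFAULT = 65_536      # default for large models (from the PR)
SMALL_CONTEXT_MODEL_DEFAULT = 16_384      # hypothetical name; value from the PR text
FALLBACK_CONTEXT_WINDOW = 8_192           # placeholder; the PR does not give this value
LARGE_MODEL_HINTS = ("qwen", "gpt-4", "claude")  # hypothetical heuristic


def heuristic_default(model: Optional[str]) -> Optional[int]:
    """Guess a default from the model name; None if no name is available."""
    if not model:
        return None
    name = model.lower()
    if any(hint in name for hint in LARGE_MODEL_HINTS):
        return LARGE_CONTEXT_MODEL_DEFAULT
    return SMALL_CONTEXT_MODEL_DEFAULT


def resolve_context_window(configured: Optional[int], model: Optional[str]) -> int:
    # 1. A user-configured value wins, capped only by the safety limit.
    if configured is not None and configured > 0:
        return min(configured, MAX_EFFECTIVE_CONTEXT_WINDOW)
    # 2. Otherwise use the model-name heuristic.
    guessed = heuristic_default(model)
    if guessed is not None:
        return guessed
    # 3. Last resort when nothing else is known.
    return FALLBACK_CONTEXT_WINDOW


print(resolve_context_window(128_000, "qwen3.6-plus"))      # 128000
print(resolve_context_window(None, "qwen3.6-plus"))         # 65536
print(resolve_context_window(100_000_000, "qwen3.6-plus"))  # 1000000
```

The third call shows the cap acting as a typo guard: an implausibly large configured value is reduced to the 1M limit, while the default path never touches the cap.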

All default values unchanged: LARGE_CONTEXT_MODEL_DEFAULT keeps 65_536 — identical to the current behavior. The only behavioral change is that user-configured values are no longer silently truncated.

Why MAX_EFFECTIVE_CONTEXT_WINDOW = 1_000_000 instead of removing the cap entirely?

The min() ceiling serves as a safety guard against typos (e.g., accidentally entering 100000000). 1M is at or above the context window of most current models while still catching unreasonable inputs. The key fix is decoupling the ceiling from the default value.

Why add LARGE_CONTEXT_MODEL_DEFAULT as a separate constant?

Previously default_context_window_for_model() returned MAX_EFFECTIVE_CONTEXT_WINDOW for large models, creating a circular reference where the default equaled the ceiling. By introducing a separate LARGE_CONTEXT_MODEL_DEFAULT, the two concerns are decoupled: the default can stay at 65536 (conservative, identical to current behavior) while the ceiling is raised to 1M (allowing user-configured values to pass through).

Why add supports_vision: True to MODEL_OVERRIDES["qwen"] instead of adding "custom" to PROVIDER_CAPABILITIES?

Vision capability is a model-level property, not a binding-level one. qwen models support vision regardless of which binding (custom, dashscope, openai) is used. Adding it to MODEL_OVERRIDES is the correct layer — consistent with how gpt-4o, claude-3, gemini etc. are already configured.

WeDone and others added 2 commits May 4, 2026 22:49
- context_window.py: raise MAX_EFFECTIVE_CONTEXT_WINDOW from 65536 to
  1000000 (safety cap, not a ceiling for reasonable configs).
  Add LARGE_CONTEXT_MODEL_DEFAULT=65536 to break the circular
  reference where default_context_window_for_model() returned
  MAX_EFFECTIVE_CONTEXT_WINDOW, making default=ceiling=65536.
  User-configured values (via WebUI or model_catalog.json) are
  now respected instead of being truncated by min(x, 65536).
  All default values remain unchanged (65536).

- capabilities.py: add supports_vision=True to MODEL_OVERRIDES['qwen']
  so that qwen models are recognized as vision-capable regardless
  of binding (custom, dashscope, etc.), not just when binding=openai.
…XT_WINDOW ceiling raise

The test hardcoded 65536 as the expected effective context window,
but after raising MAX_EFFECTIVE_CONTEXT_WINDOW to 1_000_000 the
explicit context_window=128000 is no longer truncated. Update the
assertion to match the new correct behavior.
@wedone wedone changed the base branch from main to dev May 6, 2026 14:46
@pancacake pancacake merged commit d6b8eed into HKUDS:dev May 8, 2026
7 checks passed