
fix(deepseek): bump V4 family context window to 1M tokens #14952

Merged: teknium1 merged 1 commit into NousResearch:main from zkl2333:fix/deepseek-v4-context-window on Apr 26, 2026


zkl2333 (Contributor) commented Apr 24, 2026

Summary

Follow-up to #14934, which added deepseek-v4-pro / deepseek-v4-flash to the DeepSeek native provider's model list. The context-window lookup in agent/model_metadata.py still falls back to the existing "deepseek" substring entry (128K) for these ids. DeepSeek V4 ships with a 1M context window, so callers relying on get_model_context_length() for pre-flight token budgeting (compression triggers, context warnings) under-count the available window by ~8x.
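To make the impact concrete, here is a minimal sketch of how an under-reported window skews a pre-flight check. The 85% compression threshold, the helper name should_compress, and the exact 128,000 value standing in for "128K" are assumptions for illustration, not values taken from the codebase.

```python
# Illustrative only: the threshold and helper below are hypothetical,
# not the project's actual compression logic.
STALE_WINDOW = 128_000      # assumed value behind the "128K" fallback
ACTUAL_WINDOW = 1_000_000   # DeepSeek V4 window per this PR

COMPRESSION_THRESHOLD = 0.85  # hypothetical trigger ratio


def should_compress(prompt_tokens: int, context_length: int) -> bool:
    """Return True once the prompt exceeds the threshold share of the window."""
    return prompt_tokens > context_length * COMPRESSION_THRESHOLD


prompt_tokens = 200_000
print(should_compress(prompt_tokens, STALE_WINDOW))   # True: compresses early
print(should_compress(prompt_tokens, ACTUAL_WINDOW))  # False: plenty of room
print(ACTUAL_WINDOW / STALE_WINDOW)                   # 7.8125, i.e. ~8x
```

Under these assumptions a 200K-token prompt would trigger compression against the stale window even though it uses only a fifth of the real one.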

This PR adds explicit lowercase entries for the four DeepSeek model ids that share the 1M window:

  • deepseek-v4-pro — 1M
  • deepseek-v4-flash — 1M
  • deepseek-chat — 1M (legacy alias, server-side maps to v4-flash non-thinking)
  • deepseek-reasoner — 1M (legacy alias, server-side maps to v4-flash thinking)

Longest-key-first substring matching means these explicit entries also resolve the vendor-prefixed forms (deepseek/deepseek-v4-pro on OpenRouter / Nous Portal) without regressing the existing 128K fallback for older / unknown DeepSeek model ids on custom endpoints.
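The matching behaviour described above can be sketched roughly as follows. This is a simplified stand-in, not the code in agent/model_metadata.py: the table name DEFAULT_CONTEXT_LENGTHS, the helper lookup_context_length, and the 128,000 value for the "128K" fallback are assumptions made for the example.

```python
# Hypothetical, trimmed-down fallback table; the real table has many more
# entries and may use different names and values.
DEFAULT_CONTEXT_LENGTHS = {
    "deepseek-v4-pro": 1_000_000,
    "deepseek-v4-flash": 1_000_000,
    "deepseek-chat": 1_000_000,
    "deepseek-reasoner": 1_000_000,
    "deepseek": 128_000,  # generic fallback for older / unknown ids
}


def lookup_context_length(model: str, default: int = 128_000) -> int:
    """Resolve a context length by substring match, longest key first."""
    model_lower = model.lower()
    # Sorting by key length ensures "deepseek-v4-pro" is tried before the
    # shorter "deepseek" key that would otherwise match every DeepSeek id.
    for key in sorted(DEFAULT_CONTEXT_LENGTHS, key=len, reverse=True):
        if key in model_lower:
            return DEFAULT_CONTEXT_LENGTHS[key]
    return default


for model_id in (
    "deepseek-v4-pro",
    "deepseek/deepseek-v4-flash",  # vendor-prefixed form
    "deepseek-chat",               # legacy alias
    "deepseek-v3-custom",          # unknown id, hits the generic fallback
):
    print(model_id, lookup_context_length(model_id))
```

Because the match is a substring test, the vendor prefix needs no separate entry; the specific key is found inside the prefixed id before the generic one is considered.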

Source: https://api-docs.deepseek.com/quick_start/pricing

Pairs with #14946 (the normalization-side fix). The two PRs are independent — either one can land first.

Note for reviewers (out of scope, just flagging)

The pre-existing "deepseek-ai/DeepSeek-V3.2": 65536 override at model_metadata.py:185 is effectively dead: its key contains uppercase letters, but the lookup at line 1290 checks if default_model in model_lower against a lowercased input, so the key can never match. This is not addressed here to keep the diff scoped; happy to follow up separately if useful.
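A small reproduction of why a mixed-case key cannot match, assuming the lookup has the shape quoted above. The function names and the one-entry table are hypothetical, and the lowercase-on-both-sides variant is only one possible follow-up, not a change made in this PR.

```python
# Hypothetical one-entry table mirroring the mixed-case override.
overrides = {"deepseek-ai/DeepSeek-V3.2": 65536}


def match_as_written(model: str):
    """Compare the raw key against the lowercased model id."""
    model_lower = model.lower()
    for default_model, length in overrides.items():
        if default_model in model_lower:  # uppercase key vs lowercase input
            return length
    return None


def match_case_insensitive(model: str):
    """Possible fix: lowercase the key as well before comparing."""
    model_lower = model.lower()
    for default_model, length in overrides.items():
        if default_model.lower() in model_lower:
            return length
    return None


print(match_as_written("deepseek-ai/DeepSeek-V3.2"))        # None
print(match_case_insensitive("deepseek-ai/DeepSeek-V3.2"))  # 65536
```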

Test plan

  • pytest tests/agent/test_model_metadata.py::TestDefaultContextLengths — 9 passed
  • New test test_deepseek_v4_models_1m_context covers bare ids, vendor-prefixed forms, and legacy aliases
  • End-to-end resolution (fallback table isolated via mocks): get_model_context_length() returns 1,000,000 for all of deepseek-v4-pro, deepseek-v4-flash, deepseek/deepseek-v4-pro, deepseek/deepseek-v4-flash, deepseek-chat, deepseek-reasoner; the deepseek substring fallback stays at 128K (no regression for unknown deepseek-* ids on custom endpoints)

NousResearch#14934 added deepseek-v4-pro / deepseek-v4-flash to the DeepSeek native
provider but the context-window lookup still falls back to the existing
"deepseek" substring entry (128K). DeepSeek V4 ships with a 1M context
window, so any caller relying on get_model_context_length() for
pre-flight token budgeting (compression, context warnings) under-counts
by ~8x.

Add explicit lowercase entries for the four DeepSeek model ids that
ship 1M context:

- deepseek-v4-pro
- deepseek-v4-flash
- deepseek-chat (legacy alias, server-side maps to v4-flash non-thinking)
- deepseek-reasoner (legacy alias, server-side maps to v4-flash thinking)

Longest-key-first substring matching means these explicit entries also
cover the vendor-prefixed forms (deepseek/deepseek-v4-pro on OpenRouter
and Nous Portal) without regressing the existing 128K fallback for
older / unknown DeepSeek model ids on custom endpoints.

Source: https://api-docs.deepseek.com/zh-cn/quick_start/pricing

Labels

  • comp/agent: Core agent loop, run_agent.py, prompt builder
  • P2: Medium (degraded but workaround exists)
  • provider/deepseek: DeepSeek API
  • type/bug: Something isn't working


3 participants