fix(deepseek): bump V4 family context window to 1M tokens #14952
Merged
teknium1 merged 1 commit on Apr 26, 2026
NousResearch#14934 added deepseek-v4-pro / deepseek-v4-flash to the DeepSeek native provider, but the context-window lookup still falls back to the existing "deepseek" substring entry (128K). DeepSeek V4 ships with a 1M context window, so any caller relying on get_model_context_length() for pre-flight token budgeting (compression, context warnings) under-counts by ~8x.

Add explicit lowercase entries for the four DeepSeek model ids that ship 1M context:

- deepseek-v4-pro
- deepseek-v4-flash
- deepseek-chat (legacy alias, server-side maps to v4-flash non-thinking)
- deepseek-reasoner (legacy alias, server-side maps to v4-flash thinking)

Longest-key-first substring matching means these explicit entries also cover the vendor-prefixed forms (deepseek/deepseek-v4-pro on OpenRouter and Nous Portal) without regressing the existing 128K fallback for older / unknown DeepSeek model ids on custom endpoints.

Source: https://api-docs.deepseek.com/zh-cn/quick_start/pricing
Summary
Follow-up to #14934, which added deepseek-v4-pro / deepseek-v4-flash to the DeepSeek native provider's model list. The context-window lookup in agent/model_metadata.py still falls back to the existing "deepseek" substring entry (128K), but DeepSeek V4 ships with a 1M context window, so callers relying on get_model_context_length() for pre-flight token budgeting (compression triggers, context warnings) under-count by ~8x.

Adds explicit lowercase entries for the four DeepSeek model ids that share the 1M window:

- deepseek-v4-pro: 1M
- deepseek-v4-flash: 1M
- deepseek-chat: 1M (legacy alias, server-side maps to v4-flash non-thinking)
- deepseek-reasoner: 1M (legacy alias, server-side maps to v4-flash thinking)

Longest-key-first substring matching means these explicit entries also resolve the vendor-prefixed forms (deepseek/deepseek-v4-pro on OpenRouter / Nous Portal) without regressing the existing 128K fallback for older / unknown DeepSeek model ids on custom endpoints.

Source: https://api-docs.deepseek.com/quick_start/pricing
Pairs with #14946 (the normalization-side fix). The two PRs are independent — either one can land first.
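The longest-key-first matching described in the summary can be sketched roughly as follows. This is a minimal illustration, not the code in agent/model_metadata.py: the table name DEFAULT_CONTEXT_LENGTHS, the helper lookup_context_length(), the exact 128_000 value standing in for "128K", and the 8_192 default for unmatched ids are all assumptions made for the example.

```python
# Hypothetical, trimmed-down table; the real entries live in agent/model_metadata.py.
DEFAULT_CONTEXT_LENGTHS = {
    "deepseek-v4-pro": 1_000_000,
    "deepseek-v4-flash": 1_000_000,
    "deepseek-chat": 1_000_000,      # legacy alias
    "deepseek-reasoner": 1_000_000,  # legacy alias
    "deepseek": 128_000,             # generic fallback ("128K"; exact value assumed)
}

UNKNOWN_MODEL_CONTEXT = 8_192  # assumed default for ids that match nothing


def lookup_context_length(model: str) -> int:
    """Return the context length of the longest table key contained in the model id."""
    model_lower = model.lower()
    # Longer keys are tried first, so "deepseek-v4-pro" wins over plain "deepseek".
    for key in sorted(DEFAULT_CONTEXT_LENGTHS, key=len, reverse=True):
        if key in model_lower:
            return DEFAULT_CONTEXT_LENGTHS[key]
    return UNKNOWN_MODEL_CONTEXT
```

Under these assumptions, a vendor-prefixed id such as deepseek/deepseek-v4-pro still contains the key deepseek-v4-pro as a substring and resolves to 1,000,000, while an id that only contains "deepseek" keeps the 128K fallback.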
Note for reviewers (out of scope, just flagging)
The pre-existing "deepseek-ai/DeepSeek-V3.2": 65536 override at model_metadata.py:185 is effectively dead: its key contains uppercase letters, but the lookup at line 1290 checks if default_model in model_lower against a lowercased input. Not addressed in this PR to keep the diff scoped; happy to follow up separately if useful.

Test plan
- pytest tests/agent/test_model_metadata.py::TestDefaultContextLengths: 9 passed
- test_deepseek_v4_models_1m_context covers bare ids, vendor-prefixed forms, and legacy aliases
- get_model_context_length() returns 1,000,000 for all of deepseek-v4-pro, deepseek-v4-flash, deepseek/deepseek-v4-pro, deepseek/deepseek-v4-flash, deepseek-chat, deepseek-reasoner; the deepseek substring fallback stays at 128K (no regression for unknown deepseek-* ids on custom endpoints)
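To illustrate the reviewer note about the mixed-case "deepseek-ai/DeepSeek-V3.2" key, here is a small sketch of why such a key can never match. The helper key_matches() is hypothetical; it only mirrors the check quoted in the note (default_model in model_lower) and is not taken from the repository.

```python
def key_matches(default_model: str, model: str) -> bool:
    """Mirror the quoted check: the table key is tested against a lowercased model id."""
    model_lower = model.lower()
    return default_model in model_lower


mixed_case_key = "deepseek-ai/DeepSeek-V3.2"

# The key keeps its uppercase letters while the input is lowercased, so even an
# exact id cannot contain it as a substring.
print(key_matches(mixed_case_key, "deepseek-ai/DeepSeek-V3.2"))          # False

# Lowercasing the key (as the new V4 entries do) makes the same id match.
print(key_matches(mixed_case_key.lower(), "deepseek-ai/DeepSeek-V3.2"))  # True
```

This is also why the entries added in this PR are explicitly lowercase.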