fix(deepseek): bump V4 family context window to 1M tokens by zkl2333 · Pull Request #14952 · NousResearch/hermes-agent

zkl2333 · 2026-04-24T06:49:36Z

Summary

Follow-up to #14934, which added deepseek-v4-pro / deepseek-v4-flash to the DeepSeek native provider's model list. The context-window lookup in agent/model_metadata.py still falls back to the existing "deepseek" substring entry (128K) for these ids, but DeepSeek V4 ships with a 1M context window. Callers that rely on get_model_context_length() for pre-flight token budgeting (compression triggers, context warnings) therefore under-count the available context by ~8x.

Adds explicit lowercase entries for the four DeepSeek model ids that share the 1M window:

deepseek-v4-pro — 1M
deepseek-v4-flash — 1M
deepseek-chat — 1M (legacy alias, server-side maps to v4-flash non-thinking)
deepseek-reasoner — 1M (legacy alias, server-side maps to v4-flash thinking)

Longest-key-first substring matching means these explicit entries also resolve the vendor-prefixed forms (deepseek/deepseek-v4-pro on OpenRouter / Nous Portal) without regressing the existing 128K fallback for older / unknown DeepSeek model ids on custom endpoints.
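The matching behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in agent/model_metadata.py: the table name FALLBACK_CONTEXT_LENGTHS, the helper lookup_context_length, and the exact integer used for "128K" are assumptions made for the example.

```python
# Hypothetical fallback table. The real table lives in agent/model_metadata.py;
# its name, full contents, and the exact value behind "128K" are assumed here.
FALLBACK_CONTEXT_LENGTHS = {
    "deepseek-v4-pro": 1_000_000,
    "deepseek-v4-flash": 1_000_000,
    "deepseek-chat": 1_000_000,      # legacy alias
    "deepseek-reasoner": 1_000_000,  # legacy alias
    "deepseek": 128_000,             # generic fallback for older / unknown ids
}


def lookup_context_length(model: str, default: int = 0) -> int:
    """Return the context length of the longest table key contained in `model`."""
    model_lower = model.lower()
    # Try longer (more specific) keys first so "deepseek-v4-pro" wins over
    # the generic "deepseek" entry.
    for key in sorted(FALLBACK_CONTEXT_LENGTHS, key=len, reverse=True):
        if key in model_lower:
            return FALLBACK_CONTEXT_LENGTHS[key]
    return default


if __name__ == "__main__":
    for model_id in (
        "deepseek-v4-pro",
        "deepseek/deepseek-v4-flash",  # vendor-prefixed form
        "deepseek-chat",
        "deepseek-some-older-model",   # only the generic key matches
    ):
        print(model_id, lookup_context_length(model_id))
```

Because the vendor-prefixed id contains the bare id as a substring, no separate entry is needed for it, and any id that only contains "deepseek" still resolves to the generic fallback.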

Source: https://api-docs.deepseek.com/quick_start/pricing

Pairs with #14946 (the normalization-side fix). The two PRs are independent — either one can land first.

Note for reviewers (out of scope, just flagging)

The pre-existing "deepseek-ai/DeepSeek-V3.2": 65536 override at model_metadata.py:185 is effectively dead: its key contains uppercase letters, but the lookup at line 1290 checks "if default_model in model_lower", where model_lower is the lowercased input, so the key can never match. Not addressed in this PR to keep the diff scoped; happy to follow up separately if useful.
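A small sketch of why such a key is unreachable, assuming the lookup lowercases only the input and not the table keys. The names TABLE, lookup, and unreachable_keys are hypothetical, and the 128K value is an assumed integer; only the mixed-case key and its 65536 value come from the note above.

```python
# Hypothetical two-entry table mirroring the situation described in the note.
TABLE = {
    "deepseek-ai/DeepSeek-V3.2": 65536,  # mixed-case key
    "deepseek": 128_000,                 # generic fallback (exact value assumed)
}


def lookup(model: str, table: dict) -> "int | None":
    """Longest-key-first substring match against the lowercased model id."""
    model_lower = model.lower()
    for key in sorted(table, key=len, reverse=True):
        if key in model_lower:  # the key itself is NOT lowercased
            return table[key]
    return None


def unreachable_keys(table: dict) -> list:
    """Keys with uppercase characters cannot occur in a lowercased string."""
    return [key for key in table if key != key.lower()]


if __name__ == "__main__":
    # The override is skipped and the generic entry answers instead.
    print(lookup("deepseek-ai/DeepSeek-V3.2", TABLE))
    print(unreachable_keys(TABLE))
```

A check like unreachable_keys could serve as a cheap guard in a unit test if the follow-up is ever picked up.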

Test plan

pytest tests/agent/test_model_metadata.py::TestDefaultContextLengths — 9 passed
New test test_deepseek_v4_models_1m_context covers bare ids, vendor-prefixed forms, and legacy aliases
End-to-end resolution (fallback table isolated via mocks): get_model_context_length() returns 1,000,000 for all of deepseek-v4-pro, deepseek-v4-flash, deepseek/deepseek-v4-pro, deepseek/deepseek-v4-flash, deepseek-chat, deepseek-reasoner; the deepseek substring fallback stays at 128K (no regression for unknown deepseek-* ids on custom endpoints)
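One way to isolate a fallback table with mocks, shown as a hedged sketch. The PR does not show how the real test does this; the RemoteMetadata class, resolve_context_length, and the table below are all hypothetical stand-ins. The idea is that any higher-priority source is patched to return nothing, so the assertions exercise only the table.

```python
from unittest.mock import patch


class RemoteMetadata:
    """Hypothetical higher-priority source (e.g. a provider metadata API)."""

    @staticmethod
    def lookup(model: str):
        raise RuntimeError("network access is not available in tests")


# Hypothetical fallback table; the exact value behind "128K" is assumed.
FALLBACK = {
    "deepseek-v4-pro": 1_000_000,
    "deepseek-v4-flash": 1_000_000,
    "deepseek-chat": 1_000_000,
    "deepseek-reasoner": 1_000_000,
    "deepseek": 128_000,
}


def resolve_context_length(model: str):
    """Prefer the remote source, then fall back to the static table."""
    remote = RemoteMetadata.lookup(model)
    if remote is not None:
        return remote
    model_lower = model.lower()
    for key in sorted(FALLBACK, key=len, reverse=True):
        if key in model_lower:
            return FALLBACK[key]
    return None


def run_checks() -> bool:
    # Patch the remote source so only the fallback table can answer.
    with patch.object(RemoteMetadata, "lookup", return_value=None):
        for model_id in (
            "deepseek-v4-pro",
            "deepseek-v4-flash",
            "deepseek/deepseek-v4-pro",
            "deepseek/deepseek-v4-flash",
            "deepseek-chat",
            "deepseek-reasoner",
        ):
            assert resolve_context_length(model_id) == 1_000_000, model_id
        # Unknown deepseek-* ids keep the generic fallback.
        assert resolve_context_length("deepseek-custom-endpoint") == 128_000
    return True


if __name__ == "__main__":
    print(run_checks())
```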

NousResearch#14934 added deepseek-v4-pro / deepseek-v4-flash to the DeepSeek native provider but the context-window lookup still falls back to the existing "deepseek" substring entry (128K). DeepSeek V4 ships with a 1M context window, so any caller relying on get_model_context_length() for pre-flight token budgeting (compression, context warnings) under-counts by ~8x.

Add explicit lowercase entries for the four DeepSeek model ids that ship 1M context:

- deepseek-v4-pro
- deepseek-v4-flash
- deepseek-chat (legacy alias, server-side maps to v4-flash non-thinking)
- deepseek-reasoner (legacy alias, server-side maps to v4-flash thinking)

Longest-key-first substring matching means these explicit entries also cover the vendor-prefixed forms (deepseek/deepseek-v4-pro on OpenRouter and Nous Portal) without regressing the existing 128K fallback for older / unknown DeepSeek model ids on custom endpoints.

Source: https://api-docs.deepseek.com/zh-cn/quick_start/pricing

alt-glitch added the type/bug, P2, comp/agent, and provider/deepseek labels Apr 24, 2026

zkl2333 mentioned this pull request Apr 24, 2026
feat(deepseek): plumb reasoning_effort and thinking toggle to V4 API #14958 (Closed, 2 tasks)

Tranquil-Flow mentioned this pull request Apr 25, 2026
fix(agent): comprehensive DeepSeek V4 support — context windows, thinking mode, reasoning replay #15446 (Closed, 6 tasks)

Alirezajalilii mentioned this pull request Apr 26, 2026
Auxiliary compression model auto-detected as DeepSeek V4 Flash gets wrong 128K context instead of 1M (affects compression feasibility check) #15983 (Closed)

teknium1 merged commit 2ccdadc into NousResearch:main Apr 26, 2026

teknium1 merged 1 commit into NousResearch:main from zkl2333:fix/deepseek-v4-context-window
