fix(deepseek): preserve v4 model IDs #14947

Closed
anthonylei wants to merge 1 commit into NousResearch:main from anthonylei:fix/deepseek-v4-normalization

anthonylei commented Apr 24, 2026

What does this PR do?

Preserves explicit DeepSeek V4 public API model IDs during native DeepSeek model normalization.

Before this change, configuring deepseek-v4-pro or deepseek-v4-flash for the native deepseek provider silently normalized the model to deepseek-chat. That made Hermes appear to work while calling a different model than the user explicitly requested.

This keeps the change narrow: only the DeepSeek accepted-model allowlist is expanded, while legacy aliases and reasoner-style alias normalization keep their existing behavior.
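The behavior described above can be sketched as follows. This is a minimal illustration, not the code in hermes_cli/model_normalize.py: the names `ACCEPTED_DEEPSEEK_MODELS`, `REASONER_HINTS`, and `normalize_deepseek_model` are hypothetical, and the fallback to deepseek-chat is assumed from the description of the old behavior.

```python
# Hypothetical sketch; identifiers are illustrative, not taken from hermes_cli.
ACCEPTED_DEEPSEEK_MODELS = frozenset({
    "deepseek-chat",
    "deepseek-reasoner",
    "deepseek-v4-pro",    # newly preserved by this PR
    "deepseek-v4-flash",  # newly preserved by this PR
})

# Substrings assumed to mark a reasoner-style alias (deepseek-r1, deepseek-think, ...).
REASONER_HINTS = ("r1", "reasoner", "reasoning", "think")


def normalize_deepseek_model(model: str) -> str:
    """Map a user-supplied model name to an ID the native DeepSeek API accepts."""
    name = model.strip().lower()
    # Drop an optional "deepseek/" vendor prefix.
    if name.startswith("deepseek/"):
        name = name[len("deepseek/"):]
    # Explicitly accepted IDs pass through unchanged.
    if name in ACCEPTED_DEEPSEEK_MODELS:
        return name
    # Reasoner-style aliases keep mapping to deepseek-reasoner.
    if any(hint in name for hint in REASONER_HINTS):
        return "deepseek-reasoner"
    # Anything else falls back to deepseek-chat (assumed legacy behavior).
    return "deepseek-chat"
```

Checking the allowlist before the alias heuristics is what keeps the change narrow: an explicit V4 ID is returned as-is, and every other input follows the same path it did before.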

Related Issue

No issue filed.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • hermes_cli/model_normalize.py
    • Preserve deepseek-v4-flash and deepseek-v4-pro for the native deepseek provider.
    • Keep deepseek-chat and deepseek-reasoner aliases working.
    • Keep reasoner-style aliases such as deepseek-r1, deepseek-reasoning, and deepseek-think mapping to deepseek-reasoner.
    • Update the DeepSeek normalization comments/docstring to reflect the V4 public model IDs.
  • tests/hermes_cli/test_model_normalize.py
    • Add regression tests for bare and deepseek/-prefixed V4 model IDs.
    • Add regression coverage that reasoner-style aliases still map to deepseek-reasoner.

How to Test

  1. Run the focused regression tests:
    scripts/run_tests.sh tests/hermes_cli/test_model_normalize.py::TestDeepSeekV4ModelPreservation -v --tb=short
  2. Run the full model-normalization test file:
    scripts/run_tests.sh tests/hermes_cli/test_model_normalize.py -v --tb=short
  3. Confirm expected behavior, for example:
    normalize_model_for_provider("deepseek-v4-pro", "deepseek") == "deepseek-v4-pro"
    normalize_model_for_provider("deepseek/deepseek-v4-flash", "deepseek") == "deepseek-v4-flash"
    normalize_model_for_provider("deepseek-r1", "deepseek") == "deepseek-reasoner"
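The three expectations above can also be run as one table-driven check. The helper below is a hypothetical sketch that takes the normalizer as a parameter so it stays self-contained; in the repository it would presumably be called with `normalize_model_for_provider` itself.

```python
from typing import Callable, List, Tuple

# (model, provider, expected) triples taken from the examples above.
CASES: List[Tuple[str, str, str]] = [
    ("deepseek-v4-pro", "deepseek", "deepseek-v4-pro"),
    ("deepseek/deepseek-v4-flash", "deepseek", "deepseek-v4-flash"),
    ("deepseek-r1", "deepseek", "deepseek-reasoner"),
]


def find_mismatches(
    normalize: Callable[[str, str], str],
) -> List[Tuple[str, str, str]]:
    """Return (model, expected, actual) for every case the normalizer gets wrong."""
    failures = []
    for model, provider, expected in CASES:
        actual = normalize(model, provider)
        if actual != expected:
            failures.append((model, expected, actual))
    return failures
```

An empty list means all three expectations hold. A normalizer with the pre-fix behavior would report the two V4 IDs as folded to deepseek-chat.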

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS / Darwin

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

Screenshots / Logs

Focused regression test:

scripts/run_tests.sh tests/hermes_cli/test_model_normalize.py::TestDeepSeekV4ModelPreservation -v --tb=short
10 passed in 0.37s

Full model-normalization test file:

scripts/run_tests.sh tests/hermes_cli/test_model_normalize.py -v --tb=short
65 passed in 0.41s

Independent local review was also requested from Claude Code Opus 4.7, Kimi 2.6 via Kimi Code CLI, and DeepSeek V4 Pro. All approved with no blocking issues.

alt-glitch added labels on Apr 24, 2026:

  • type/bug (Something isn't working)
  • P2 (Medium; degraded but workaround exists)
  • comp/cli (CLI entry point, hermes_cli/, setup wizard)
  • provider/deepseek (DeepSeek API)
  • duplicate (This issue or pull request already exists)
alt-glitch (Collaborator) commented

Duplicate of #14946 — identical fix (add V4 ids to _DEEPSEEK_CANONICAL_MODELS), same files changed, same root cause.

teknium1 (Contributor) commented

Thanks for the contribution, @anthonylei! This is a valid bug fix and the diagnosis is correct.

Unfortunately this is a duplicate — the identical change (adding deepseek-v4-pro/deepseek-v4-flash to _DEEPSEEK_CANONICAL_MODELS plus a V-series regex passthrough) was already merged to main as commit 4ac731c84 from PR #15119, as noted by @alt-glitch at triage.

  • hermes_cli/model_normalize.py lines 131–135: _DEEPSEEK_CANONICAL_MODELS already includes deepseek-v4-pro and deepseek-v4-flash
  • hermes_cli/model_normalize.py line 144: _DEEPSEEK_V_SERIES_RE regex covers future V-series and dated variants
  • Merged via commit 4ac731c84 (fix(model-normalize): pass DeepSeek V-series IDs through instead of folding to deepseek-chat)
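For readers unfamiliar with the passthrough idea, here is a rough sketch of how a V-series regex could work. The actual pattern behind _DEEPSEEK_V_SERIES_RE is not shown in this thread, so `V_SERIES_RE` and `is_v_series` below are hypothetical stand-ins, not the merged code.

```python
import re

# Hypothetical pattern: "deepseek-v<major>[.<minor>]" followed by any number of
# "-suffix" parts, such as a tier name or a date stamp.
V_SERIES_RE = re.compile(r"deepseek-v\d+(?:\.\d+)?(?:-[a-z0-9]+)*")


def is_v_series(model: str) -> bool:
    """Return True if the name looks like a DeepSeek V-series ID to pass through."""
    return V_SERIES_RE.fullmatch(model.strip().lower()) is not None
```

Under this assumed pattern, IDs such as deepseek-v4-pro or a dated variant would pass through unchanged, while deepseek-chat and deepseek-r1 would still go through the existing alias handling.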

Closing as implemented on main. This is an automated hermes-sweeper review.

teknium1 closed this Apr 27, 2026

3 participants