Support DeepSeek v3.2 #963

Open
zianglih wants to merge 18 commits into radixark:main from zianglih:test-v32

@zianglih (Contributor) commented Apr 9, 2026

@HumansAnd

  • Add V3.2 model configs and launch script
  • Reuse existing GLM-5 code path
  • Add support for V3.2 non-interleaved indexer format
    • Patch indexer_rope_interleave into args
  • Official V3.2 does not use a chat template, so add a fallback
    • Patch model type into tokenizer
  • Add a --freeze-indexer arg
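
The non-interleaved indexer bullet is about how rotary-embedding (RoPE) dimensions are laid out. As a rough illustration only, assume "interleaved" means each rotation pair sits in adjacent positions, (x0, x1), (x2, x3), and "non-interleaved" means pair i is split across the two halves, (x_i, x_{i + d/2}). Under that assumption the two layouts differ by a fixed permutation. The function names below are hypothetical and this is not the code from the PR:

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def interleaved_to_half_split(values: Sequence[T]) -> List[T]:
    """Reorder [x0, x1, x2, x3, ...] into [x0, x2, ..., x1, x3, ...].

    Assumption: adjacent elements form a rotation pair in the input,
    and the output pairs element i with element i + len/2.
    """
    if len(values) % 2 != 0:
        raise ValueError("rotary dimension must be even")
    # Even positions become the first half, odd positions the second half.
    return list(values[0::2]) + list(values[1::2])


def half_split_to_interleaved(values: Sequence[T]) -> List[T]:
    """Inverse of interleaved_to_half_split."""
    if len(values) % 2 != 0:
        raise ValueError("rotary dimension must be even")
    half = len(values) // 2
    out: List[T] = []
    # Zip the two halves back together so each pair is adjacent again.
    for first, second in zip(values[:half], values[half:]):
        out.extend((first, second))
    return out
```

A conversion script would apply a permutation like this along the rotary dimension of the affected weights, and skip it when the checkpoint already uses the target layout.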

@zianglih zianglih marked this pull request as ready for review April 9, 2026 20:17

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces support for DeepSeek-V3.2 models, including necessary configuration patching, tokenizer handling, and adjustments to RoPE interleave logic in the Megatron-to-HF conversion and model plugins. Review feedback highlights the need to replace hardcoded model dimensions (e.g., 128, 64) with dynamic configuration values and to use field(default_factory=...) for dataclass default values to ensure proper execution.
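
The two review points can be sketched in isolation. The class and field names below are hypothetical, and 128 and 64 are placeholder defaults that only echo the literals mentioned in the review; none of this is taken from the PR's code. The sketch shows `field(default_factory=...)` giving each instance its own default value, and a shape computed from configuration values instead of hardcoded numbers:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class IndexerDims:
    """Hypothetical container for dimensions that should come from config."""

    head_dim: int = 128  # placeholder default, not a confirmed model value
    rope_dim: int = 64   # placeholder default, not a confirmed model value
    # default_factory runs once per instance, so instances never share a list.
    # A bare `= []` default is rejected by dataclasses with a ValueError.
    extra_args: List[str] = field(default_factory=list)


def split_head(dims: IndexerDims) -> Tuple[int, int]:
    """Return (non-rotary width, rotary width), derived from the config."""
    if dims.rope_dim > dims.head_dim:
        raise ValueError("rope_dim cannot exceed head_dim")
    return dims.head_dim - dims.rope_dim, dims.rope_dim
```

Reading the widths from one object means a model with different dimensions only needs a different config, not edits to the conversion logic.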

Comment thread miles/backends/megatron_utils/megatron_to_hf/deepseekv3.py
Comment thread miles/backends/megatron_utils/megatron_to_hf/deepseekv3.py
Comment thread miles/backends/megatron_utils/megatron_to_hf/deepseekv3.py
Comment thread miles/backends/megatron_utils/megatron_to_hf/deepseekv3.py
Comment thread scripts/run_deepseek_v32.py
ziang-and pushed a commit to zianglih/miles that referenced this pull request Apr 9, 2026
ziang-and pushed a commit to zianglih/miles that referenced this pull request Apr 10, 2026
ziang-and pushed a commit to zianglih/miles that referenced this pull request Apr 11, 2026
"You are using too many GPUs for this conversion."
assert args.pipeline_model_parallel_size <= args.num_layers, (
f"Pipeline model parallel size {args.pipeline_model_parallel_size} must be less than or equal to "
f"number of layers {args.num_layers}."
@zianglih (Contributor, Author) replied:

This is to relax the assertion to unblock conversion.
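
For reference, the assertion visible in the excerpt above can be written as a standalone check. This is a minimal sketch: the wrapper function name is hypothetical, the arguments are assumed to arrive as an `argparse.Namespace`, and the stricter condition that was relaxed is not shown in this thread, so it is not reproduced here.

```python
from argparse import Namespace


def check_pipeline_fits(args: Namespace) -> None:
    """Fail early if there are more pipeline stages than layers.

    Hypothetical wrapper around the assertion quoted in the review thread.
    """
    assert args.pipeline_model_parallel_size <= args.num_layers, (
        f"Pipeline model parallel size {args.pipeline_model_parallel_size} must be less than or equal to "
        f"number of layers {args.num_layers}."
    )
```

With this form, a run passes as long as every pipeline stage can hold at least one layer, regardless of how many GPUs are used for other parallel dimensions.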

ziang-and pushed a commit to zianglih/miles that referenced this pull request Apr 19, 2026
@ziang-and ziang-and requested a review from Zhichenzzz as a code owner April 29, 2026 19:00
@Zhichenzzz (Contributor) commented:

Thanks @zianglih! Could you share which Docker image and SGLang commit you used for testing?
