
feat: add DeepSeek-V3 training chat template with generation markers #5527

Merged
qgallouedec merged 3 commits into huggingface:main from RudrenduPaul:feat/deepseek-v3-training-chat-template
Apr 14, 2026

@RudrenduPaul RudrenduPaul commented Apr 12, 2026

Contributor

What does this PR do?

Adds a training-compatible chat template for DeepSeek-V3 models (e.g. deepseek-ai/DeepSeek-V3), following the same pattern introduced for LLaMA 3 in #5493, Qwen2.5 in #5522, and GPT-OSS.

Files added:

  • trl/chat_templates/deepseekv3.jinja — exact copy of the official DeepSeek-V3 tokenizer template (sourced from trl-internal-testing/tiny-DeepseekV3ForCausalLM)
  • trl/chat_templates/deepseekv3_training.jinja — training variant with {% generation %} / {% endgeneration %} markers wrapping all assistant output (plain content, tool-call, and post-tool-output branches) for assistant-only loss masking in SFT

Changes to existing files:

  • trl/chat_template_utils.py: loads both templates, registers DeepSeek-V3 in get_training_chat_template(), updates docstring
  • tests/test_chat_template_utils.py: adds trl-internal-testing/tiny-DeepseekV3ForCausalLM to TestGetTrainingChatTemplate

The training template handles all three assistant message cases: (1) plain text response, (2) tool calls with optional content, (3) response after tool outputs — all wrapped in the generation markers.
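To illustrate what the generation markers enable downstream, here is a minimal sketch of assistant-only loss masking. It assumes a per-token 0/1 mask marking assistant tokens is already available (the kind of mask that `{% generation %}` markers make it possible to derive). `mask_non_assistant` and `IGNORE_INDEX` are hypothetical names used for illustration; they are not part of this PR or of TRL's actual implementation.

```python
# Illustrative sketch only: not code from this PR.
IGNORE_INDEX = -100  # label value conventionally ignored by PyTorch cross-entropy


def mask_non_assistant(input_ids, assistant_mask):
    """Return labels that keep assistant tokens and ignore everything else.

    input_ids:      list of token ids for the full rendered conversation
    assistant_mask: list of 0/1 flags, 1 where the token lies inside a
                    generation-marked (assistant) span
    """
    if len(input_ids) != len(assistant_mask):
        raise ValueError("input_ids and assistant_mask must have the same length")
    return [
        token if is_assistant else IGNORE_INDEX
        for token, is_assistant in zip(input_ids, assistant_mask)
    ]
```

With labels built this way, only tokens inside the marked assistant spans contribute to the SFT loss; system, user, and tool tokens are skipped.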

Closes part of #5471 (tracking issue for training chat templates).

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline?
  • Did you update the documentation with your changes?
  • Did you write any new necessary tests?

AI writing disclosure

I used AI assistance (Claude Code) to draft and implement this change.

Who can review?

@qgallouedec


Note

Medium Risk
Medium risk because it changes template selection and rendering behavior for tool-calling/prefix checks; mistakes could subtly alter prompts or masking during training for DeepSeek-V3 users.

Overview
Adds DeepSeek-V3 support to get_training_chat_template() by introducing deepseekv3.jinja (identity match) and a new deepseekv3_training.jinja variant that wraps assistant output in {% generation %} markers and JSON-serializes tool-call arguments.

Hardens prefix-preservation checks/tests around tool calling by adding a TypeError fallback when templates reject dict arguments, and expands the test matrix to include tiny-DeepseekV3ForCausalLM with a baseline workaround for the unpatched upstream template behavior.
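The fallback described above can be sketched roughly as follows. This is an assumption-laden illustration, not the code from this PR: `render_with_fallback`, `stringify_tool_arguments`, and `strict_render` are hypothetical names, and `strict_render` is a toy stand-in for a template that only accepts string arguments.

```python
# Illustrative sketch only: hypothetical helpers, not the PR's implementation.
import copy
import json


def stringify_tool_arguments(messages):
    """Return a deep copy of messages with dict tool-call arguments JSON-encoded."""
    converted = copy.deepcopy(messages)
    for message in converted:
        for call in message.get("tool_calls") or []:
            function = call.get("function", {})
            if isinstance(function.get("arguments"), dict):
                function["arguments"] = json.dumps(function["arguments"])
    return converted


def render_with_fallback(render, messages):
    """Try rendering as-is; on TypeError, retry with JSON-serialized arguments."""
    try:
        return render(messages)
    except TypeError:
        return render(stringify_tool_arguments(messages))


def strict_render(messages):
    """Toy renderer that, like some templates, concatenates arguments as strings."""
    parts = []
    for message in messages:
        for call in message.get("tool_calls") or []:
            # str + dict raises TypeError, mimicking a template rejecting dicts
            parts.append("args: " + call["function"]["arguments"])
    return "\n".join(parts)
```

The retry works on a copy, so the caller's original messages (with dict arguments) are left untouched.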

Reviewed by Cursor Bugbot for commit da51df2. Bugbot is set up for automated code reviews on this repo.

Add training chat template for DeepSeek-V3 with {% generation %} markers
for SFT assistant-only loss masking. Part of huggingface#5471.

Built by Rudrendu Paul, developed with Claude Code

@qgallouedec qgallouedec left a comment

Member


Thanks!

@HuggingFaceDocBuilderDev


The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec

Member

@codex review

@chatgpt-codex-connector


Codex Review: Didn't find any major issues. Swish!


@qgallouedec qgallouedec merged commit 9157aa7 into huggingface:main Apr 14, 2026
1 check passed

3 participants