Add Qwen3.5 Think/NoThink training chat templates with generation markers by aazizyan · Pull Request #5824 · huggingface/trl

aazizyan · 2026-05-23T14:47:36Z

What does this PR do?

Adds training chat templates for Qwen3.5 (Think and NoThink), wiring both into get_training_chat_template.

The two new templates mirror the three modifications already shipped in qwen3_6_training.jinja: require both <think> and </think> before parsing (instead of only </think>), drop the loop.index0 > ns.last_query_index conditional so the thinking block is always emitted (prefix-preservation), and wrap assistant output with {% generation %} / {% endgeneration %} markers. Think and NoThink variants differ only in the default value of the enable_thinking flag inherited from their respective base templates.

Note: source templates renamed

The source templates are renamed from size-based naming to behavior-based naming, matching the -Think / -NoThink fixture suffixes introduced in #5819 and making the size-default mapping explicit in the README. No behavior change.

qwen3_5_4b_and_above.jinja -> qwen3_5_think.jinja
qwen3_5_2b_and_below.jinja -> qwen3_5_nothink.jinja

Changes

trl/chat_templates/ — two new training templates, each a 3-line diff against its renamed source:

qwen3_5_think_training.jinja
qwen3_5_nothink_training.jinja

trl/chat_template_utils.py:

Module-level template variables: qwen3_5_chat_template_{2b_and_below,4b_and_above} -> qwen3_5_{nothink,think}_chat_template; same renames carried through the add_response_schema elif (response schema is unchanged — both variants still map to qwen3_5_schema).
Two new branches in get_training_chat_template() mapping each source template to its training variant.

tests/test_chat_template_utils.py — two new entries in TestGetTrainingChatTemplate's parametrize block (qwen35-nothink + qwen35-think), each carrying the same transformers>=5.0.0 skipif mark.

Docs — Qwen3.5 added to the supported-families list, combined training-template section added in:

docs/source/chat_templates.md
trl/chat_templates/README.md

Part of #5471

cc: @qgallouedec

Note

Low Risk
Changes are limited to chat-template assets, template selection in chat_template_utils, and tests/docs; no training-loop or model-weight logic.

Overview
Adds Qwen3.5 Think/NoThink training chat templates and wires them into automatic template swapping for SFT (assistant_only_loss) and GRPO (tools).

Reference templates are renamed from size-based to behavior-based names (qwen3_5_think.jinja / qwen3_5_nothink.jinja); chat_template_utils loads and matches those names for add_response_schema and get_training_chat_template (unchanged qwen3_5_schema).

New training patches (qwen3_5_think_training.jinja, qwen3_5_nothink_training.jinja) mirror Qwen3.6: require both redacted_thinking open/close tags before splitting content, always emit the thinking block (prefix-preserving when a tool message follows), and wrap assistant turns in generation markers. Think vs NoThink only differs in the default enable_thinking on the generation prompt.

Tests/docs: TestGetTrainingChatTemplate gains tiny Qwen3.5 Think/NoThink fixtures (transformers ≥ 5.0); supported-family lists updated in docs/source/chat_templates.md and trl/chat_templates/README.md.

^{Reviewed by Cursor Bugbot for commit c50cda2. Bugbot is set up for automated code reviews on this repo. Configure here.}

qgallouedec · 2026-05-23T17:44:05Z

Nice, thanks, let's see if the CI is green

HuggingFaceDocBuilderDev · 2026-05-23T17:46:07Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2026-05-23T18:29:07Z

@codex review

chatgpt-codex-connector · 2026-05-23T18:33:08Z

Codex Review: Didn't find any major issues. More of your lovely PRs please.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…kers

qgallouedec approved these changes May 23, 2026

View reviewed changes

Add Qwen3.5 Think/NoThink training chat templates with generation mar…

c50cda2

…kers

aazizyan force-pushed the qwen3.5-generation-markers branch from 8977b30 to c50cda2 Compare May 25, 2026 14:10

qgallouedec mentioned this pull request May 25, 2026

Tracking: Add {% generation %} chat templates for common model families #5471

Open

24 tasks

qgallouedec merged commit fb9cb79 into huggingface:main May 25, 2026
12 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Qwen3.5 Think/NoThink training chat templates with generation markers#5824

Add Qwen3.5 Think/NoThink training chat templates with generation markers#5824
qgallouedec merged 1 commit into
huggingface:mainfrom
aazizyan:qwen3.5-generation-markers

aazizyan commented May 23, 2026 •

edited by cursor Bot

Loading

Uh oh!

qgallouedec commented May 23, 2026

Uh oh!

HuggingFaceDocBuilderDev commented May 23, 2026

Uh oh!

qgallouedec commented May 23, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

aazizyan commented May 23, 2026 • edited by cursor Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Note: source templates renamed

Changes

Uh oh!

qgallouedec commented May 23, 2026

Uh oh!

HuggingFaceDocBuilderDev commented May 23, 2026

Uh oh!

qgallouedec commented May 23, 2026

Uh oh!

chatgpt-codex-connector Bot commented May 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

aazizyan commented May 23, 2026 •

edited by cursor Bot

Loading