Skip to content

Add tiny Qwen3-4B-Instruct-2507#5586

Merged
qgallouedec merged 11 commits into
mainfrom
qwen3-4b-instruct-2507
Apr 27, 2026
Merged

Add tiny Qwen3-4B-Instruct-2507#5586
qgallouedec merged 11 commits into
mainfrom
qwen3-4b-instruct-2507

Conversation

@qgallouedec

@qgallouedec qgallouedec commented Apr 17, 2026

Copy link
Copy Markdown
Member

#5470 (comment) flagged that Qwen3-4B-Instruct-2507 uses a different chat template than other Qwen3 model.

This PR adds a tiny version of Qwen3 using this other template. Next PR -> #5574


Note

Low Risk
Low risk: this only adds a new tiny test model variant and wires it into existing parameterized test matrices, without changing runtime library behavior.

Overview
Adds a new tiny model fixture for Qwen/Qwen3-4B-Instruct-2507 (published as trl-internal-testing/tiny-Qwen3ForCausalLM-Instruct-2507) so unit tests can cover Qwen3’s alternate non-thinking chat template.

Updates the relevant parameterized test lists (chat template/tool-calling, data utils chat-template application, and chunked LM head coverage) to include the new tiny model ID.

Reviewed by Cursor Bugbot for commit 7b162f9. Bugbot is set up for automated code reviews on this repo. Configure here.

@HuggingFaceDocBuilderDev

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@albertvillanova albertvillanova left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@qgallouedec qgallouedec merged commit a7648ba into main Apr 27, 2026
10 of 13 checks passed
@qgallouedec qgallouedec deleted the qwen3-4b-instruct-2507 branch April 27, 2026 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants