Context
For tool calling to work end-to-end during training, a chat template needs to satisfy several requirements:
- Supports tools: the template can render a full tool-calling conversation (
user → assistant with tool_calls → tool → ...) without error, and the tool content actually appears in the rendered output.
- Prefix-preserving: appending messages doesn't change the rendering of earlier messages. This is required by
_get_tool_suffix_ids, which extracts tool response formatting tokens by comparing tokenizations with and without tool messages appended.
- Response schema: a schema that allows
parse_response to extract tool calls from the raw assistant text. No tokenizer ships with one built-in yet; TRL provides them via add_response_schema for supported templates.
Current status
| Model |
Chat template |
Supports tools |
Prefix preserving |
Response schema |
Next step |
tiny-BloomForCausalLM |
No |
- |
- |
- |
✅ nothing to do |
tiny-Cohere2ForCausalLM |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-CohereForCausalLM |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-DeepseekV3ForCausalLM |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-DeepseekV3ForCausalLM-0528 |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-FalconMambaForCausalLM |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-Gemma2ForCausalLM |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-Gemma3ForConditionalGeneration |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-GemmaForCausalLM |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-Glm4MoeForCausalLM |
Yes |
Yes |
Yes |
#5463 |
✅ nothing to do |
tiny-GPT2LMHeadModel |
No |
- |
- |
- |
✅ nothing to do |
tiny-GPTNeoXForCausalLM |
No |
- |
- |
- |
✅ nothing to do |
tiny-GptOssForCausalLM |
Yes |
Yes |
Yes |
#5464 |
✅ nothing to do |
tiny-Idefics2ForConditionalGeneration |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-Idefics3ForConditionalGeneration |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-LlamaForCausalLM-3 |
Yes |
No |
Yes |
- |
✅ nothing to do |
tiny-LlamaForCausalLM-3.1 |
Yes |
Yes |
Yes |
#5518 |
🚧 add response schema, test, and documentation |
tiny-LlamaForCausalLM-3.2 |
Yes |
Yes |
Yes |
#5518 |
🚧 add response schema, test, and documentation |
tiny-LlavaForConditionalGeneration |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-LlavaNextForConditionalGeneration |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-OPTForCausalLM |
No |
- |
- |
- |
✅ nothing to do |
tiny-Phi3ForCausalLM |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-Qwen2ForCausalLM-2.5 |
Yes |
Yes |
Yes |
No |
🚧 add response schema, test, and documentation |
tiny-Qwen2VLForConditionalGeneration |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-Qwen2_5_VLForConditionalGeneration |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-Qwen3ForCausalLM |
Yes |
Yes |
No |
Yes (TRL) |
✅ nothing to do |
tiny-Qwen3MoeForCausalLM |
Yes |
Yes |
No |
Yes (TRL) |
✅ nothing to do |
tiny-Qwen3VLForConditionalGeneration |
Yes |
Yes |
Yes |
#5469 |
✅ nothing to do |
tiny-Qwen3_5ForConditionalGeneration |
Yes |
Yes |
Yes |
Yes (TRL) |
✅ nothing to do |
tiny-SmolVLMForConditionalGeneration |
Yes |
No |
- |
- |
✅ nothing to do |
tiny-T5ForConditionalGeneration |
No |
- |
- |
- |
✅ nothing to do |
Notes
- DeepseekV3: actually supports tools, but its template concatenates
arguments directly as a string (+ tool['function']['arguments']), so it requires arguments to be a JSON string rather than a dict. This is a template quirk that will need a patch.
- "Supports tools = No" can mean different things:
- Template rejects the role sequence entirely (Cohere, FalconMamba, Gemma family)
- Template indexes into content as a list for all roles including tool (Idefics2, Idefics3, LlavaNext, SmolVLM)
- Template silently ignores tool messages — renders without error but tool content is missing from output (Cohere2, Phi3)
- Qwen3ForCausalLM uses
{% generation %} / {% endgeneration %} tags (newer transformers feature), while Qwen3MoeForCausalLM uses the older template style. They need separate patches.
- Response schemas are not built-in to any tokenizer yet. TRL provides them via
add_response_schema() for Qwen3 (matches Qwen3Moe) and Qwen3.5.
EDIT: Qwen2-VL and 2.5-VL actually don't support tool calling
Context
For tool calling to work end-to-end during training, a chat template needs to satisfy several requirements:
user → assistant with tool_calls → tool → ...) without error, and the tool content actually appears in the rendered output._get_tool_suffix_ids, which extracts tool response formatting tokens by comparing tokenizations with and without tool messages appended.parse_responseto extract tool calls from the raw assistant text. No tokenizer ships with one built-in yet; TRL provides them viaadd_response_schemafor supported templates.Current status
tiny-BloomForCausalLMtiny-Cohere2ForCausalLMtiny-CohereForCausalLMtiny-DeepseekV3ForCausalLMtiny-DeepseekV3ForCausalLM-0528tiny-FalconMambaForCausalLMtiny-Gemma2ForCausalLMtiny-Gemma3ForConditionalGenerationtiny-GemmaForCausalLMtiny-Glm4MoeForCausalLMtiny-GPT2LMHeadModeltiny-GPTNeoXForCausalLMtiny-GptOssForCausalLMtiny-Idefics2ForConditionalGenerationtiny-Idefics3ForConditionalGenerationtiny-LlamaForCausalLM-3tiny-LlamaForCausalLM-3.1tiny-LlamaForCausalLM-3.2tiny-LlavaForConditionalGenerationtiny-LlavaNextForConditionalGenerationtiny-OPTForCausalLMtiny-Phi3ForCausalLMtiny-Qwen2ForCausalLM-2.5tiny-Qwen2VLForConditionalGenerationtiny-Qwen2_5_VLForConditionalGenerationtiny-Qwen3ForCausalLMtiny-Qwen3MoeForCausalLMtiny-Qwen3VLForConditionalGenerationtiny-Qwen3_5ForConditionalGenerationtiny-SmolVLMForConditionalGenerationtiny-T5ForConditionalGenerationNotes
argumentsdirectly as a string (+ tool['function']['arguments']), so it requiresargumentsto be a JSON string rather than a dict. This is a template quirk that will need a patch.{% generation %}/{% endgeneration %}tags (newer transformers feature), while Qwen3MoeForCausalLM uses the older template style. They need separate patches.add_response_schema()for Qwen3 (matches Qwen3Moe) and Qwen3.5.EDIT: Qwen2-VL and 2.5-VL actually don't support tool calling