Context
SFT with assistant_only_loss=True requires the chat template to include {% generation %} / {% endgeneration %} markers so that return_assistant_tokens_mask=True can produce correct masks. Very few models ship these markers natively.
TRL should provide training chat templates with these markers for known model families (via get_training_chat_template()), and the SFT trainer auto-applies them when assistant_only_loss=True.
This issue tracks adding training templates for all model families with chat template support.
What needs to happen for each model
- Create
<model>_training.jinja: the original template with {% generation %} / {% endgeneration %} added around assistant output (and prefix-preservation fixes if needed)
- Load it in
chat_template_utils.py and add a branch in get_training_chat_template()
- Add a test verifying text output is identical to the original and masks are correct
Model families
Causal LMs with chat template
VLMs
No chat template (no action needed)
- Bloom, GPT2, GPTNeoX, OPT, T5
Notes
- VLMs currently don't support
assistant_only_loss in SFT (blocked by a separate check). These should still be tracked so templates are ready when support lands.
Context
SFT with
assistant_only_loss=Truerequires the chat template to include{% generation %}/{% endgeneration %}markers so thatreturn_assistant_tokens_mask=Truecan produce correct masks. Very few models ship these markers natively.TRL should provide training chat templates with these markers for known model families (via
get_training_chat_template()), and the SFT trainer auto-applies them whenassistant_only_loss=True.This issue tracks adding training templates for all model families with chat template support.
What needs to happen for each model
<model>_training.jinja: the original template with{% generation %}/{% endgeneration %}added around assistant output (and prefix-preservation fixes if needed)chat_template_utils.pyand add a branch inget_training_chat_template()Model families
Causal LMs with chat template
{% generation %}support to training chat templates #5470{% generation %}markers for Cohere2 chat template #5675{% generation %}markers for training chat template #5519{% generation %}markers for training chat template #5484VLMs
No chat template (no action needed)
Notes
assistant_only_lossin SFT (blocked by a separate check). These should still be tracked so templates are ready when support lands.