Question about apply_chat_template in examples #1752

While looking through the examples, I noticed that the DPO example script calls `apply_chat_template` on `chosen` and `rejected` but not on `prompt`.
https://github.com/huggingface/trl/blob/d1ed730ab8281b1b0c78d7d61bc0f6603a9ce958/examples/scripts/dpo.py#L150-L152
It also seems that `chosen` is a complete conversation rather than just the final reply:
```
[
  { "content": "Hi, I want to learn to play horseshoes. Can you teach me?", "role": "user" },
  { "content": "I can, but maybe I should begin by telling you that a typical game consists of 2 players and 6 or 8 horseshoes.", "role": "assistant" },
  { "content": "Okay. What else is needed to play, and what are the rules?", "role": "user" },
  { "content": "A horseshoe is usually made out of metal and is about 3 to 3.5 inches long and around 1 inch thick. The horseshoe should also have a 2 inch by 3 inch flat at the bottom where the rubber meets the metal. We also need two stakes and six horseshoes.", "role": "assistant" }
]
```
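I think that applying the chat template to the input prompt and keeping only the final `assistant` output as `chosen`/`rejected` would be consistent with the inference phase, where the model only ever sees the templated prompt and generates the assistant reply from there.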
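To make the suggestion concrete, here is a minimal sketch of the preprocessing I have in mind. It is not what the script does today. The helper name `to_prompt_completion` is hypothetical, and it assumes that `chosen` and `rejected` share every turn except the last assistant message, so the prompt can be rebuilt from `chosen[:-1]`. `StubTokenizer` is only a stand-in so the snippet runs without downloading a model; a real tokenizer's `apply_chat_template` would be used instead, and its output format will differ.

```python
def to_prompt_completion(row, tokenizer):
    """Hypothetical helper: template the shared context, keep only the final replies."""
    chosen, rejected = row["chosen"], row["rejected"]
    # Assumption: both conversations are identical up to the final assistant turn.
    if chosen[:-1] != rejected[:-1]:
        raise ValueError("chosen and rejected do not share the same context")
    # add_generation_prompt=True ends the prompt where the assistant reply starts,
    # which mirrors what the model is given at inference time.
    prompt = tokenizer.apply_chat_template(
        chosen[:-1], tokenize=False, add_generation_prompt=True
    )
    return {
        "prompt": prompt,
        "chosen": chosen[-1]["content"],
        "rejected": rejected[-1]["content"],
    }


class StubTokenizer:
    """Stand-in for a real tokenizer; the template format here is made up."""

    def apply_chat_template(self, messages, tokenize=False, add_generation_prompt=False):
        text = "".join(f"<|{m['role']}|>{m['content']}\n" for m in messages)
        return text + ("<|assistant|>" if add_generation_prompt else "")


row = {
    "chosen": [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello!"},
    ],
    "rejected": [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Go away."},
    ],
}
example = to_prompt_completion(row, StubTokenizer())
print(example)
```

Note that this sketch leaves out the end-of-turn token that a chat template would normally append after the assistant message, so that would still need to be handled somewhere.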

For reference, the linked lines in `dpo.py`:

    def process(row):
        row["chosen"] = tokenizer.apply_chat_template(row["chosen"], tokenize=False)
        row["rejected"] = tokenizer.apply_chat_template(row["rejected"], tokenize=False)
