[ORPO] fix orpo chosen-nll loss by kashif · Pull Request #2502 · huggingface/trl

kashif · 2024-12-19T10:09:22Z

What does this PR do?

Calculate the ORPO chosen NLL loss with respect to the chosen completion only, rather than the whole prompt+completion.

Also return the shifted logits when the model is decoder-only.
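
To illustrate the two points above, here is a minimal pure-Python sketch, not the TRL implementation. It shows how a decoder-only NLL is typically computed on shifted positions (the logits at position t score the token at position t+1) and how positions labelled with an ignore index drop out of the average. The function name `shifted_nll`, the toy data, and the ignore value of -100 are assumptions made for this example.

```python
import math

IGNORE_INDEX = -100  # assumed ignore value, chosen for this example


def shifted_nll(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean negative log-likelihood over non-ignored, shifted positions.

    logits: list of per-position score lists (one score per vocab entry)
    labels: list of token ids, with ignore_index at positions to skip
    """
    # Decoder-only shift: position t predicts the token at position t + 1.
    shift_logits = logits[:-1]
    shift_labels = labels[1:]

    total, count = 0.0, 0
    for scores, target in zip(shift_logits, shift_labels):
        if target == ignore_index:
            continue  # prompt or padding position: contributes nothing
        # Numerically stable log-sum-exp over the vocabulary.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[target]
        count += 1
    return total / count if count else 0.0


# Toy sequence: 2 prompt tokens (ignored) followed by 2 completion tokens.
vocab_size = 4
toy_logits = [[0.0] * vocab_size for _ in range(4)]  # uniform predictions
toy_labels = [IGNORE_INDEX, IGNORE_INDEX, 2, 3]
toy_loss = shifted_nll(toy_logits, toy_labels)
```

With uniform logits, each counted completion token costs log(vocab_size), so only the masking decides which positions enter the average.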

HuggingFaceDocBuilderDev · 2024-12-19T10:13:19Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

qgallouedec · 2024-12-19T10:32:23Z

-            attention_mask = concatenated_batch["concatenated_attention_mask"]
-            labels = torch.where(attention_mask == 1, labels, self.label_pad_token_id)
-
+        labels = concatenated_batch["concatenated_labels"].clone()


Yes, we checked this together. If you do

labels = concatenated_batch["concatenated_input_ids"].clone()
attention_mask = concatenated_batch["concatenated_attention_mask"]
labels = torch.where(attention_mask == 1, labels, self.label_pad_token_id)

you don't ignore the prompt.
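
A small sketch of why that is, using plain Python lists in place of tensors. The attention mask is 1 for every real token, prompt included, so masking by it only removes padding. The helper names, the toy token ids, the `prompt_len` argument, and the pad value of -100 are all hypothetical. The assumption here is that `concatenated_labels` already carries the pad id at prompt positions, which is what the second helper imitates.

```python
LABEL_PAD_TOKEN_ID = -100  # assumed value for this example


def labels_from_attention_mask(input_ids, attention_mask, pad_id=LABEL_PAD_TOKEN_ID):
    """Removed approach: only padding is masked, prompt tokens remain."""
    return [tok if m == 1 else pad_id for tok, m in zip(input_ids, attention_mask)]


def labels_completion_only(input_ids, attention_mask, prompt_len, pad_id=LABEL_PAD_TOKEN_ID):
    """Hypothetical stand-in for labels with the prompt already masked."""
    return [
        tok if (m == 1 and i >= prompt_len) else pad_id
        for i, (tok, m) in enumerate(zip(input_ids, attention_mask))
    ]


# Toy example: 3 prompt tokens, 2 completion tokens, 1 padding token.
input_ids = [11, 12, 13, 21, 22, 0]
attention_mask = [1, 1, 1, 1, 1, 0]
prompt_len = 3

old_labels = labels_from_attention_mask(input_ids, attention_mask)
new_labels = labels_completion_only(input_ids, attention_mask, prompt_len)

# Count how many positions would contribute to the NLL in each case.
old_counted = sum(tok != LABEL_PAD_TOKEN_ID for tok in old_labels)
new_counted = sum(tok != LABEL_PAD_TOKEN_ID for tok in new_labels)
```

In this toy case the first variant leaves five positions in the loss (prompt plus completion), while the second leaves only the two completion tokens.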

fix orpo chosen-nll loss

495bcac

kashif requested a review from qgallouedec December 19, 2024 10:09

kashif mentioned this pull request Dec 19, 2024

🐯 [Liger] add native liger-kernel ORPO loss #2482

Closed

qgallouedec reviewed Dec 19, 2024

qgallouedec approved these changes Dec 19, 2024

kashif merged commit 88ad1a0 into main Dec 19, 2024

kashif deleted the orpo-nll-fix branch December 19, 2024 10:33

kashif mentioned this pull request Dec 28, 2024

↩️ Revert ORPO loss changes #2527

Merged

yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request Apr 20, 2025

fix orpo chosen-nll loss (huggingface#2502)

c12d133
