Skip to content

[Liger] liger DPO support#2568

Merged
kashif merged 38 commits into
mainfrom
liger-dpo
Jun 12, 2025
Merged

[Liger] liger DPO support#2568
kashif merged 38 commits into
mainfrom
liger-dpo

Conversation

@kashif

@kashif kashif commented Jan 14, 2025

Copy link
Copy Markdown
Collaborator

What does this PR do?

Add support for Liger-kernel losses for the DPO Kernel

Needs: linkedin/Liger-Kernel#521

Peft support: #3065

@HuggingFaceDocBuilderDev

Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment thread tests/test_dpo_trainer.py Outdated
Comment thread trl/trainer/dpo_trainer.py Outdated
@qgallouedec

Copy link
Copy Markdown
Member

liger loss isn't compatible with ref precomputing right? If so we could add a warning or an error.

Comment thread docs/source/reducing_memory_usage.md
@VProv

VProv commented Mar 26, 2025

Copy link
Copy Markdown

@VProv VProv mentioned this pull request Mar 26, 2025
5 tasks
@kashif

kashif commented Mar 26, 2025

Copy link
Copy Markdown
Collaborator Author

@VProv, at the moment, I was having issues getting the same outputs/metrics with and without liger in the trainer.

@VProv

VProv commented Mar 26, 2025

Copy link
Copy Markdown

@VProv, at the moment, I was having issues getting the same outputs/metrics with and without liger in the trainer.

What setup are you using?

@vaibhavjindal

Copy link
Copy Markdown
Contributor

Hi, I am working on fixing the output/metrics issue.
Added a PR in liger-kernel: linkedin/Liger-Kernel#676

@vaibhavjindal

Copy link
Copy Markdown
Contributor

@kashif @qgallouedec can you please review the following PR which fixes the output/metrics issue? Thanks :)
#3346

@hanbyul-kim

Copy link
Copy Markdown

Continuing my analysis, I can confirm that it's definitely connected to DeepSpeed zero 3. When I switched to stage 2, it ran smoothly without any issues.

@kashif

kashif commented May 5, 2025

Copy link
Copy Markdown
Collaborator Author

thanks @hanbyul-kim for the report

@vaibhavjindal

Copy link
Copy Markdown
Contributor

@kashif just wanted to circle back and see if we can merge this now? We wanted to try it out internally at Linkedin.

Comment thread trl/trainer/dpo_trainer.py Outdated
import wandb


def shift_tokens_right(input_ids: torch.Tensor, pad_token_id: int, decoder_start_token_id: int) -> torch.Tensor:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pad_token_id isn't used?

Comment thread trl/trainer/dpo_trainer.py
Comment thread trl/trainer/dpo_trainer.py


def shift_tokens_right(input_ids: torch.Tensor, decoder_start_token_id: int) -> torch.Tensor:
"""Shift input ids one token to the right, and pad with pad_token_id"""

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this docstring ain't accurate I think

@kashif kashif merged commit 53c4a7c into main Jun 12, 2025
11 checks passed
@kashif kashif deleted the liger-dpo branch June 12, 2025 10:25
LuisVasquezBSC pushed a commit to langtech-bsc/trl that referenced this pull request Aug 28, 2025
Co-authored-by: Quentin Gallouédec <quentin.gallouedec@huggingface.co>
Co-authored-by: Vaibhav Jindal <32337828+vaibhavjindal@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
LuisVasquezBSC pushed a commit to langtech-bsc/trl that referenced this pull request Aug 28, 2025
Co-authored-by: Quentin Gallouédec <quentin.gallouedec@huggingface.co>
Co-authored-by: Vaibhav Jindal <32337828+vaibhavjindal@users.noreply.github.com>
Co-authored-by: Quentin Gallouédec <gallouedec.quentin@gmail.com>
@qgallouedec qgallouedec mentioned this pull request Jan 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants