KTO changes to return aux outputs by vaibhavjindal · Pull Request #589 · linkedin/Liger-Kernel

vaibhavjindal · 2025-02-26T22:53:44Z

Summary

This PR introduces the following changes to enable integration with Hugging Face TRL:

KTO loss can now return the following along with the loss:

chosen_logps
rejected_logps
sum(chosen_logits)
sum(rejected_logits)
chosen_rewards
rejected_rewards
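
As a rough illustration of how the auxiliary values listed above relate to the loss, here is a framework-free sketch of the KTO objective as it is commonly formulated: the reward is `beta` times the policy/reference log-ratio, compared against a KL baseline. It is not Liger-Kernel's fused implementation. The function name `kto_with_aux`, the `beta` and `kl` parameters, and the dictionary keys are assumptions chosen for illustration, and the logit sums are omitted because the sketch starts from per-sequence log-probs.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def kto_with_aux(policy_logps, ref_logps, is_chosen, beta=0.1, kl=0.0):
    """Return (mean loss, aux dict) for a batch of per-sequence log-probs.

    Hypothetical helper for illustration only; not the Liger-Kernel API.
    """
    losses = []
    aux = {
        "chosen_logps": [],
        "rejected_logps": [],
        "chosen_rewards": [],
        "rejected_rewards": [],
    }
    for logp, ref_logp, chosen in zip(policy_logps, ref_logps, is_chosen):
        logratio = logp - ref_logp      # policy vs. reference log-ratio
        reward = beta * logratio        # implicit reward
        if chosen:
            # Desirable example: push the log-ratio above the KL baseline.
            losses.append(1.0 - sigmoid(beta * (logratio - kl)))
            aux["chosen_logps"].append(logp)
            aux["chosen_rewards"].append(reward)
        else:
            # Undesirable example: push the log-ratio below the KL baseline.
            losses.append(1.0 - sigmoid(beta * (kl - logratio)))
            aux["rejected_logps"].append(logp)
            aux["rejected_rewards"].append(reward)
    return sum(losses) / len(losses), aux


loss, aux = kto_with_aux([-1.0, -3.0], [-2.0, -2.0], [True, False], beta=0.1)
```

Returning the auxiliary values next to the loss lets a trainer log per-batch metrics without recomputing log-probs in a second pass.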

Adds an option to enable/disable log_probs averaging while calculating the loss.
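
A minimal sketch of what toggling log-prob averaging means for one sequence, assuming per-token log-probs and a 0/1 mask over the tokens that count toward the loss. The helper name `sequence_logp` and the flag name `average_log_prob` are assumptions for illustration, not confirmed identifiers from this PR.

```python
def sequence_logp(token_logps, mask, average_log_prob=False):
    """Reduce per-token log-probs to one value per sequence.

    Hypothetical helper: sums the unmasked token log-probs, and divides
    by the number of unmasked tokens when averaging is enabled.
    """
    kept = [lp for lp, m in zip(token_logps, mask) if m]
    total = sum(kept)
    if average_log_prob:
        # Guard against an all-masked sequence to avoid dividing by zero.
        return total / max(len(kept), 1)
    return total


summed = sequence_logp([-1.0, -2.0, -3.0], [1, 1, 0])
averaged = sequence_logp([-1.0, -2.0, -3.0], [1, 1, 0], average_log_prob=True)
```

Summing makes longer sequences contribute larger-magnitude log-probs, while averaging normalizes by length, so the two settings can produce noticeably different loss scales.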

Details

Benchmark results with the new implementation:

Testing Done

Hardware Type:
run make test to ensure correctness
run make checkstyle to ensure code style
run make test-convergence to ensure convergence

vaibhavjindal · 2025-02-28T00:01:59Z

@shivam15s @hebiao064 @kashif this PR contains the changes needed for trl integration. PTAL, thanks!

hebiao064 · 2025-02-28T00:38:45Z

Can you run a benchmark to see whether memory usage and speed are still at an acceptable level?

vaibhavjindal · 2025-02-28T00:45:05Z

Can you run a benchmark to see whether memory usage and speed are still at an acceptable level?

Sure, will add it.

shivam15s · 2025-02-28T23:14:36Z

+        chosen_logits_sum = chosen_logits.nansum()
+        rejected_logits_sum = rejected_logits.nansum()


Do you expect these to have NaNs?

vaibhavjindal · 2025-02-28

I was unsure about this, since TRL uses nansum() here: https://github.com/huggingface/trl/blob/491921c1a4167e7c84429382470b0bb3158e66b0/trl/trainer/kto_trainer.py#L1271. I kept nansum() to handle the worst case.
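
To make the trade-off concrete, the following stdlib-only sketch contrasts a plain sum with a NaN-skipping sum. `Tensor.nansum()` in PyTorch has the same semantics of treating NaN entries as zero; the `nansum` function below is a hypothetical pure-Python stand-in, not the PyTorch call itself.

```python
import math


def nansum(values):
    """Sum the values, skipping any NaN entries (treating them as zero)."""
    return sum(v for v in values if not math.isnan(v))


logits = [1.5, float("nan"), -0.5]

plain = sum(logits)    # a single NaN makes the whole sum NaN
safe = nansum(logits)  # NaN is skipped, leaving 1.5 + (-0.5)
```

If the inputs never contain NaNs, the two reductions agree, so keeping the NaN-skipping version only changes the result in the degenerate case.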

shivam15s

LGTM, let's figure out the speed drop in the next PR.

vaibhavjindal added 9 commits February 25, 2025 10:35

initial

333d73d

Additional outputs

3a39209

more progress

c018849

Minor change

a7cbea9

minor

dd8e317

test fixes

c06e00f

tests

f37a1cf

docs

dc1d708

formatting

6e9437f

vaibhavjindal mentioned this pull request Feb 28, 2025

[Liger] Liger KTO support huggingface/trl#2812

Merged

5 tasks

Add benchmark

bdf36ef

shivam15s reviewed Feb 28, 2025

Merge branch 'main' into kto_trl

b1877f8

shivam15s approved these changes Feb 28, 2025

hebiao064 approved these changes Mar 1, 2025

vaibhavjindal merged commit d63b888 into linkedin:main Mar 1, 2025

vaibhavjindal merged 11 commits into linkedin:main from vaibhavjindal:kto_trl


3 participants
