Add `average_log_prob` args for cpo by Mecoli1219 · Pull Request #510 · linkedin/Liger-Kernel

Mecoli1219 · 2025-01-03T17:22:23Z

Summary

The TRL CPO implementation does not average the log_probs, while the Liger kernel averages them when computing the loss. This causes a mismatch when integrating the two.
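To make the mismatch concrete, here is a minimal sketch of the two reductions, assuming per-token log-probs and a loss mask are already available. The helper `sequence_log_prob` and the sample values are hypothetical and are not taken from Liger or TRL; only the flag name `average_log_prob` comes from this PR.

```python
from typing import Sequence


def sequence_log_prob(
    token_log_probs: Sequence[float],
    loss_mask: Sequence[int],
    average_log_prob: bool,
) -> float:
    """Reduce per-token log-probs over unmasked tokens to one sequence score.

    Hypothetical helper for illustration only.
    """
    kept = [lp for lp, m in zip(token_log_probs, loss_mask) if m]
    total = sum(kept)
    if average_log_prob:
        # Length-normalized score: divide by the number of unmasked tokens.
        return total / max(len(kept), 1)
    # Unnormalized score: plain sum over unmasked tokens.
    return total


# Made-up values; the last token is masked out (e.g. padding).
token_log_probs = [-0.5, -1.5, -1.0, -3.0]
loss_mask = [1, 1, 1, 0]

summed = sequence_log_prob(token_log_probs, loss_mask, average_log_prob=False)
averaged = sequence_log_prob(token_log_probs, loss_mask, average_log_prob=True)
print(summed, averaged)
```

The two scores differ by a factor equal to the number of unmasked tokens, so a loss built on one reduction will not match a reference built on the other unless both sides use the same setting.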

Testing Done

Updated the unit tests (still investigating why they fail locally)

Hardware Type:
run make test to ensure correctness
run make checkstyle to ensure code style
run make test-convergence to ensure convergence

Signed-off-by: Mecoli1219 <michaellai901026@gmail.com>

kashif · 2025-01-03T20:28:01Z

TRL is using the default as in the official repo for CPO: https://github.com/fe1ixxu/CPO_SIMPO/blob/main/scripts/cpo_trainer.py#L626

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

Signed-off-by: Mecoli1219 <michaellai901026@gmail.com>

austin362667 · 2025-01-08T06:06:56Z

    "scalar, dtype, atol, rtol",
    [
-        (1.0, torch.bfloat16, 5e-3, 5e-3),
+        (1.0, torch.bfloat16, 5e-2, 5e-2),


What's the reasoning behind this adjustment?

Mecoli1219 · 2025-01-08

@kashif and I found that after disabling average_log_prob for CPO, the result deviates more from the HF implementation when the model is large and the data type is bf16. Since the two implementations still agree closely, we increased atol and rtol so that this test passes.

kashif · 2025-01-08

Since bfloat16 is less accurate for larger numbers, this is needed to make the test pass, and it matches the tolerances used in the other bfloat16 tests.

austin362667 · 2025-01-08

Then adjusting the tolerances makes sense. ❤️
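The precision argument can be illustrated with a small sketch that simulates bfloat16 rounding in pure Python, assuming round-to-nearest-even and finite inputs. The helpers `to_bfloat16` and `is_close` and all sample values are hypothetical; `is_close` follows the `|actual - expected| <= atol + rtol * |expected|` rule that `torch.allclose` documents.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (illustrative only)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest even on the 16 low bits that bfloat16 drops.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + rounding_bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def is_close(actual: float, expected: float, atol: float, rtol: float) -> bool:
    """Elementwise closeness rule in the style of torch.allclose."""
    return abs(actual - expected) <= atol + rtol * abs(expected)


# An averaged log-prob stays near unit magnitude; a summed one grows with length.
small, large = 1.2345, 123.45
abs_err_small = abs(to_bfloat16(small) - small)
abs_err_large = abs(to_bfloat16(large) - large)
print(abs_err_small, abs_err_large)

# A made-up 0.03 discrepancy on a value near 0.5 under the old and new tolerances.
print(is_close(0.53, 0.5, atol=5e-3, rtol=5e-3))
print(is_close(0.53, 0.5, atol=5e-2, rtol=5e-2))
```

bfloat16 keeps only 8 significant bits, so the absolute spacing between representable values grows with magnitude. Summed log-probs are larger in magnitude than averaged ones, which is consistent with the larger deviation reported in this thread.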

austin362667

Thank you both for making this PR. Hopefully, it unblocks huggingface/trl#2506.

kashif · 2025-01-08T08:11:33Z

Awesome, thank you! We would still need a release of liger-kernel for the CI to pass, but yes, it will hopefully unblock it!

Mecoli1219 added 3 commits January 3, 2025 09:04

Add average_log_prob args for cpo

09f635b

Signed-off-by: Mecoli1219 <michaellai901026@gmail.com>

updat unit test

b66b925

Signed-off-by: Mecoli1219 <michaellai901026@gmail.com>

update simpo unit test

89e2924

Signed-off-by: Mecoli1219 <michaellai901026@gmail.com>

kashif mentioned this pull request Jan 3, 2025

[Liger] Integrate Liger CPO & SimPO huggingface/trl#2506

Closed

6 tasks

kashif reviewed Jan 3, 2025

Comment thread src/liger_kernel/chunked_loss/fused_linear_preference.py Outdated

Mecoli1219 and others added 3 commits January 4, 2025 05:56

Update src/liger_kernel/chunked_loss/fused_linear_preference.py

d82e918

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>

increase atol/rtol for cpo

64ceead

Signed-off-by: Mecoli1219 <michaellai901026@gmail.com>

Merge branch 'main' into cpo-average-log-prob

59ebcc8

austin362667 reviewed Jan 8, 2025

Merge branch 'main' into cpo-average-log-prob

6a97713

austin362667 approved these changes Jan 8, 2025

austin362667 merged commit 23e3772 into linkedin:main Jan 8, 2025

austin362667 merged 7 commits into linkedin:main from Mecoli1219:cpo-average-log-prob
