
Add aux_outputs for CPO and SimPO #492

Merged: austin362667 merged 6 commits into linkedin:main from Mecoli1219:cpo-add-aux_outputs on Jan 8, 2025

Mecoli1219 (Collaborator) commented Dec 20, 2024

Summary

In the trl implementation, CPO returns two extra values (chosen_rewards, rejected_rewards), but Liger-Kernel does not return them yet. This PR adds them as aux_outputs for both CPO and SimPO.
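
For illustration only, the shape of the change can be sketched in framework-free Python. This is not the Liger-Kernel code: the function name `cpo_loss_with_aux`, the `beta` default, and the choice of rewards as beta-scaled policy log-probabilities are assumptions modeled on how trl's CPO trainer is commonly described, and the NLL term of the full CPO objective is left out.

```python
import math


def log_sigmoid(x: float) -> float:
    # Numerically stable log(sigmoid(x)).
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def cpo_loss_with_aux(chosen_logps, rejected_logps, beta=0.1):
    """Hypothetical sketch: preference loss plus the two auxiliary outputs.

    chosen_logps / rejected_logps are per-example sequence log-probabilities.
    Assumption: rewards are beta-scaled log-probabilities.
    """
    # Sigmoid preference term, averaged over the batch.
    losses = [
        -log_sigmoid(beta * (c - r))
        for c, r in zip(chosen_logps, rejected_logps)
    ]
    loss = sum(losses) / len(losses)

    # The two extra return values this PR is about.
    chosen_rewards = [beta * c for c in chosen_logps]
    rejected_rewards = [beta * r for r in rejected_logps]
    return loss, (chosen_rewards, rejected_rewards)


loss, (chosen_rewards, rejected_rewards) = cpo_loss_with_aux(
    [-1.0, -2.0], [-3.0, -2.5], beta=0.5
)
```

Returning the rewards in a tuple next to the loss lets a caller log them as metrics without recomputing log-probabilities. Under the same assumptions, a SimPO variant would subtract a margin of gamma / beta from the log-probability difference before the sigmoid.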

Testing Done

  • Hardware Type:
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

Signed-off-by: Mecoli1219 <michaellai901026@gmail.com>

austin362667 (Contributor) left a comment

LGTM! Thank you @Mecoli1219

@austin362667 austin362667 merged commit f8bb86f into linkedin:main Jan 8, 2025

2 participants