Feature Request: Save/Load Precomputed Ref Log-Probabilities in DPOTrainer #3985

@ginkyenglee

Feature request

Save/Load Precomputed Ref Log-Probabilities in DPOTrainer

Motivation

Currently, when precompute_ref_log_probs=True, DPOTrainer always recomputes the reference model log-probs (ref_chosen_logps, ref_rejected_logps) for the training and evaluation datasets.
This step can be very time-consuming, and repeated experiments on the same dataset/model setup redo the same computation each time.

It would be highly beneficial to have an option to cache precomputed reference log-probs to disk and reload them later, avoiding unnecessary recomputation.

Your contribution

Introduce two new arguments:

  • save_ref_logps_dir: (optional) directory path where precomputed log-probs will be stored
  • load_ref_logps_dir: (optional) directory path to load precomputed log-probs from
    • When provided, the trainer checks the dataset fingerprint, number of rows, and model/tokenizer info to ensure compatibility
    • If they match, cached values are loaded instead of recomputing
    • If they differ, a message is logged, the cache is ignored, and recomputation proceeds as usual
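
The save/validate/load behaviour described above could look roughly like the sketch below. It is only an illustration of the idea, not existing TRL code: the helper names (save_ref_logps, load_ref_logps), the cache file name, and the metadata keys are all hypothetical, and it uses plain JSON from the standard library so it runs on its own. A real implementation would take the fingerprint and model/tokenizer info from the trainer's dataset and model objects.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)

CACHE_FILE = "ref_logps.json"  # hypothetical file name, not part of TRL


def save_ref_logps(cache_dir, meta, ref_chosen_logps, ref_rejected_logps):
    """Write the log-probs together with the metadata used to validate them later."""
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)
    payload = {
        "meta": meta,
        "ref_chosen_logps": list(ref_chosen_logps),
        "ref_rejected_logps": list(ref_rejected_logps),
    }
    (path / CACHE_FILE).write_text(json.dumps(payload))


def load_ref_logps(cache_dir, expected_meta):
    """Return (chosen, rejected) if the cache matches expected_meta, else None.

    A return value of None means the caller should recompute as usual.
    """
    cache_file = Path(cache_dir) / CACHE_FILE
    if not cache_file.is_file():
        logger.info("No cached reference log-probs in %s; recomputing.", cache_dir)
        return None
    payload = json.loads(cache_file.read_text())
    if payload["meta"] != expected_meta:
        # Dataset, row count, or model/tokenizer changed: do not trust the cache.
        logger.info("Cache metadata mismatch in %s; ignoring cache.", cache_dir)
        return None
    return payload["ref_chosen_logps"], payload["ref_rejected_logps"]


def example_meta():
    """Assumed metadata layout: one entry per compatibility check in the proposal."""
    return {
        "dataset_fingerprint": "abc123",   # placeholder value
        "num_rows": 2,
        "model_name": "my-ref-model",      # placeholder value
        "tokenizer_name": "my-tokenizer",  # placeholder value
    }
```

With this shape, the trainer would call load_ref_logps first when load_ref_logps_dir is set and fall back to the normal precompute pass whenever it returns None, then call save_ref_logps if save_ref_logps_dir is set.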

If the idea sounds reasonable, I’d be happy to open a PR to implement it. Any feedback or suggestions would be greatly appreciated!
