Feature request
Save/Load Precomputed Ref Log-Probabilities in DPOTrainer
Motivation
Currently, when precompute_ref_log_probs=True, DPOTrainer always recomputes the reference model log-probs (ref_chosen_logps, ref_rejected_logps) for the training and evaluation datasets.
This process can be very time-consuming, and for repeated experiments on the same dataset/model setup, it leads to redundant computation.
It would be highly beneficial to have an option to cache precomputed reference log-probs to disk and reload them later, avoiding unnecessary recomputation.
Your contribution
Introduce two new arguments:
save_ref_logps_dir: (optional) directory path where precomputed log-probs will be stored
load_ref_logps_dir: (optional) directory path to load precomputed log-probs from
- When load_ref_logps_dir is provided, the trainer checks the dataset fingerprint, number of rows, and model/tokenizer info against the cache to ensure compatibility
- If they match, cached values are loaded instead of recomputing
- If they differ, a message is logged, the cache is ignored, and recomputation proceeds as usual
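
To make the check above concrete, here is a minimal sketch of how the save/load logic might work. It is only an illustration under assumptions, not existing TRL code: the helper names (save_ref_logps, load_ref_logps), the cache file name, and the metadata keys are all hypothetical, and the log-probs are stored as plain lists of floats in JSON to keep the example self-contained. A real implementation would more likely store tensors.

```python
import json
import logging
from pathlib import Path
from typing import Optional

logger = logging.getLogger(__name__)

# Hypothetical file name inside the cache directory; not part of TRL.
CACHE_FILE = "ref_logps.json"


def save_ref_logps(cache_dir, meta, ref_chosen_logps, ref_rejected_logps):
    """Write the log-probs together with the metadata used to validate them."""
    path = Path(cache_dir)
    path.mkdir(parents=True, exist_ok=True)
    payload = {
        "meta": meta,
        "ref_chosen_logps": list(ref_chosen_logps),
        "ref_rejected_logps": list(ref_rejected_logps),
    }
    (path / CACHE_FILE).write_text(json.dumps(payload))


def load_ref_logps(cache_dir, expected_meta) -> Optional[dict]:
    """Return the cached payload, or None if the cache is missing or stale."""
    file = Path(cache_dir) / CACHE_FILE
    if not file.exists():
        logger.info("No cached ref log-probs at %s; recomputing.", file)
        return None
    payload = json.loads(file.read_text())
    # Any mismatch (fingerprint, row count, model/tokenizer) invalidates the cache.
    if payload["meta"] != expected_meta:
        logger.warning("Cached ref log-probs do not match the current setup; recomputing.")
        return None
    return payload
```

The caller would build the metadata dict from the current run (for example dataset fingerprint, number of rows, model name, tokenizer name), call load_ref_logps first, and fall back to the existing precomputation path followed by save_ref_logps whenever it returns None.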
If the idea sounds reasonable, I’d be happy to open a PR to implement it. Any feedback or suggestions would be greatly appreciated!