Loss function comparison (CE vs P2 variants) under parameter-golf constraints #1180
estesryan wants to merge 1 commit into openai:main
Conversation
I tested the P2 loss. The concern: the P2 loss reweights the cross-entropy during training, focusing gradients on hard tokens. That is a legitimate training technique. However, the same P2-weighted loss also appears to be used during evaluation (via `model.forward()`).

Per the README and Issue #1017, this requires standard cross-entropy evaluation (Condition 2: the full normalized distribution scored with standard cross-entropy).

My test results: when I evaluate with standard cross-entropy, I get a different value than the one submitted. The reported 1.0577 BPP may be coming from the P2-weighted loss path rather than standard cross-entropy. Could you confirm which loss path the validation metric uses?

cc @0hq @valerio-oai
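To make the distinction concrete, here is a minimal pure-Python sketch. The exact P2 weighting is not shown in this thread, so the quadratic `(1 - p_target)^2` scheme below is my assumption for illustration, not the submission's actual code; the point is only that a reweighted mean and the plain mean disagree.

```python
import math

def cross_entropy(probs, targets):
    """Standard per-token cross-entropy in nats (unweighted mean)."""
    return sum(-math.log(p[t]) for p, t in zip(probs, targets)) / len(targets)

def p2_weighted_ce(probs, targets):
    """Hypothetical P2-style loss: upweight hard tokens by (1 - p_target)^2.
    An illustrative guess at the weighting, not the submission's formula."""
    losses = [-math.log(p[t]) for p, t in zip(probs, targets)]
    weights = [(1 - p[t]) ** 2 for p, t in zip(probs, targets)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)

# Toy distributions over a 3-token vocab: one easy target, one hard target.
probs = [[0.9, 0.05, 0.05],   # easy: target token has p = 0.9
         [0.2, 0.4, 0.4]]     # hard: target token has p = 0.2
targets = [0, 0]

ce = cross_entropy(probs, targets)    # the metric the leaderboard requires
p2 = p2_weighted_ce(probs, targets)   # fine for training, not for eval
print(f"standard CE = {ce:.4f} nats, P2-weighted = {p2:.4f} nats")
```

Because the weighting emphasizes the hard token, the two numbers diverge; reporting the weighted value as if it were cross-entropy would misstate the benchmark metric.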
Thanks for the careful review. You are correct: validation is currently using the P2-weighted loss via `model.forward()`. I will update the evaluation to use standard (unweighted) cross-entropy, rerun, and resubmit with corrected metrics. Appreciate you flagging this.
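The fix described above can be sketched as an eval path that scores the full normalized distribution directly from the logits, bypassing whatever loss `forward()` returns. This is a dependency-free illustration, and the nats-to-bits conversion at the end assumes BPP here means the per-token loss expressed in bits (nats divided by ln 2).

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def eval_cross_entropy(all_logits, targets):
    """Standard (unweighted) cross-entropy in nats, computed from raw
    logits so no training-time reweighting can leak into the metric."""
    nll = 0.0
    for logits, t in zip(all_logits, targets):
        nll += -log_softmax(logits)[t]
    return nll / len(targets)

# Toy logits for two positions over a 3-token vocab.
logits = [[2.0, 0.5, -1.0],
          [0.1, 0.1, 0.1]]   # uniform row: per-token loss is ln(3)
targets = [0, 2]

ce_nats = eval_cross_entropy(logits, targets)
ce_bits = ce_nats / math.log(2)   # assumed BPP convention: loss in bits
print(f"eval CE: {ce_nats:.4f} nats = {ce_bits:.4f} bits/token")
```

Scoring from logits like this (rather than trusting the loss the model returns) is a cheap guard against exactly the training/eval mismatch flagged in this thread.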
Submission: SR-CM-P2Loss
Key features:
Final:
Included: