[benchmarks] Add scalar loss as model output when training #158074
benjaminglass1 wants to merge 11 commits into gh/benjaminglass1/93/base
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/158074
Note: Links to docs will display an error until the docs builds have been completed.
❌ 6 New Failures, 1 Unrelated Failure as of commit 1ec9923 with merge base 1abff80.
NEW FAILURES - The following jobs have failed:
UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@anijain2305, can you help review this Dynamo change?
@anijain2305 I ended up needing to add one new model XFAIL to this one, for |
ROCm test failure appears unrelated to this PR. |
Closing this, as we've decided not to pursue the AOTInductor training work at this time. |
Stack from ghstack (oldest at bottom):
Adds a hook to benchmark model forward passes that computes a scalar loss as the first output when training (and detaches all other outputs). This is a requirement for using joint graph export (experimental), but can be done without loss of generality.
Additionally ensures that Dynamo traces through the loss calculation function (not done previously), which reduces the number of graph breaks in models when training.
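The hook described above can be sketched roughly as follows. This is an illustrative sketch, not the PR's actual code: the names `forward_with_loss` and `loss_fn` are hypothetical, and the real benchmark harness wires this up differently. The key idea is that the scalar loss becomes the first output (keeping its autograd graph), while every other output is detached so the backward pass only flows through the loss.

```python
import torch

def forward_with_loss(model, loss_fn, *inputs):
    """Hypothetical wrapper: run the model, return (scalar_loss, *detached_outputs)."""
    outputs = model(*inputs)
    if not isinstance(outputs, tuple):
        outputs = (outputs,)
    # Scalar loss computed from all outputs; Dynamo can trace through this call.
    loss = loss_fn(outputs)
    # Detach the remaining outputs so gradients only flow via the loss.
    detached = tuple(o.detach() for o in outputs)
    return (loss,) + detached

# Minimal usage example with a toy model and an illustrative loss function.
model = torch.nn.Linear(4, 2)
x = torch.randn(3, 4)
out = forward_with_loss(model, lambda outs: sum(o.sum() for o in outs), x)
loss, y = out[0], out[1]
print(loss.requires_grad, y.requires_grad)  # loss keeps grad; detached output does not
```

Because the loss calculation happens inside the traced forward (rather than outside the compiled region), Dynamo can capture it in the same graph, which is what reduces graph breaks during training.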
cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang @naromero77amd @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames