Revert "mul: remove opmath cast sequence (pytorch#9663)"#9701
Revert "mul: remove opmath cast sequence (pytorch#9663)"#9701jeffhataws merged 3 commits intopytorch:r2.9from
Conversation
|
Is the torchax test failure expected? |
ysiraichi
left a comment
There was a problem hiding this comment.
- The TorchAX CI could be fixed by this change
- Instead of reverting the commit, I think it's better to make it backend specific, and land it on
masterand cherry-pick it tor2.9?
What do you think?
|
While I understand the suggestion from #8545 to make this backend-specific, I believe a full revert is more appropriate atleast for release branch r2.9:
Let me know your thoughts? |
|
Thank you for your detailed analysis. I'm sorry, but I still think that the best way to go about this is to make it backend specific. Actually, now that I'm thinking about it, I think it would make more sense to do it in the lowering step. That said, since we don't really want to introduce a lot of changes here, and the infrastructure is already in place, I would say that we should make that It's a simple change (something like this condition for TPU) that fixes this problem for the Neuron backend, while leaving the other backends with the "backend-independent" choice of not doing that.
That "seeming unnecessary upcast/downcast" is exactly what we are talking about, here. And, no, PyTorch autocast does not solve this issue (see this comment). |
9eb03b7 to
ddfe243
Compare
ysiraichi
left a comment
There was a problem hiding this comment.
Thank you for the PR.
Let me know if you have any questions.
|
Could you rebase this PR, so that there are only your commits? |
50f33f2 to
214f075
Compare
updated |
ysiraichi
left a comment
There was a problem hiding this comment.
LGTM.
Let's just wait for the CI.
|
@ysiraichi CI is completed. |
|
Thanks @rajkthakur @ysiraichi . It is merged now and ready to go. @bhavya01 will make a final 2.9 release candidate. |
Commit 2a9138a removed
.use_opmathtype_for_compute()from element-wise 'mul' operation, this breaks mixed-precision accumulation behavior expected by the Neuron compiler that traces/compile on CPU and later execute the binary on neuron hardwares, causing significant accuracy degradation in:Reverts: commit 2a9138a, other changes are result of rebase from r2.9
Fixes: Model accuracy failures with mixed-precision accumulation #9699