Unblock Llama2 ONNX export w/ sdpa by falling back to manual impl #28823
BowenBao wants to merge 1 commit into huggingface:main
Conversation
thiagocrepaldi
left a comment
LGTM. Maybe add a unit test for the torch.jit.trace case?
ArthurZucker
left a comment
Thanks for this! It's gonna be a bit hard to merge this. Would you mind checking if #27931 fixes the issue? It should be merged first and should simplify all of that logic.
Hi @ArthurZucker, I have validated that the issue is fixed under your PR, thanks! Do you have an ETA for when it will be merged? Our workstreams have been blocked by this issue for a while, and we need to resolve this export issue as soon as possible.
    output_attentions: bool = False,
    use_cache: bool = False,
) -> Tuple[torch.Tensor, Optional[torch.Tensor], Optional[Tuple[torch.Tensor]]]:
    _jit_tracing = torch.jit.is_tracing()
This means that we call torch.jit.is_tracing as many times as there are layers.
I don't understand why this change is necessary. The error that is normally raised explicitly gives a solution.
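For context on the review note above: one way to avoid calling `torch.jit.is_tracing()` once per layer is to query it a single time in the top-level forward and thread the flag through. The sketch below is hypothetical and simplified (the class and argument names are illustrative, not the actual transformers modeling code):

```python
import torch
import torch.nn.functional as F


class AttnLayer(torch.nn.Module):
    # Hypothetical layer: receives the tracing flag as an argument
    # instead of querying torch.jit.is_tracing() itself.
    def forward(self, q, k, v, is_tracing: bool = False):
        if is_tracing:
            # Manual ("eager") attention, friendlier to torch.jit.trace / ONNX export.
            scale = q.size(-1) ** -0.5
            weights = F.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1)
            return weights @ v
        return F.scaled_dot_product_attention(q, k, v)


class Stack(torch.nn.Module):
    def __init__(self, num_layers: int = 4):
        super().__init__()
        self.layers = torch.nn.ModuleList(AttnLayer() for _ in range(num_layers))

    def forward(self, q, k, v):
        is_tracing = torch.jit.is_tracing()  # evaluated once, not once per layer
        for layer in self.layers:
            q = layer(q, k, v, is_tracing)
        return q
```

Whether passing the flag down is acceptable depends on the modeling-code conventions; it is shown here only to illustrate hoisting the check.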
@ArthurZucker @BowenBao I believe we can close this issue now that #27931 was merged |
What does this PR do?
Unblocks Llama2 ONNX export with sdpa by falling back to manual implementation.
Fixes #28610
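The fallback this PR describes could, roughly, look like the following sketch. It is hypothetical and simplified (the function name `attention_forward` and its signature are illustrative, not the actual PR diff): when `torch.jit.trace` is active, use a manual attention implementation instead of `F.scaled_dot_product_attention`, which can fail to export to ONNX in some torch versions.

```python
import torch
import torch.nn.functional as F


def attention_forward(query, key, value, use_sdpa: bool = True):
    # Illustrative fallback: prefer the fused SDPA kernel, but switch to a
    # manual implementation when jit tracing (e.g. during ONNX export).
    if use_sdpa and not torch.jit.is_tracing():
        return F.scaled_dot_product_attention(query, key, value)
    # Manual ("eager") implementation of scaled dot-product attention.
    scale = query.size(-1) ** -0.5
    attn_weights = torch.matmul(query, key.transpose(-2, -1)) * scale
    attn_weights = F.softmax(attn_weights, dim=-1)
    return torch.matmul(attn_weights, value)
```

Both paths compute the same result up to floating-point tolerance, so the fallback should not change model outputs, only the operators seen by the tracer.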
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@fxmarty