fix a typo in flax T5 attention - attention_mask variable is misnamed#26663
ArthurZucker merged 2 commits into huggingface:main
Conversation
sanchit-gandhi
left a comment
Very nice @giganttheo! Thanks for identifying the bug and proposing the fix 🤗 Can you confirm that the slow tests pass following the fix, as per #26564 (comment)? If so, this all LGTM!
The slow tests are passing for t5 and longt5:
[test outputs attached for both the T5 and LongT5 runs]
Awesome - thanks for confirming! Requesting a final review from @ArthurZucker
ArthurZucker
left a comment
LGTM! Thanks for catching 🤗
What does this PR do?
Fixes a typo in the Flax code for the T5 model.
In the Attention module of the Flax version of T5, the attention_mask returned by the `_concatenate_to_cache` method should override the previous attention_mask, but it does not because the variable is misnamed.

Fixes #26564
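To illustrate the class of bug being fixed (a minimal sketch, not the actual Transformers code: `extend_mask_for_cache`, `cache_len`, and the mask shapes here are hypothetical simplifications of the real `_concatenate_to_cache` logic):

```python
import numpy as np

def extend_mask_for_cache(attention_mask, cache_len):
    # Hypothetical stand-in for _concatenate_to_cache: extend the mask
    # so it also covers the cached key/value positions.
    pad = np.ones((attention_mask.shape[0], cache_len), dtype=attention_mask.dtype)
    return np.concatenate([pad, attention_mask], axis=-1)

mask = np.array([[1, 1, 0]])

# Buggy pattern: the updated mask is bound to a misnamed variable,
# so downstream attention code keeps reading the stale `mask`.
attention_mask_updated = extend_mask_for_cache(mask, cache_len=2)
stale = mask  # shape (1, 3): cached positions are not covered

# Fixed pattern: rebind the same name so later code sees the update.
mask = extend_mask_for_cache(mask, cache_len=2)  # shape (1, 5)
```

Because Python simply binds the return value to whatever name appears on the left-hand side, a misspelled variable name silently leaves the old mask in use; there is no error, only wrong attention masking at generation time.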
Before submitting
Who can review?
@sanchit-gandhi