Skip to content

Do not drop mask with SDPA for more cases#30311

Merged
fxmarty merged 4 commits intomainfrom
fix-sdpa-mask-speculative
Apr 18, 2024
Merged

Do not drop mask with SDPA for more cases#30311
fxmarty merged 4 commits intomainfrom
fix-sdpa-mask-speculative

Conversation

@fxmarty
Copy link
Contributor

@fxmarty fxmarty commented Apr 18, 2024

As per title. Overlooked in #30127 (issue with speculative decoding, where no attn_mask is passed to the main model).

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@fxmarty fxmarty requested a review from ArthurZucker April 18, 2024 09:31
Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine, but the code is too verbose, while being super hard to understand.
There are 2 possible outcomes. Either we ignore or not.
Returning the value earlier here would help tremendously.
Could you pin point which part was not taken into account for speculative decoding? qlen==1 + qlen == klen?

@fxmarty
Copy link
Contributor Author

fxmarty commented Apr 18, 2024

@ArthurZucker will be cleaner once #30070 is brought back

@ArthurZucker
Copy link
Collaborator

I 'll be harsher to make it clearer!

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merci 🤗

@fxmarty fxmarty merged commit 63c5e27 into main Apr 18, 2024
@fxmarty fxmarty deleted the fix-sdpa-mask-speculative branch April 18, 2024 12:37
ydshieh pushed a commit that referenced this pull request Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants