added unsqueeze_dim to apply_rotary_pos_emb#27117
added unsqueeze_dim to apply_rotary_pos_emb#27117amyeroberts merged 14 commits intohuggingface:mainfrom
Conversation
There was a problem hiding this comment.
Thank you for opening the PR @ShashankMosaicML! And nice touch, the docstring :)
CI is failing because of the # Copied from ... statement, which is used by our internal tools and CI to maintain one file per model policy while enabling the propagation of bugfixes and upgrades.
In this particular case, there is also a bug on our end -- recently we've made Llama the base implementation, from which others are copied from. As such, you'll have to:
- Remove the
# Copied from ...statement in L194 - Run make
fix-copiesin your terminal - Push the changes
You may also need to run make fixup afterwards, which applies automatic code formatting.
|
Thank you @gante for reviewing and providing me the steps to fix the errors! However I still see some errors after following the steps, and on clicking the details for the error, I see the following. Could you please let me know how to fix this error? Thank you! |
amyeroberts
left a comment
There was a problem hiding this comment.
Thanks for working on this!
Change to the function LGTM - there's just some addition changes in this diff to docstrings which need to be removed before merging.
Accepting the proposed changes in formatting. Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
|
Hi @amyeroberts , thanks for suggesting the changes. I have incorporated those, but some quality checks are still failing. Could you take a look? |
|
@ShashankMosaicML running |
|
@gante , I think that worked! (I wasn't running |
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. |
amyeroberts
left a comment
There was a problem hiding this comment.
Thanks for iterating!
* added unsqueeze_dim to apply_rotary_pos_emb * Added docstring * Modified docstring * Modified docstring * Modified docstring * Modified docstring * Modified docstring * ran make fix-copies and make fixup * Update src/transformers/models/llama/modeling_llama.py Accepting the proposed changes in formatting. Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * incorporating PR suggestions * incorporating PR suggestions * incorporating PR suggestions * incorporating PR suggestions * .. --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>


Making the unsqueeze dimension parameterized in the apply_rotary_pos_emb function in modeling_llama.py
This PR introduces a new parameter to the apply_rotary_pos_emb function which allows the specification of the dimension along which to unsqueeze to make the cosine and sine rotary tensors broadcastable to the query and key tensors. This will make the function compatible with codebases that have different shapes for the query and key tensors without needing any back-and-forth transposing.
Fixes #26948
Before submitting
to it if that's the case. Link to the issue
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@gante , @ArthurZucker