Skip to content

Handle unsharded Llama2 model types in conversion script#27069

Merged
ArthurZucker merged 1 commit intohuggingface:mainfrom
coreyhu:main
Oct 26, 2023
Merged

Handle unsharded Llama2 model types in conversion script#27069
ArthurZucker merged 1 commit intohuggingface:mainfrom
coreyhu:main

Conversation

@coreyhu
Copy link
Contributor

@coreyhu coreyhu commented Oct 25, 2023

What does this PR do?

Fixes the unsharded condition to check previously looked-up num_shards instead of model type. This is important because multiple model types are uncharded (7B and 7BF)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! Thanks for opneing a PR. Overall i agree with you, num shards should be more resilient but correct me if I am wrong no checkpoints with 1 shard != 7b and 7b-chatwere released ?

@coreyhu
Copy link
Contributor Author

coreyhu commented Oct 25, 2023

There's currently two options for unsharded model_size, 7B and 7Bf (which I assume represents 7B-chat). While we could alternatively remove the *-Bf options for model_size, it seems it's kept for backwards compatibility (line 23). Instead, this PR allows for both 7B and 7B-chat to be treated as unsharded.

@coreyhu coreyhu requested a review from ArthurZucker October 25, 2023 14:34
Copy link
Collaborator

@ArthurZucker ArthurZucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this doesn't change much in the code path so why not

@ArthurZucker ArthurZucker merged commit df2eebf into huggingface:main Oct 26, 2023
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants