
Change default max_shard_size to smaller value #26942

Merged
younesbelkada merged 5 commits into main from younesbelkada-change-max-shard-size
Oct 23, 2023



@younesbelkada younesbelkada commented Oct 19, 2023

What does this PR do?

As per the title. We could also set that value dynamically based on model size; with this change, large models will end up with many shards.

cc @ArthurZucker
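To make the "many shards" concern concrete, here is a rough back-of-the-envelope sketch. It is not code from this PR: the helper name `estimate_shard_count`, the 7B-parameter model, the 2 bytes per parameter (half precision), and the decimal meaning of GB are all assumptions chosen for illustration. It only gives a lower bound, since real sharding cannot split a single tensor across files.

```python
def estimate_shard_count(num_params: int, bytes_per_param: int, max_shard_bytes: int) -> int:
    """Lower bound on the number of shards for a checkpoint (hypothetical helper)."""
    total_bytes = num_params * bytes_per_param
    # Integer ceiling division, avoids floating-point rounding on large sizes.
    return -(-total_bytes // max_shard_bytes)


GB = 10**9  # assumption: decimal gigabytes

if __name__ == "__main__":
    # Assumed example: 7 billion parameters stored in 2 bytes each.
    for limit_gb in (2, 5):
        shards = estimate_shard_count(7_000_000_000, 2, limit_gb * GB)
        print(f"max_shard_size={limit_gb}GB -> at least {shards} shards")
```

Under these assumptions, a smaller limit multiplies the number of files roughly in proportion, which is the trade-off discussed in the review below.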


HuggingFaceDocBuilderDev commented Oct 19, 2023

The documentation is not available anymore as the PR was closed or merged.

The maximum size for a checkpoint before being sharded. Each checkpoint shard will then be smaller
than this size. If expressed as a string, it needs to be digits followed by a unit (like `"5MB"`).
We default it to 2GB so that models can run easily on free-tier Google Colab instances
without CPU OOM issues.
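As an illustration of what the docstring describes, the following is a minimal sketch of parsing a size string and grouping tensors into shards under that limit. It is not the library's implementation: `parse_size` and `group_into_shards` are hypothetical names, the unit table (decimal for KB/MB/GB, binary for KiB/MiB/GiB) is an assumption, and the greedy grouping is just one simple strategy.

```python
import re
from typing import Dict, List, Union

# Assumed unit table: decimal for KB/MB/GB, binary for KiB/MiB/GiB.
_UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9,
          "KIB": 2**10, "MIB": 2**20, "GIB": 2**30}


def parse_size(size: Union[int, str]) -> int:
    """Convert an int or a string such as "5MB" into a number of bytes."""
    if isinstance(size, int):
        return size
    match = re.fullmatch(r"(\d+)\s*([A-Za-z]+)", size.strip())
    if match is None or match.group(2).upper() not in _UNITS:
        raise ValueError(f"size must be digits followed by a unit, got {size!r}")
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def group_into_shards(tensor_sizes: Dict[str, int],
                      max_shard_size: Union[int, str]) -> List[List[str]]:
    """Greedily group tensor names so each shard stays within the limit.

    A single tensor larger than the limit still gets a shard of its own,
    so that shard can exceed the limit.
    """
    limit = parse_size(max_shard_size)
    shards: List[List[str]] = [[]]
    current = 0
    for name, nbytes in tensor_sizes.items():
        # Start a new shard when adding this tensor would exceed the limit.
        if shards[-1] and current + nbytes > limit:
            shards.append([])
            current = 0
        shards[-1].append(name)
        current += nbytes
    return shards
```

For example, `group_into_shards({"a": 3, "b": 3, "c": 3}, 6)` places `a` and `b` in the first shard and `c` in a second one.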
Member


Maybe 5GB would be a better trade-off, no?

Contributor Author


Yeah makes sense

Collaborator

@ArthurZucker ArthurZucker left a comment


Thanks!

Collaborator

@ArthurZucker ArthurZucker left a comment


Good catch. We can't enforce that the default is the same as from pretrained, but no big deal.

@younesbelkada younesbelkada merged commit 50d0cf4 into main Oct 23, 2023
@younesbelkada younesbelkada deleted the younesbelkada-change-max-shard-size branch October 23, 2023 12:25
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023
* Update modeling_utils.py

* fixup

* let's change it to 5GB

* fix