Fix CPU offload + disk offload tests #27204
e7651c9 to fed5e54
@amyeroberts @patrickvonplaten if you feel uneasy with merging this right before the release, I'm fine with reverting the safetensors serialization by default to let it sit on
amyeroberts left a comment
Thanks for finding the fix so quickly!
@LysandreJik The change LGTM and seems to address some underlying issues. Re default safetensors serialization, I'm happy for it to be part of this release as long as some of the slow tests on the most popular models (bert, llama, wav2vec2, whisper, clip etc.) are good.
        # Initialize weights and apply final processing
        self.post_init()

    def _tie_weights(self):
Ah nice, noticed this as well actually 😅
https://github.com/huggingface/transformers/pull/27203/files#r1378948159
    new_device_map = {}
    for module, device in device_map.items():
        new_device_map.update({p: device for p in param_names if p == module or p.startswith(f"{module}.")})
        new_device_map.update(
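For readers following the snippet above, here is a minimal standalone sketch of the expansion step it shows: a device map keyed by module name is turned into one keyed by parameter name. The function name `expand_to_param_level` and the sample parameter names are hypothetical and chosen for illustration only; the truncated `new_device_map.update(` call in the diff is not reproduced, so this is not the final code from the PR.

```python
def expand_to_param_level(device_map, param_names):
    """Expand a module-level device map into one entry per parameter name.

    Hypothetical helper mirroring the comprehension quoted in the review.
    """
    new_device_map = {}
    for module, device in device_map.items():
        # A parameter belongs to `module` if it is the module itself or
        # lives under it. The trailing "." keeps "encoder" from also
        # matching "encoder2.weight".
        new_device_map.update(
            {p: device for p in param_names if p == module or p.startswith(f"{module}.")}
        )
    return new_device_map


# Illustrative names only, not taken from a real model.
param_names = [
    "encoder.layer.weight",
    "encoder.layer.bias",
    "encoder2.weight",
    "lm_head.weight",
]
device_map = {"encoder": "cpu", "encoder2": "disk", "lm_head": 0}
expanded = expand_to_param_level(device_map, param_names)
```

With these inputs, `encoder2.weight` ends up on `"disk"` rather than `"cpu"`, which is the case the `f"{module}."` prefix check protects against.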
Thanks both for your reviews! I'll go ahead and merge this, sorry but you'll have the conflict Patrick 😁
Fix disk offload tests + weight sharing issues
Switching to safetensors serialization by default highlighted a few issues that we have with safetensors.
This PR fixes these issues, which are principally linked to weight sharing.
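To make the weight sharing point concrete, here is a rough plain-Python sketch. It assumes the problem has the following shape: a tied parameter appears under several names in the state dict, is written only once on save, and therefore has to be re-tied on load. Lists stand in for tensors, and the helpers `drop_aliases` and `retie` are hypothetical; none of this is the actual transformers or safetensors code.

```python
def drop_aliases(state):
    """Keep one name per shared object.

    Returns the deduplicated state and a mapping alias -> kept name.
    """
    kept_by_id = {}
    deduped, aliases = {}, {}
    for name, value in state.items():
        if id(value) in kept_by_id:
            # Same underlying object already kept under another name.
            aliases[name] = kept_by_id[id(value)]
        else:
            kept_by_id[id(value)] = name
            deduped[name] = value
    return deduped, aliases


def retie(deduped, aliases):
    """Restore the dropped names so they point at the kept objects again."""
    restored = dict(deduped)
    for alias, kept in aliases.items():
        restored[alias] = restored[kept]
    return restored


# Lists are stand-ins for tensors; the names are illustrative only.
embedding = [0.1, 0.2, 0.3]
state = {
    "embed_tokens.weight": embedding,
    "lm_head.weight": embedding,  # tied to the embedding
    "norm.weight": [1.0],
}

deduped, aliases = drop_aliases(state)
restored = retie(deduped, aliases)
```

If the re-tying step is skipped, `lm_head.weight` is simply missing after loading, which is the kind of failure a `_tie_weights` method on the model is meant to prevent.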