[core] fix silent bug keep_in_fp32 modules #26589

younesbelkada merged 6 commits into huggingface:main

Conversation
The documentation is not available anymore as the PR was closed or merged.

Before merging this PR I want to test it on T5 models and 8bit tests, as this might affect them.
```python
model = InstructBlipForConditionalGeneration.from_pretrained(
    "Salesforce/instructblip-flan-t5-xl",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
)
```

This is just to make the loading of the model faster.
Relevant T5 and bnb tests are passing, this PR is ready for review!

... and tested the failing instructblip tests on the latest docker image and they pass with these changes.
Thanks for the PR! Do we have a common test that could have failed in this specific instance? If not, would it be possible to work on one? I'm a bit afraid of the repercussions of such a change without a test that ensures the modules that should be kept in fp32 actually are, and that those that shouldn't are kept in their original dtype. It is fixing a silent error, but it also seems like it could break some silent successes.
Hi @LysandreJik - OK, makes sense. I am happy to work on a common test for that - I'll ping you once this is done.
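For reference, a common test along these lines could assert on parameter dtypes directly. Below is a minimal sketch with a toy model; the helper `assert_fp32_modules` and the toy module names are hypothetical illustrations, not the actual transformers test:

```python
import torch
import torch.nn as nn

def assert_fp32_modules(model, keep_in_fp32_modules, other_dtype):
    """Hypothetical check: modules listed in keep_in_fp32_modules must be
    fp32, and everything else must keep the dtype the model was loaded in."""
    for name, param in model.named_parameters():
        # Match full dot-separated components, not raw substrings
        in_fp32_list = any(m in name.split(".") for m in keep_in_fp32_modules)
        expected = torch.float32 if in_fp32_list else other_dtype
        assert param.dtype == expected, f"{name}: {param.dtype} != {expected}"

# Toy model standing in for a real checkpoint
model = nn.Sequential()
model.add_module("wo", nn.Linear(4, 4))
model.add_module("word_embedding", nn.Embedding(10, 4))
model.half()      # simulate loading the model in fp16
model.wo.float()  # simulate _keep_in_fp32_modules = ["wo"] being applied

assert_fp32_modules(model, ["wo"], torch.float16)
```

With the old substring matching, `word_embedding` would also have been upcast, and this check would fail on its dtype.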
Tests are passing, this is ready for another review! |
LysandreJik left a comment
Looks great! Thanks @younesbelkada
What does this PR do?
Same PR as #26484 but without any extra diff
Before this PR we were performing a simple check `if module_name in key`, but that led to some modules being silently converted to fp32.
For example, instructblip models got their `word_embedding` layers converted to fp32 because `_keep_in_fp32_modules` includes `"wo"`, which is contained in the string `word_embedding`. The fix is to check `if module_name in key.split(".")` instead.
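The difference between the two checks can be reproduced with plain strings. The parameter key names below are illustrative, not copied from an actual checkpoint:

```python
# Toy example reproducing the substring pitfall: "wo" is one of the
# T5-style _keep_in_fp32_modules entries.
key = "encoder.block.0.layer.1.DenseReluDense.wo.weight"
embedding_key = "shared.word_embedding.weight"  # illustrative key name

def matches_old(module_name, key):
    # Old check: plain substring test
    return module_name in key

def matches_new(module_name, key):
    # New check: match against full dot-separated name components
    return module_name in key.split(".")

print(matches_old("wo", key))            # True (correct match)
print(matches_old("wo", embedding_key))  # True (false positive: "wo" is a substring of "word_embedding")
print(matches_new("wo", key))            # True (still matches the real "wo" module)
print(matches_new("wo", embedding_key))  # False (false positive fixed)
```

Splitting on `"."` ensures only whole module names in the parameter path are compared, so embeddings are no longer silently upcast.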
I can confirm that with this PR the failing instructblip tests now pass.