Conversation
The documentation is not available anymore as the PR was closed or merged.
```python
inputs_embeds = self.language_model.get_input_embeddings()(input_ids)
if image_patches is not None and past_key_values is None:
    patch_embeddings = self.vision_embed_tokens(image_patches.to(self.vision_embed_tokens.weight.dtype))
```
Let's try to split this into two lines. Also, if we could register a buffer for auto device placement (I tried this when porting), it would be more readable.
Let's keep the split in two lines. I prefer not to introduce a variable for auto device placement.
```diff
- patch_embeddings = self.vision_embed_tokens(image_patches.to(self.vision_embed_tokens.weight.dtype))
+ patch_embeddings = self.vision_embed_tokens(
+     image_patches.to(self.vision_embed_tokens.weight.dtype)
+ ).to(inputs_embeds.device)
```
Is there another way to do this?
I need those two embeddings to be on the same device. I can't use `_no_split_modules` to force those two modules onto the same device, and they don't share weights, so there is no way to guarantee they end up on the same device unless we move them in the forward.
I see. Fine with me!
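The pattern agreed on above can be illustrated with a small self-contained sketch (hypothetical module names and shapes, not the actual Fuyu code): the vision projection's output is explicitly moved to the device of the text embeddings, since under `device_map="auto"` the two modules may be placed on different devices.

```python
import torch
from torch import nn


class TinyMultimodalModel(nn.Module):
    """Hypothetical stand-in for the Fuyu forward, showing the cross-device move."""

    def __init__(self):
        super().__init__()
        self.language_embed = nn.Embedding(10, 4)   # stand-in for the text embedding table
        self.vision_embed_tokens = nn.Linear(4, 4)  # stand-in for the vision projection

    def forward(self, input_ids, image_patches):
        inputs_embeds = self.language_embed(input_ids)
        # Cast the patches to the projection's weight dtype, then move the
        # result to wherever the text embeddings live: with device_map="auto"
        # the two submodules can be dispatched to different devices.
        patch_embeddings = self.vision_embed_tokens(
            image_patches.to(self.vision_embed_tokens.weight.dtype)
        ).to(inputs_embeds.device)
        return torch.cat([inputs_embeds, patch_embeddings], dim=1)


model = TinyMultimodalModel()
out = model(torch.tensor([[1, 2]]), torch.randn(1, 3, 4))
print(out.shape)  # torch.Size([1, 5, 4])
```

On a single CPU this `.to()` is a no-op, so the extra call only costs anything when the modules really are split across devices.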
src/transformers/modeling_utils.py
Outdated
```python
if hasattr(output_embeddings, "out_features") and hasattr(input_embeddings, "num_embeddings"):
    output_embeddings.out_features = input_embeddings.num_embeddings


def _get_no_split_modules(self, device_map):
```
Let's add tests for this, and let's think about the priority as well.
Let's document this too. If `self._no_split_modules` is not `[]`, we should not even check the underlying modules? (for efficiency, maybe)
Also, let's add a CI check to force this to be defined in new models (the `_no_split_modules`)!
(Love the error!)
- For the accelerate tests, this will be tested once this PR is merged (inside `ModelTesterMixin`).
- For the documentation, I will do it.
- If `self._no_split_modules` is not `[]`, we should still check the underlying modules; it would become confusing otherwise. Moreover, I don't think that iterating through `model.modules()` takes a lot of time.
- For the CI, I don't know if we should do it. This would slow down the addition of new models. cc @LysandreJik
Not entirely sure we need to have it defined by default for new models, especially as some models are on the smaller side and don't need it.
OK! What I mean by a test in the CI is to make sure that new models define `_no_split_modules` to support accelerate, not to test the model! 😉 But it might already be there.
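The behaviour settled on here (always walking the underlying modules, even when the top-level class declares `_no_split_modules`) can be sketched as follows. This is a simplified, hypothetical lookup, not the actual `_get_no_split_modules` implementation in `modeling_utils.py`:

```python
from torch import nn


def get_no_split_modules(model: nn.Module) -> set:
    """Sketch of a _no_split_modules lookup that checks the whole tree:
    every submodule is visited via model.modules(), and any class-level
    _no_split_modules list it declares is merged into the result."""
    found = set()
    for module in model.modules():  # includes the model itself
        no_split = getattr(module, "_no_split_modules", None)
        if no_split:
            found.update(no_split)
    return found


class Block(nn.Module):
    # Blocks of this class must not be split across devices.
    _no_split_modules = ["Block"]


class Wrapper(nn.Module):
    # The wrapper repeats the declaration; the traversal deduplicates it.
    _no_split_modules = ["Block"]

    def __init__(self):
        super().__init__()
        self.inner = Block()


print(get_no_split_modules(Wrapper()))  # {'Block'}
```

Iterating over `model.modules()` is cheap relative to weight loading, which supports the decision above not to short-circuit when the outer list is non-empty.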
ArthurZucker
left a comment
Thanks 🤗 @LysandreJik for a second 👁️
LysandreJik
left a comment
LATM (looks awesome to me)
* add _no_split_modules
* style
* fix _no_split_modules
* add doc
What does this PR do?
This PR adds the possibility to use the Fuyu model with `device_map="auto"`. I ran some tests on my side and it works with multi-GPU and CPU offload. I am not writing any accelerate tests, as they will be contained in a future `FuyuModelTest` class which will inherit from `ModelTesterMixin`.

cc @younesbelkada