Add fuyu device map #26949

Merged
SunMarc merged 4 commits into huggingface:main from SunMarc:add_fuyu_device_map
Oct 24, 2023

Conversation

@SunMarc (Member) commented Oct 19, 2023

What does this PR do?

This PR adds support for loading the Fuyu model with `device_map="auto"`. I ran some tests on my side and it works with multi-GPU and CPU offload. I am not writing any Accelerate tests, as they will be covered by a future `FuyuModelTest` class that inherits from `ModelTesterMixin`.

cc @younesbelkada
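For context on what `device_map="auto"` resolves to, here is a hedged, torch-free sketch of how a device map assigns submodules to devices. The module names and the longest-prefix lookup below are illustrative of how Accelerate-style maps behave, not the exact Fuyu module layout:

```python
# Illustrative sketch only: the module names are hypothetical, not the
# exact names used by the Fuyu implementation in transformers.
device_map = {
    "vision_embed_tokens": 0,                # vision projection on GPU 0
    "language_model.model.embed_tokens": 0,  # text embeddings on GPU 0
    "language_model.model.layers.0": 0,
    "language_model.model.layers.1": 1,      # rest of the stack on GPU 1
    "language_model.lm_head": "cpu",         # CPU offload for the head
}

def device_of(param_name, device_map):
    """Longest-prefix match: the most specific map entry wins, mirroring
    how a per-submodule device map is resolved to a parameter's device."""
    best = None
    for prefix, dev in device_map.items():
        if param_name == prefix or param_name.startswith(prefix + "."):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, dev)
    return best[1] if best else None

print(device_of("language_model.model.layers.1.self_attn.q_proj.weight", device_map))  # 1
print(device_of("language_model.lm_head.weight", device_map))  # cpu
```

With multi-GPU plus CPU offload, as tested in this PR, a resolved map mixes integer GPU indices and `"cpu"` entries like the one above.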

@HuggingFaceDocBuilderDev commented Oct 19, 2023

The documentation is not available anymore as the PR was closed or merged.

```diff
  inputs_embeds = self.language_model.get_input_embeddings()(input_ids)
  if image_patches is not None and past_key_values is None:
-     patch_embeddings = self.vision_embed_tokens(image_patches.to(self.vision_embed_tokens.weight.dtype))
+     patch_embeddings = self.vision_embed_tokens(
+         image_patches.to(self.vision_embed_tokens.weight.dtype)
+     ).to(inputs_embeds.device)
```
Collaborator:

Let's try to split this into two lines. Also, if we could register a buffer for automatic device placement or something similar (I tried this when porting), it would be more readable.

@SunMarc (Member, Author):

Let's keep the split in two lines. I prefer not to introduce a variable for auto device placement.

@SunMarc SunMarc requested a review from ArthurZucker October 23, 2023 15:17
@ArthurZucker (Collaborator) left a comment:

A lot better!

```diff
- patch_embeddings = self.vision_embed_tokens(image_patches.to(self.vision_embed_tokens.weight.dtype))
+ patch_embeddings = self.vision_embed_tokens(
+     image_patches.to(self.vision_embed_tokens.weight.dtype)
+ ).to(inputs_embeds.device)
```
Collaborator:

is there another way to do this?

@SunMarc (Member, Author) commented Oct 23, 2023:

I need those two embeddings to be on the same device. I can't use `_no_split_modules` to force those two modules onto the same device. Moreover, those two modules don't share weights, so there is no way to make sure they are on the same device unless we move them in the forward pass.
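The point above can be sketched without torch: when the two submodules land on different devices, the vision output has to hop to the language embeddings' device before the merge. `FakeTensor` below is a hypothetical stand-in for `torch.Tensor`, used only so the sketch is self-contained; the real code simply calls `.to(inputs_embeds.device)`:

```python
# Torch-free sketch of the device-hop pattern from this PR's diff.
# FakeTensor is a hypothetical stand-in for torch.Tensor.
class FakeTensor:
    def __init__(self, data, device="cuda:0"):
        self.data, self.device = data, device

    def to(self, device):
        # Mirrors torch.Tensor.to(device): returns a copy on the target device.
        return FakeTensor(self.data, device)

# Under a multi-GPU device map the two halves may land on different devices:
inputs_embeds = FakeTensor([0.1, 0.2], device="cuda:1")     # language side
patch_embeddings = FakeTensor([0.3, 0.4], device="cuda:0")  # vision side

# Mirrors `.to(inputs_embeds.device)` from the diff in this PR:
patch_embeddings = patch_embeddings.to(inputs_embeds.device)
assert patch_embeddings.device == inputs_embeds.device  # safe to merge now
```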

Collaborator:

I see. Fine with me!

```python
if hasattr(output_embeddings, "out_features") and hasattr(input_embeddings, "num_embeddings"):
    output_embeddings.out_features = input_embeddings.num_embeddings

def _get_no_split_modules(self, device_map):
```
Collaborator:

Let's add tests for this, and think about the priority as well.
Let's document this too. If `self._no_split_modules` is not `[]`, should we even check the underlying modules? (for efficiency, maybe)
Also, let's add a CI check to force `_no_split_modules` to be defined in new models!

(Love the error!)

@SunMarc (Member, Author) commented Oct 23, 2023:

  • For the Accelerate tests, this will be tested once this PR is merged (inside `ModelTesterMixin`).
  • For the documentation, I will do it.
  • If `self._no_split_modules` is not `[]`, we should still check the underlying modules; it would be confusing otherwise. Moreover, I don't think iterating through `model.modules()` takes a lot of time.
  • For the CI, I don't know if we should do it. This would slow down the addition of new models. cc @LysandreJik
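The check being discussed can be sketched in plain Python (torch-free stand-ins; the class names and the `PersimmonDecoderLayer` entry are hypothetical, chosen only to resemble a Fuyu-like layout): walk `model.modules()` and verify that every name listed in `_no_split_modules` actually exists in the module tree.

```python
# Hedged sketch of the validation idea; plain-Python stand-ins replace
# torch.nn.Module so the example is self-contained. All class names here
# are hypothetical.
class Module:
    def modules(self):
        # Yield self, then recurse into children, like nn.Module.modules().
        yield self
        for child in getattr(self, "children_", []):
            yield from child.modules()

class FuyuVisionEmbedTokens(Module):
    pass

class PersimmonDecoderLayer(Module):
    pass

class ToyFuyuModel(Module):
    _no_split_modules = ["PersimmonDecoderLayer"]

    def __init__(self):
        self.children_ = [FuyuVisionEmbedTokens(), PersimmonDecoderLayer()]

def check_no_split_modules(model):
    """Raise if _no_split_modules names a class absent from the tree."""
    present = {type(m).__name__ for m in model.modules()}
    missing = [n for n in model._no_split_modules if n not in present]
    if missing:
        raise ValueError(f"_no_split_modules lists unknown modules: {missing}")
    return model._no_split_modules

print(check_no_split_modules(ToyFuyuModel()))  # ['PersimmonDecoderLayer']
```

A CI check along these lines would only assert that the attribute is defined and consistent, not exercise the model itself.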

Member:

Not entirely sure we need to have it defined by default for new models, especially as some models are on the smaller side and don't need it

Collaborator:

OK! What I mean by a test in the CI is making sure that new models define `_no_split_modules` to support Accelerate, not testing the model! 😉 But it might already be there.

@SunMarc SunMarc requested a review from ArthurZucker October 23, 2023 20:23
@ArthurZucker (Collaborator) left a comment:

Thanks 🤗 @LysandreJik for a second 👁️

@LysandreJik (Member) left a comment:

LATM (looks awesome to me)

@SunMarc SunMarc merged commit 41496b9 into huggingface:main Oct 24, 2023
i4never pushed a commit to i4never/transformers that referenced this pull request Oct 25, 2023
* add _no_split_modules

* style

* fix _no_split_modules

* add doc
EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 19, 2023
* add _no_split_modules

* style

* fix _no_split_modules

* add doc

4 participants