
Change dispatch_model when we have only one device #1648

Merged

SunMarc merged 6 commits into huggingface:main from SunMarc:fix_dispatch_one_device on Jun 27, 2023


Conversation

SunMarc (Member) commented Jun 27, 2023

What does this PR do?

This PR changes the behavior of the dispatch_model function when the device_map has only one device. We no longer attach the hooks, and the user is able to move the model with .to(). This change breaks the case where the device_map is {"": "cpu"} or {"": "disk"}, where the hooks used to dispatch the weights to the main_device. However, this fix will lead to fewer issues from users who want to use device_map as a regular device parameter.
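The decision the PR describes can be sketched with a plain-Python stand-in (the helper name needs_hooks and the simplified check are illustrative, not accelerate's actual API):

```python
def needs_hooks(device_map):
    # Hooks are only attached when the device_map contains more than one
    # entry; with a single entry the model stays an ordinary module that
    # the caller can move with .to(). (The PR's condition counts entries
    # via len(device_map.values()).)
    return len(device_map.values()) > 1


print(needs_hooks({"": 0}))                           # single device: no hooks
print(needs_hooks({"encoder": 0, "decoder": "cpu"}))  # multiple devices: hooks
```

Note that in accelerate, the keys of a device_map are module names and the values are devices; the sketch above only models the entry-count check.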

@SunMarc SunMarc requested a review from sgugger June 27, 2023 16:11

HuggingFaceDocBuilderDev commented Jun 27, 2023

The documentation is not available anymore as the PR was closed or merged.

sgugger (Collaborator) left a comment

Thanks for working on this!

main_device = "cpu"
# We attach hooks if the device_map has at least 2 different devices. Otherwise, the model is already loaded
# on the unique device and the user can decide where to dispatch the model.
if len(device_map.values()) > 1:
sgugger (Collaborator):

I think it's missing an else branch that calls model.to() with the only device in the dict.
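The suggested else branch can be illustrated with a stdlib-only stand-in (StubModel and dispatch_stub are hypothetical, not accelerate code; the real function attaches device-alignment hooks in the multi-device branch):

```python
class StubModel:
    """Minimal stand-in for a torch model exposing .to()."""

    def __init__(self):
        self.device = None

    def to(self, device):
        self.device = device
        return self


def dispatch_stub(model, device_map):
    if len(device_map.values()) > 1:
        # Multi-device case: this is where accelerate would attach hooks.
        pass
    else:
        # Single-device case: just move the model, unless it lives on disk.
        device = next(iter(device_map.values()))
        if device != "disk":
            model.to(device)
    return model


model = dispatch_stub(StubModel(), {"": "cuda:0"})
print(model.device)  # cuda:0
```

The "disk" guard matters because a model offloaded to disk has no device to move to, so .to() must be skipped in that case.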

SunMarc (Member, Author):

Oh yeah, we can add that to be sure. However, the model should already be on the right device if the user uses load_checkpoint_in_model or loads the model with the from_pretrained method.

sgugger (Collaborator):

Yes, but the user might be calling dispatch_model by themselves.

@SunMarc SunMarc merged commit 5ea7c81 into huggingface:main Jun 27, 2023
@SunMarc SunMarc deleted the fix_dispatch_one_device branch June 27, 2023 18:58

zero1zero commented Jun 27, 2023

For anyone searching: once this change is pulled down, using a device_map like the one below raises the following exception:

Traceback (most recent call last)

[<ipython-input-3-34cc1538f877>](https://localhost:8080/#) in <cell line: 14>()
     12 
     13 tokenizer = AutoTokenizer.from_pretrained(model_id)
---> 14 model = AutoModelForCausalLM.from_pretrained(model_id, 
     15                                              quantization_config=bnb_config,
     16                                              device_map={"":0}, trust_remote_code=True)


[/usr/local/lib/python3.10/dist-packages/accelerate/big_modeling.py](https://localhost:8080/#) in dispatch_model(model, device_map, main_device, state_dict, offload_dir, offload_index, offload_buffers, skip_keys, preload_module_classes)
    387         retie_parameters(model, tied_params)
    388     else:
--> 389         device = device_map.values()[0]
    390         if device != "disk":
    391             model.to(device)

TypeError: 'dict_values' object is not subscriptable
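The failing line indexes a dict_values view, which is not subscriptable in Python 3. A minimal reproduction and the two usual fixes (whether the follow-up PR uses exactly one of these forms is an assumption):

```python
device_map = {"": 0}

# Reproduces the error: dict views support iteration, not indexing.
try:
    device_map.values()[0]
except TypeError as e:
    print(e)  # 'dict_values' object is not subscriptable

# Fix 1: take the first value without building a list.
device = next(iter(device_map.values()))

# Fix 2: materialize the view, then index it.
device = list(device_map.values())[0]

print(device)  # 0
```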

@SunMarc SunMarc mentioned this pull request Jun 27, 2023
SunMarc (Member, Author) commented Jun 27, 2023

Thanks for testing @zero1zero. Should be fixed in this PR.
