Hi Tim,
Thanks for your awesome work!
I'm using your method to load the largest BLOOM model (176B parameters) onto a single node with 8 GPUs:
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bloom",
    device_map="auto",
    load_in_8bit=True,
)
This code works for all the smaller BLOOM models, e.g. bloom-7b1. However, when loading bloom (176B) I get the error "8-bit operations on bitsandbytes are not supported under CPU!":
File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 463, in from_pretrained
return model_class.from_pretrained(
File "/opt/conda/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2182, in from_pretrained
raise ValueError("8-bit operations on `bitsandbytes` are not supported under CPU!")
ValueError: 8-bit operations on `bitsandbytes` are not supported under CPU!
In my understanding, this happens because some modules of the model are automatically placed on the CPU, which doesn't happen with the smaller models. Is there a way to force the model to load onto the GPUs only? Or do you have any advice on how to work around this error? Thanks!!
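In case it helps clarify what I'm after: one thing I was considering is passing a `max_memory` map to `from_pretrained` that gives the CPU a zero budget, so the "auto" device map would have to place every module on the 8 GPUs. I'm not sure this is the right knob, and the per-GPU budget below is just a guess for my node:

```python
# Hypothetical workaround (untested): build a max_memory map that
# leaves no room on CPU, so device_map="auto" spreads all modules
# across the 8 GPUs. The "80GiB" per-GPU budget is an assumption
# for my hardware, not a recommended value.
max_memory = {i: "80GiB" for i in range(8)}
max_memory["cpu"] = "0GiB"  # forbid CPU offload

# I would then pass it like this:
# model = AutoModelForCausalLM.from_pretrained(
#     "bloom",
#     device_map="auto",
#     load_in_8bit=True,
#     max_memory=max_memory,
# )
print(max_memory)
```

Would something along these lines be expected to work, or does the auto device map decide CPU placement for another reason?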
Tianwei