Description
System Info
Python 3.10.6
Transformers 4.30.0
Bitsandbytes 0.39.1
Windows / Linux
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Using a 4- or 8-bit quantized model such as:
https://huggingface.co/Mediocreatmybest/blip2-opt-2.7b_8bit
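A minimal sketch of what I'm running, based on the pipeline examples from the Hugging Face website (the task name and `device_map` setting are illustrative, and the image URL is a placeholder):

```python
import torch
from transformers import pipeline

if __name__ == "__main__":
    # Load a BLIP2 checkpoint that was saved in 8-bit with bitsandbytes.
    # device_map="auto" lets accelerate place the quantized weights.
    pipe = pipeline(
        "image-to-text",
        model="Mediocreatmybest/blip2-opt-2.7b_8bit",
        model_kwargs={"device_map": "auto"},
    )

    # The model weights load in 8-bit, but the image processor still emits
    # float32 pixel_values, which triggers the dtype-mismatch error here:
    print(pipe("path/to/some/image.png"))
```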
Expected behavior
The pipeline's image processor should detect that the model was loaded in 4- or 8-bit with bitsandbytes and cast its inputs to the matching dtype.
I apologise if this should be a feature request rather than a bug report; I couldn't find any examples of what I was trying to do.
When running through the pipeline examples from the Hugging Face website with an 8-bit model, the model itself is detected correctly and cast to 8-bit, but the processor doesn't follow suit and runs at its default precision, throwing an error that both should be set to the same floating-point type.
I've uploaded a few models saved in 8-bit to cut down on size and memory; BLIP2 is pretty heavy, so using it on consumer devices is obviously challenging.
The models I’ve uploaded to HuggingFace are:
Mediocreatmybest/blip2-opt-2.7b_8bit
Mediocreatmybest/blip2-opt-6.7b_8bit
Mediocreatmybest/blip2-flan-t5-xxl_8bit
I can get them working by loading the model and processor manually, but as a beginner I find that obviously challenging. Thanks again for all the great work!
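For reference, a rough sketch of the manual workaround: cast only the floating-point tensors in the processor output (e.g. `pixel_values`) to the model's compute dtype, leaving integer tensors like `input_ids` alone. The helper name is my own, and float16 as the target dtype is an assumption (it matches the usual bitsandbytes 8-bit setup):

```python
import torch

def cast_float_inputs(batch: dict, dtype: torch.dtype) -> dict:
    """Cast only floating-point tensors to `dtype`;
    integer tensors such as input_ids are left untouched."""
    return {
        k: v.to(dtype) if torch.is_tensor(v) and v.is_floating_point() else v
        for k, v in batch.items()
    }

# Dummy processor-style output to show the effect:
batch = {
    "pixel_values": torch.zeros(1, 3, 224, 224),  # float32 by default
    "input_ids": torch.tensor([[101, 102]]),      # integer token ids
}
casted = cast_float_inputs(batch, torch.float16)
# pixel_values is now float16; input_ids keeps its integer dtype
```

In practice I apply this to the processor output right before calling `model.generate(**casted)`.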