[Feature request] Support GPTQ quantization #39
Open
Labels: feature request, help wanted, on roadmap
I downloaded a GPTQ Llama model from TheBloke that is already 4-bit quantized. I have to pass False for the load_in_4bit parameter of:
because if I don't, I get an error thrown saying:
But, if I pass in False for load_in_4bit, this code makes bnb_config be None:
and that makes quantization_config be None as well:
and that crashes here:
with the error message:
So I'm not sure how to LoRA train this llama model. Any thoughts?
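For anyone trying to follow the chain above without the original snippets, here is a minimal, self-contained sketch of the failure pattern as described. Every name in it (QuantConfig, build_bnb_config, prepare_for_lora, load) is a hypothetical stand-in, not the library's actual code, and the exception it raises is not the real error message from the report.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuantConfig:
    """Hypothetical stand-in for a bitsandbytes-style quantization config."""
    bits: int


def build_bnb_config(load_in_4bit: bool) -> Optional[QuantConfig]:
    # Mirrors the described behaviour: no config is built when load_in_4bit is False.
    return QuantConfig(bits=4) if load_in_4bit else None


def prepare_for_lora(quantization_config: Optional[QuantConfig]) -> str:
    # Hypothetical downstream step that assumes a config is always present.
    # With None this raises AttributeError (an assumption for illustration only).
    return f"{quantization_config.bits}-bit"


def load(load_in_4bit: bool) -> str:
    bnb_config = build_bnb_config(load_in_4bit)
    quantization_config = bnb_config  # None propagates unchanged
    return prepare_for_lora(quantization_config)
```

Calling load(True) works, while load(False) fails in the downstream step because nothing along the way handles the "already quantized, no bitsandbytes config" case. That is the gap a pre-quantized GPTQ checkpoint falls into.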