`.to` is not supported for 8-bit models  #23336

Description

@lborcard

System Info

Hi,

I am using a Llama model and wanted to add it to a pipeline class, but it throws an error when the pipeline is built.
Does anyone have a solution to this?
Thank you!

Who can help?

@Narsil

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Model

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
    load_in_8bit=True,
    max_memory=max_memory,
)

LLM class

class CustomLLM(LLM):

    pipeline = pipeline("text-generation", tokenizer=tokenizer, model=model, device="cuda:0")

    def _call(self, prompt: str, stop: Optional[List[str]] = None) -> str:
        prompt_length = len(prompt)
        response = self.pipeline(prompt, max_new_tokens=num_output)[0]["generated_text"]

        # only return newly generated tokens
        return response[prompt_length:]

    @property
    def _identifying_params(self) -> Mapping[str, Any]:
        return {"name_of_model": self.model_name}

    @property
    def _llm_type(self) -> str:
        return "custom"
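The failure comes from passing device="cuda:0" to pipeline(): when the model was loaded with load_in_8bit=True and device_map="auto", accelerate has already placed it on the right devices, and the pipeline's attempt to move it with model.to(device) hits the guard in the traceback. A minimal sketch of a workaround, assuming the is_loaded_in_8bit flag from the traceback and the hf_device_map attribute set by device_map loading (the pipeline_kwargs helper itself is hypothetical, not a transformers API):

```python
# Hypothetical helper: build kwargs for pipeline(...), dropping `device`
# when the model is already dispatched (8-bit and/or device_map="auto"),
# because passing device= makes the pipeline call model.to(device), which
# raises ValueError for 8-bit models.
def pipeline_kwargs(model, tokenizer, device=None):
    kwargs = {"model": model, "tokenizer": tokenizer}
    already_placed = getattr(model, "is_loaded_in_8bit", False) or getattr(
        model, "hf_device_map", None
    )
    if device is not None and not already_placed:
        kwargs["device"] = device
    return kwargs
```

Usage would then be pipeline("text-generation", **pipeline_kwargs(model, tokenizer, device="cuda:0")); for an 8-bit model the device argument is silently dropped and the pipeline uses the placement accelerate already chose.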

Expected behavior

   1879     # Checks if the model has been loaded in 8-bit
   1880     if getattr(self, "is_loaded_in_8bit", False):
-> 1881         raise ValueError(
   1882             "`.to` is not supported for `8-bit` models. Please use the model as it is, since the"
   1883             " model has already been set to the correct devices and casted to the correct `dtype`."
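For reference, the guard above can be reproduced standalone; the Stub class below is a stand-in for transformers' PreTrainedModel, with the flag name taken from the traceback:

```python
# Stand-alone reproduction of the guard in the traceback (Stub is a
# stand-in for PreTrainedModel, not the real class).
class Stub:
    is_loaded_in_8bit = True

    def to(self, device):
        # Mirrors the check shown at modeling_utils lines 1879-1881.
        if getattr(self, "is_loaded_in_8bit", False):
            raise ValueError(
                "`.to` is not supported for `8-bit` models. Please use the model "
                "as it is, since the model has already been set to the correct "
                "devices and casted to the correct `dtype`."
            )
        return self
```

Calling Stub().to("cuda:0") raises the same ValueError, which is exactly what happens when pipeline() receives a device argument for an 8-bit model.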
