KeyError: lm_head.weight in GemmaForCausalLM.load_weights when loading finetuned Gemma 2B #3323

@patleeman

Hello,

I finetuned Gemma 2B with Unsloth. It uses LoRA and merges the weights back into the base model.

When I try to load this model, it gives me the following error:

...
File "/home/ubuntu/projects/cql-ml/.venv/lib/python3.10/site-packages/vllm/model_executor/model_loader.py", line 86, in get_model
    model.load_weights(model_config.model, model_config.download_dir,
File "/home/ubuntu/projects/cql-ml/.venv/lib/python3.10/site-packages/vllm/model_executor/models/gemma.py", line 339, in load_weights
    param = params_dict[name]
KeyError: 'lm_head.weight'

My pytorch_model.bin.index.json looks like this:

{
  "metadata": {
    "total_size": 6060920832
  },
  "weight_map": {
    "lm_head.weight": "pytorch_model-00002-of-00002.bin",
    "model.embed_tokens.weight": "pytorch_model-00001-of-00002.bin",
    "model.layers.0.input_layernorm.weight": "pytorch_model-00001-of-00002.bin",
    "model.layers.0.mlp.down_proj.weight": "pytorch_model-00001-of-00002.bin",
    "model.layers.0.mlp.gate_proj.weight": "pytorch_model-00001-of-00002.bin",
...
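For anyone who wants to check whether their own checkpoint has the same entry, here is a minimal sketch that scans an index file's weight_map for lm_head keys. The helper name find_head_entries and the inline sample string are made up for illustration; only the JSON layout comes from the index shown above.

```python
import json
from typing import Dict

# Hypothetical helper (not part of vLLM or Unsloth): return the weight_map
# entries whose key contains `needle`.
def find_head_entries(index_text: str, needle: str = "lm_head.weight") -> Dict[str, str]:
    index = json.loads(index_text)
    weight_map = index.get("weight_map", {})
    return {name: shard for name, shard in weight_map.items() if needle in name}

# Shortened sample in the same layout as pytorch_model.bin.index.json above.
sample_index = """
{
  "metadata": {"total_size": 6060920832},
  "weight_map": {
    "lm_head.weight": "pytorch_model-00002-of-00002.bin",
    "model.embed_tokens.weight": "pytorch_model-00001-of-00002.bin"
  }
}
"""

head_entries = find_head_entries(sample_index)
print(head_entries)
```

In practice you would read the text from the real pytorch_model.bin.index.json instead of the inline sample. A non-empty result means the checkpoint carries a separate lm_head.weight tensor.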

I saw a similar check for lm_head.weight in a few of the other model classes, so I replicated it in GemmaForCausalLM.load_weights. With that change, the model loads correctly and works as intended.

The modified load_weights function:

[code block not preserved]
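The exact code is not reproduced here, so the following is only a minimal sketch of the kind of check described above, not the actual vLLM implementation. It assumes the loader iterates over (name, weight) pairs from the checkpoint and looks each name up in params_dict, as the traceback suggests. Plain Python lists stand in for tensors so the sketch runs without vLLM or PyTorch, and the function name load_weights_sketch is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

def load_weights_sketch(
    params_dict: Dict[str, List[float]],
    checkpoint_weights: Iterable[Tuple[str, List[float]]],
) -> List[str]:
    """Copy checkpoint weights into params_dict; return the names skipped."""
    skipped = []
    for name, loaded_weight in checkpoint_weights:
        # Assumption: the model has no separate lm_head parameter, so a
        # checkpoint entry for it has nowhere to go and is skipped
        # instead of raising KeyError.
        if "lm_head.weight" in name and name not in params_dict:
            skipped.append(name)
            continue
        param = params_dict[name]
        param[:] = loaded_weight  # in-place copy, standing in for a tensor copy
    return skipped

# Toy stand-ins for the model parameters and the checkpoint contents.
params = {"model.embed_tokens.weight": [0.0, 0.0]}
checkpoint = [
    ("lm_head.weight", [1.0, 2.0]),
    ("model.embed_tokens.weight", [1.0, 2.0]),
]
skipped_names = load_weights_sketch(params, checkpoint)
print(skipped_names)
```

Skipping only when the name is missing from params_dict keeps the sketch safe for models that do have a separate output head. Whether dropping the tensor is correct depends on lm_head.weight really being a copy of the embedding weights in the merged checkpoint, which I have not verified.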

I'm not sure if this is an issue with vllm, or an issue with the output of Unsloth. The model works correctly when load_weights is modified. I don't know what the internals of the model should look like. Any help would be appreciated!

I'm unsure if this is related to #2816

My model is private, so unfortunately I can't share it. However, I found this other model on Hugging Face that was trained with the same tool and also has lm_head.weight in its index.

If the modified load_weights function is the desired fix, I can submit a PR if that will help.

Thank you for the help!
