[Bug] `lm_head` is not trained using LoRA and merging is broken

Latest version of Unsloth (Unsloth 2026.2.1)

After a full training run, I noticed that in `adapter_config.json` the `target_modules` key did not include `lm_head`, while `modules_to_save` did. This resulted in the following error when attempting to merge:

`RuntimeError: Unsloth: Extracted keys = {'lm_head.weight'} do not match!`

The error is raised in `saving_utils` from `unsloth_zoo`: [exact line raising the error, saving_utils.py#L303](https://github.com/unslothai/unsloth-zoo/blob/984f31194b227efcaf2905c2ebcc1b646d165330/unsloth_zoo/saving_utils.py#L303)

Suspected cause: `lm_head` is silently being **filtered out** of one list or **included** in another somewhere in the code, which causes issues later at merge time.


How to reproduce (Reproducible in `Colab`):

- Load the PEFT model with `lm_head` in `target_modules` (in the `get_peft_model` call)
  - if weight tying is involved, it will automatically do the right thing (LoRA on top or not during training)

Colab example: [Qwen3-4B-Thinking Notebook](https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/Qwen3_(4B)-Thinking.ipynb). Modify the `get_peft_model` call as follows:

```
model = FastLanguageModel.get_peft_model(
    model,
    r = 32, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["lm_head", "q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],                 # Added "lm_head" to target_modules
    lora_alpha = 32,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)
```

- Train the model
- Inspect the `adapter_config.json` of the resulting trained LoRA adapter: `lm_head` is missing from `target_modules` but present in `modules_to_save`
- Attempt merging using `save_pretrained_merged`

Within the same notebook, in the "Saving float16 for VLLM" section, switch the first `if` clause to `True`:
```
if True:
    model.save_pretrained_merged("qwen_finetune_16bit", tokenizer, save_method = "merged_16bit",)
```

- The following exception is then raised:

`RuntimeError: Unsloth: Extracted keys = {'lm_head.weight'} do not match!`
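
The `adapter_config.json` check from the reproduction steps can be scripted. Below is a minimal sketch, assuming the adapter was saved to a local directory; the helper names `lm_head_placement` and `check_adapter_dir` are hypothetical and not part of Unsloth or PEFT. Only the `target_modules` and `modules_to_save` keys are read.

```python
import json
from pathlib import Path


def lm_head_placement(config: dict, module: str = "lm_head") -> dict:
    """Report whether `module` is listed in target_modules / modules_to_save.

    Hypothetical helper for illustration; it only inspects the parsed
    adapter_config.json and does not touch the model.
    """
    target = config.get("target_modules") or []
    saved = config.get("modules_to_save") or []
    # target_modules can also be a single string (a pattern); it is only
    # compared for an exact match here, the pattern is not evaluated.
    if isinstance(target, str):
        target = [target]
    return {
        "in_target_modules": module in target,
        "in_modules_to_save": module in saved,
    }


def check_adapter_dir(adapter_dir: str) -> dict:
    """Load adapter_config.json from a saved adapter directory and check it."""
    with open(Path(adapter_dir) / "adapter_config.json") as f:
        return lm_head_placement(json.load(f))
```

With the behavior described in this issue, calling `check_adapter_dir` on the saved adapter directory would report `in_target_modules` as `False` and `in_modules_to_save` as `True`.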


Additional related information (expected behavior):
- https://github.com/huggingface/peft/issues/2864
