Add LoRA support for Gemma 4 (Gemma4ClippableLinear)

## Problem

Gemma 4 models (released April 2, 2026) use `Gemma4ClippableLinear` in their vision and audio towers. This module wraps a standard `nn.Linear` with optional input/output clamping but is not recognized by PEFT's LoRA dispatch, causing a `ValueError` when attempting fine-tuning:

```
ValueError: Target module Gemma4ClippableLinear(
  (linear): Linear(in_features=768, out_features=768, bias=False)
) is not supported.
```

This affects all Gemma 4 variants (E2B, E4B, 26B-A4B, 31B) when targeting modules like `q_proj` that appear in both the language model (standard `nn.Linear`) and vision/audio towers (`Gemma4ClippableLinear`).

## Root Cause

`Gemma4ClippableLinear` is an `nn.Module` (not an `nn.Linear` subclass) that wraps a standard `nn.Linear`:

```python
import torch
from torch import nn


class Gemma4ClippableLinear(nn.Module):
    def __init__(self, config, in_features, out_features):
        super().__init__()
        self.use_clipped_linears = config.use_clipped_linears
        self.linear = nn.Linear(in_features, out_features, bias=False)
        if self.use_clipped_linears:
            self.register_buffer("input_min", torch.tensor(-float("inf")))
            self.register_buffer("input_max", torch.tensor(float("inf")))
            self.register_buffer("output_min", torch.tensor(-float("inf")))
            self.register_buffer("output_max", torch.tensor(float("inf")))

    def forward(self, hidden_states):
        if self.use_clipped_linears:
            hidden_states = torch.clamp(hidden_states, self.input_min, self.input_max)
        hidden_states = self.linear(hidden_states)
        if self.use_clipped_linears:
            hidden_states = torch.clamp(hidden_states, self.output_min, self.output_max)
        return hidden_states
```

Because it doesn't subclass `nn.Linear`, `dispatch_default` doesn't match it, and `_get_in_out_features` can't extract dimensions from it.
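
The mismatch can be reproduced without loading a Gemma 4 checkpoint. The snippet below is a minimal sketch that uses a hypothetical stand-in class (`ClippableLinearStandIn`, not part of transformers) with the same structure as the wrapper, and shows why an `isinstance`-based check and a direct attribute lookup both miss it:

```python
from torch import nn


class ClippableLinearStandIn(nn.Module):
    """Hypothetical stand-in that mirrors the wrapper's structure (no config, no clamping)."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)

    def forward(self, hidden_states):
        return self.linear(hidden_states)


wrapper = ClippableLinearStandIn(768, 768)

# The wrapper is not an nn.Linear, so an isinstance-based dispatch skips it.
is_linear = isinstance(wrapper, nn.Linear)

# The dimensions live on the inner layer, not on the wrapper itself.
has_in_features = hasattr(wrapper, "in_features")
inner_dims = (wrapper.linear.in_features, wrapper.linear.out_features)
```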

## Workaround

Users can work around this today with `exclude_modules`:

```python
lora_config = LoraConfig(
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    exclude_modules=["vision_tower", "audio_tower"],
)
```

This avoids the `Gemma4ClippableLinear` modules entirely, but it also prevents fine-tuning the vision and audio tower projections.

For users who need to target these modules, a monkey-patch works for basic training (but breaks merge/unmerge):

```python
from peft.tuners.lora import model as lora_model
from transformers.models.gemma4.modeling_gemma4 import Gemma4ClippableLinear

# Keep a handle on the original factory so the patch can delegate to it.
_original = lora_model.LoraModel._create_new_module


@classmethod
def _patch(cls, lora_config, adapter_name, target, **kwargs):
    # Hand PEFT the inner nn.Linear when the target is the clippable wrapper.
    if isinstance(target, Gemma4ClippableLinear):
        target = target.linear
    return _original(lora_config, adapter_name, target, **kwargs)


# Install the patch before calling get_peft_model().
lora_model.LoraModel._create_new_module = _patch
```

## Analysis of a Proper Fix

We attempted several approaches and identified the code paths that need updating for full support:

### 1. `dispatch_default` in `layer.py`
Add an `elif` for wrapper modules with an inner `.linear`. The wrapper should be passed as `base_layer` (not the inner linear) to preserve the wrapper's forward behavior.
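
To illustrate why the wrapper (rather than its inner linear) should be the base layer, here is a rough sketch of a LoRA layer built around a wrapper. `ClampedWrapperStandIn` and `LoraOverWrapperSketch` are hypothetical names invented for this example, and the code is not PEFT's implementation; it only shows that calling the whole wrapper keeps its clamping in the forward pass.

```python
import torch
from torch import nn


class ClampedWrapperStandIn(nn.Module):
    """Hypothetical stand-in: an inner nn.Linear with a fixed output clamp."""

    def __init__(self, in_features, out_features, output_max=0.1):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)
        self.output_max = output_max

    def forward(self, hidden_states):
        return torch.clamp(self.linear(hidden_states), max=self.output_max)


class LoraOverWrapperSketch(nn.Module):
    """Hypothetical LoRA layer that keeps the whole wrapper as its base layer."""

    def __init__(self, base_layer, r=4, alpha=8):
        super().__init__()
        self.base_layer = base_layer  # the wrapper, so its clamping still runs
        inner = base_layer.linear     # used only to read the dimensions
        self.lora_A = nn.Linear(inner.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, inner.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)  # the adapter starts as a no-op
        self.scaling = alpha / r

    def forward(self, x):
        # Base path goes through the wrapper; the low-rank update is added on top.
        return self.base_layer(x) + self.lora_B(self.lora_A(x)) * self.scaling


torch.manual_seed(0)
wrapper = ClampedWrapperStandIn(8, 8)
lora_layer = LoraOverWrapperSketch(wrapper)
x = torch.randn(2, 8)
with torch.no_grad():
    base_out = wrapper(x)
    lora_out = lora_layer(x)
```

In this sketch the low-rank update is added after the wrapper's output clamp and sees the unclamped input. Whether a real fix should clamp the adapter path as well is an open design question that this example does not settle.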

### 2. `_get_in_out_features` in `tuners_utils.py`
Add support for extracting `in_features`/`out_features` from the inner `.linear`.
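
One possible shape for that lookup is sketched below. `get_in_out_features_sketch` and `WrapperStandIn` are hypothetical names used for illustration, and the fallback to `(None, None)` is an assumption of this sketch rather than a description of what PEFT does.

```python
from torch import nn


class WrapperStandIn(nn.Module):
    """Hypothetical wrapper that holds its nn.Linear under `.linear`."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)

    def forward(self, hidden_states):
        return self.linear(hidden_states)


def get_in_out_features_sketch(module):
    """Return (in_features, out_features), looking through a `.linear` wrapper."""
    if isinstance(module, nn.Linear):
        return module.in_features, module.out_features
    # Fall back to an inner nn.Linear stored under `.linear`, if there is one.
    inner = getattr(module, "linear", None)
    if isinstance(inner, nn.Linear):
        return inner.in_features, inner.out_features
    # Unknown layer type: this sketch reports that the dimensions are unavailable.
    return None, None


plain_dims = get_in_out_features_sketch(nn.Linear(16, 32))
wrapped_dims = get_in_out_features_sketch(WrapperStandIn(64, 48))
unknown_dims = get_in_out_features_sketch(nn.ReLU())
```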

### 3. `merge()` / `unmerge()` / `merge_and_unload()` in `layer.py`
These access `self.get_base_layer().weight`, which fails for wrappers that don't expose `.weight` directly (here it lives at `.linear.weight`). Either the wrapper needs a `.weight` property, or `get_base_layer()` needs to drill through to the inner `nn.Linear`.
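
The drill-through option could look roughly like the sketch below. `resolve_weight_sketch` and `WrapperStandIn` are hypothetical names, and the in-place addition only stands in for a merge step; it is not PEFT's merge logic.

```python
import torch
from torch import nn


class WrapperStandIn(nn.Module):
    """Hypothetical wrapper whose weight lives at `.linear.weight`."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)

    def forward(self, hidden_states):
        return self.linear(hidden_states)


def resolve_weight_sketch(base_layer):
    """Return the weight parameter, drilling into `.linear` when needed."""
    if hasattr(base_layer, "weight"):
        return base_layer.weight
    inner = getattr(base_layer, "linear", None)
    if isinstance(inner, nn.Linear):
        return inner.weight
    raise AttributeError(f"{type(base_layer).__name__} exposes no weight")


wrapper = WrapperStandIn(4, 3)
weight = resolve_weight_sketch(wrapper)
x = torch.ones(1, 4)

with torch.no_grad():
    before = wrapper(x)
    # Stand-in for a merge: add a constant delta to the resolved weight in place.
    weight.add_(torch.full_like(weight, 0.5))
    after = wrapper(x)
```

Because the returned parameter is the same object the inner layer uses, the in-place update is visible through the wrapper's forward pass.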

### 4. Weight-based initializers
The `olora`, `pissa`, `corda`, `loftq`, `orthogonal`, and `lora_ga` initializers all read `self.get_base_layer().weight`, so they hit the same problem as merge/unmerge.

## Environment

- transformers 5.5.0
- peft 0.18.2.dev0 (latest main)
- Gemma 4 models: `google/gemma-4-E2B-it`, `google/gemma-4-E4B-it`, `google/gemma-4-26B-A4B-it`, `google/gemma-4-31B-it`

## Tested With

- DPO training on Gemma 4 E2B-it with TRL 1.0.0 (successful with the monkey-patch)
- SFT training (not tested, but the same issue is expected)
- GRPO training (not tested, but the same issue is expected)
