Add support for Gemma4ClippableLinear (Gemma 4 QLoRA fails) #3129

@dentity007

Bug Description
QLoRA fine-tuning of google/gemma-4-31B fails because PEFT doesn't recognize Gemma4ClippableLinear as a supported module type.

Error
ValueError: Target module Gemma4ClippableLinear(
(linear): Linear4bit(in_features=1152, out_features=1152, bias=False)
) is not supported. Currently, only the following modules are supported:
torch.nn.Linear, torch.nn.Embedding, torch.nn.Conv1d, torch.nn.Conv2d,
torch.nn.Conv3d, transformers.pytorch_utils.Conv1D, torch.nn.MultiheadAttention..

Root Cause
Gemma4ClippableLinear, used in the vision and audio encoders, inherits from nn.Module rather than nn.Linear. The type check in _create_new_module therefore rejects it, and it does so before exclude_modules can filter it out.

The layer only wraps nn.Linear with optional input/output clamping:

# From transformers/models/gemma4/modeling_gemma4.py
class Gemma4ClippableLinear(nn.Module):
    def __init__(self, config, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)
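
To illustrate why wrapping by composition trips an isinstance-based type check, here is a minimal sketch. It uses plain-Python stand-ins so it runs without torch: `Linear`, `ClippableLinear`, `SUPPORTED_TYPES`, and `is_supported` are hypothetical names for illustration, not the actual torch or PEFT objects.

```python
# Plain-Python stand-ins (hypothetical); not the torch or PEFT classes.
class Linear:
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features


class ClippableLinear:
    """Holds a Linear as an attribute instead of subclassing it."""

    def __init__(self, in_features, out_features):
        self.linear = Linear(in_features, out_features)


# Stand-in for the list of module types an adapter library accepts.
SUPPORTED_TYPES = (Linear,)


def is_supported(module):
    # isinstance looks only at the class hierarchy, never at wrapped attributes.
    return isinstance(module, SUPPORTED_TYPES)


wrapper = ClippableLinear(1152, 1152)
wrapper_ok = is_supported(wrapper)        # the wrapper itself is rejected
inner_ok = is_supported(wrapper.linear)   # the wrapped layer would be accepted
print(wrapper_ok, inner_ok)
```

The wrapper fails the check while its inner `.linear` passes, which is why dispatching to the inner layer is one possible fix.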

Workaround
Monkey-patch the class to inherit from nn.Linear before loading the model:

import torch  # needed for torch.tensor and torch.clamp below
import torch.nn as nn
from transformers.models.gemma4 import modeling_gemma4

class PatchedClippableLinear(nn.Linear):
    """Drop-in replacement that is itself an nn.Linear, so PEFT accepts it."""

    def __init__(self, config, in_features, out_features):
        super().__init__(in_features, out_features, bias=False)
        self.use_clipped_linears = getattr(config, "use_clipped_linears", False)
        if self.use_clipped_linears:
            # Infinite bounds make clamping a no-op until real values are loaded.
            self.register_buffer("input_min", torch.tensor(-float("inf")))
            self.register_buffer("input_max", torch.tensor(float("inf")))
            self.register_buffer("output_min", torch.tensor(-float("inf")))
            self.register_buffer("output_max", torch.tensor(float("inf")))

    def forward(self, x):
        if self.use_clipped_linears:
            x = torch.clamp(x, self.input_min, self.input_max)
        out = super().forward(x)
        if self.use_clipped_linears:
            out = torch.clamp(out, self.output_min, self.output_max)
        return out

# Apply the patch before the model is instantiated.
modeling_gemma4.Gemma4ClippableLinear = PatchedClippableLinear

With this patch, QLoRA works normally (534M trainable params, 1.68% of 31.8B).

Suggested Fix
Either:
1. Add Gemma4ClippableLinear to the supported module dispatch in peft/tuners/lora/model.py (dispatch to the inner .linear)
2. Move the type check after exclude_modules filtering so excluded modules don't need to be supported types
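
The second option comes down to the order of two checks. The sketch below is a simplified model of that ordering, not PEFT's actual code: `select_targets`, the `exclude_first` flag, the stand-in classes, and the module names are all hypothetical, and exclusion is reduced to a substring match for brevity.

```python
# Plain-Python stand-ins (hypothetical); not the torch or PEFT classes.
class Linear:
    """Stand-in for a supported layer type."""


class ClippableLinear:
    """Stand-in for an unsupported wrapper type."""


SUPPORTED_TYPES = (Linear,)


def select_targets(named_modules, target_suffix, exclude_substrings, exclude_first):
    """Return names of modules to adapt; raise on unsupported, non-skipped targets."""
    selected = []
    for name, module in named_modules:
        if not name.endswith(target_suffix):
            continue
        excluded = any(part in name for part in exclude_substrings)
        if exclude_first and excluded:
            continue  # skipped before its type is ever inspected
        if not isinstance(module, SUPPORTED_TYPES):
            raise ValueError(f"Target module {name} is not supported")
        if not excluded:
            selected.append(name)
    return selected


modules = [
    ("language_model.layers.0.q_proj", Linear()),
    ("vision_tower.layers.0.q_proj", ClippableLinear()),
]

# Type check first: the excluded vision module still raises.
try:
    select_targets(modules, "q_proj", ["vision_tower"], exclude_first=False)
    check_first_failed = False
except ValueError:
    check_first_failed = True

# Exclusion first: the vision module is skipped and only the text layer remains.
kept = select_targets(modules, "q_proj", ["vision_tower"], exclude_first=True)
print(check_first_failed, kept)
```

With exclusion applied first, an unsupported module type only matters if the user actually targets it.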

Environment
peft: 0.18.2.dev0 (installed from source)
transformers: 5.5.0.dev0 (installed from source)
torch: 2.8.0+cu128
bitsandbytes: 0.44.x
GPU: NVIDIA B200 (192GB)
Model: google/gemma-4-31B

Additional Context
Also note: Gemma 4 requires mm_token_type_ids in addition to token_type_ids during training, even for text-only fine-tuning. This requires a custom data collator. This is a separate issue (transformers-side) but worth mentioning as anyone attempting Gemma 4 QLoRA will hit both.
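
For anyone who hits this, a rough sketch of such a collator for text-only batches follows. It rests on assumptions not confirmed above: that an all-zero mm_token_type_ids (and token_type_ids) marks every position as text, that right padding is acceptable, and that labels are handled elsewhere. `collate_text_only` and `pad_token_id` are hypothetical names. It uses plain lists, which a real training loop would convert to tensors.

```python
def collate_text_only(features, pad_token_id=0):
    """Pad a text-only batch and add all-zero type-id fields.

    ASSUMPTION: zero means "text token" for both token_type_ids and
    mm_token_type_ids. Verify against the transformers Gemma 4 code.
    """
    max_len = max(len(f["input_ids"]) for f in features)
    batch = {
        "input_ids": [],
        "attention_mask": [],
        "token_type_ids": [],
        "mm_token_type_ids": [],
    }
    for f in features:
        ids = list(f["input_ids"])
        pad = max_len - len(ids)
        batch["input_ids"].append(ids + [pad_token_id] * pad)      # right-pad
        batch["attention_mask"].append([1] * len(ids) + [0] * pad)
        batch["token_type_ids"].append([0] * max_len)
        batch["mm_token_type_ids"].append([0] * max_len)
    return batch


batch = collate_text_only([{"input_ids": [5, 6, 7]}, {"input_ids": [8]}])
print(batch["input_ids"])
```

Every field in the returned batch has the same padded shape, so the extra key can be passed to the model alongside the usual inputs.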

Happy to submit a PR if you'd prefer a specific approach.
