Bug Description
QLoRA fine-tuning of google/gemma-4-31B fails because PEFT doesn't recognize Gemma4ClippableLinear as a supported module type.
Error
ValueError: Target module Gemma4ClippableLinear(
  (linear): Linear4bit(in_features=1152, out_features=1152, bias=False)
) is not supported. Currently, only the following modules are supported:
torch.nn.Linear, torch.nn.Embedding, torch.nn.Conv1d, torch.nn.Conv2d,
torch.nn.Conv3d, transformers.pytorch_utils.Conv1D, torch.nn.MultiheadAttention..
Root Cause
Gemma4ClippableLinear (used in vision/audio encoder) inherits from nn.Module instead of nn.Linear. The type check in _create_new_module rejects it before exclude_modules can filter it out.
The layer only wraps nn.Linear with optional input/output clamping:
# From transformers/models/gemma4/modeling_gemma4.py
class Gemma4ClippableLinear(nn.Module):
    def __init__(self, config, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)
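To make the failure mode concrete, here is a minimal sketch of why an isinstance-style type check rejects a wrapper like this. The class `ClippableLinearSketch` is a hypothetical stand-in written for this example, not the real Gemma 4 class, and the check shown is only an approximation of what PEFT does internally.

```python
import torch.nn as nn


class ClippableLinearSketch(nn.Module):
    """Hypothetical stand-in: an nn.Module that holds an nn.Linear as a child."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)

    def forward(self, x):
        return self.linear(x)


layer = ClippableLinearSketch(8, 8)

# The wrapper itself is not an nn.Linear, so a type check on it fails...
wrapper_is_linear = isinstance(layer, nn.Linear)
# ...while the child it wraps would pass the same check.
inner_is_linear = isinstance(layer.linear, nn.Linear)

print(wrapper_is_linear, inner_is_linear)
```

So a target name that matches the wrapper (rather than its `.linear` child) ends up at a module the type check cannot handle.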
Workaround
Monkey-patch the class to inherit from nn.Linear before loading the model:
import torch
import torch.nn as nn
from transformers.models.gemma4 import modeling_gemma4


class PatchedClippableLinear(nn.Linear):
    """Same constructor signature, but subclasses nn.Linear so PEFT accepts it."""

    def __init__(self, config, in_features, out_features):
        nn.Linear.__init__(self, in_features, out_features, bias=False)
        self.use_clipped_linears = getattr(config, "use_clipped_linears", False)
        if self.use_clipped_linears:
            # Bounds start at -inf/+inf, i.e. no clipping.
            self.register_buffer("input_min", torch.tensor(-float("inf")))
            self.register_buffer("input_max", torch.tensor(float("inf")))
            self.register_buffer("output_min", torch.tensor(-float("inf")))
            self.register_buffer("output_max", torch.tensor(float("inf")))

    def forward(self, x):
        if self.use_clipped_linears:
            x = torch.clamp(x, self.input_min, self.input_max)
        out = nn.Linear.forward(self, x)
        if self.use_clipped_linears:
            out = torch.clamp(out, self.output_min, self.output_max)
        return out


# Swap the class in before the model is instantiated.
modeling_gemma4.Gemma4ClippableLinear = PatchedClippableLinear
With this patch, QLoRA works normally (534M trainable params, 1.68% of 31.8B).
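For anyone who wants to check numbers like these on their own run, here is a rough sketch of counting trainable versus total parameters on any `nn.Module`. The helper `trainable_fraction` and the toy model are made up for this example. On 4-bit bitsandbytes weights the raw `numel()` may undercount because the storage is packed, so treat this as an approximation there.

```python
import torch.nn as nn


def trainable_fraction(model):
    """Return (trainable, total, percent) parameter counts for a module."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total, 100.0 * trainable / total


# Toy model: a frozen 16x16 base layer plus a small trainable low-rank pair.
base = nn.Linear(16, 16, bias=False)
base.weight.requires_grad_(False)
adapter = nn.Sequential(
    nn.Linear(16, 2, bias=False),
    nn.Linear(2, 16, bias=False),
)
toy = nn.ModuleDict({"base": base, "adapter": adapter})

trainable, total, pct = trainable_fraction(toy)
print(f"{trainable} trainable of {total} ({pct:.2f}%)")
```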
Suggested Fix
Either:
1. Add Gemma4ClippableLinear to the supported module dispatch in peft/tuners/lora/model.py (dispatch to inner .linear), or
2. Move the type check after exclude_modules filtering so excluded modules don't need to be supported types.
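A rough sketch of how the two options could behave, using plain-Python stand-ins rather than PEFT's actual code. Everything here (`Linear`, `ClippableWrapper`, `select_targets`, `name_matches`, and the module names) is hypothetical and only illustrates the ordering: exclusion runs first, and an unsupported wrapper that still gets targeted is redirected to its inner `.linear`.

```python
class Linear:
    """Stand-in for a supported layer type."""


class ClippableWrapper:
    """Stand-in for an unsupported wrapper holding a supported child."""

    def __init__(self):
        self.linear = Linear()


SUPPORTED = (Linear,)


def name_matches(name, suffixes):
    # Match on the full name or on a dotted suffix such as "q_proj".
    return any(name == s or name.endswith("." + s) for s in suffixes)


def select_targets(named_modules, target_modules, exclude_modules):
    selected = []
    for name, module in named_modules:
        # Option 2: filter excluded modules before any type check runs.
        if name_matches(name, exclude_modules):
            continue
        if not name_matches(name, target_modules):
            continue
        if not isinstance(module, SUPPORTED):
            # Option 1: dispatch to the wrapped child if there is one.
            inner = getattr(module, "linear", None)
            if not isinstance(inner, SUPPORTED):
                raise ValueError(f"Target module {name} is not supported.")
            name = name + ".linear"
        selected.append(name)
    return selected


named = [
    ("language_model.layers.0.q_proj", Linear()),
    ("vision_tower.layers.0.q_proj", ClippableWrapper()),
]

with_exclude = select_targets(named, ["q_proj"], ["vision_tower.layers.0.q_proj"])
without_exclude = select_targets(named, ["q_proj"], [])
print(with_exclude)
print(without_exclude)
```

With the exclusion in place the wrapper is never type-checked. Without it, the adapter would attach to the inner layer instead of raising.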
Environment
peft: 0.18.2.dev0 (installed from source)
transformers: 5.5.0.dev0 (installed from source)
torch: 2.8.0+cu128
bitsandbytes: 0.44.x
GPU: NVIDIA B200 (192GB)
Model: google/gemma-4-31B
Additional Context
Also note: Gemma 4 requires mm_token_type_ids in addition to token_type_ids during training, even for text-only fine-tuning. This requires a custom data collator. This is a separate issue (transformers-side) but worth mentioning as anyone attempting Gemma 4 QLoRA will hit both.
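A minimal sketch of what such a collator step might look like for text-only batches. This assumes that an all-zero tensor with the same shape as `input_ids` is an acceptable value for both fields when there are no image or audio tokens. That is a guess on my part and should be checked against the transformers implementation. The helper name `add_text_only_type_ids` is made up for this example.

```python
import torch


def add_text_only_type_ids(batch):
    """Return a copy of the batch with missing type-id fields filled with zeros.

    Assumption: 0 marks a plain text token for both fields.
    """
    out = dict(batch)
    for key in ("token_type_ids", "mm_token_type_ids"):
        if key not in out:
            out[key] = torch.zeros_like(out["input_ids"])
    return out


batch = {
    "input_ids": torch.tensor([[2, 10, 11], [2, 12, 0]]),
    "attention_mask": torch.tensor([[1, 1, 1], [1, 1, 0]]),
}
patched = add_text_only_type_ids(batch)
print(sorted(patched.keys()))
```

This would be called at the end of an existing collator, after padding, so the added tensors match the padded sequence length.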
Happy to submit a PR if you'd prefer a specific approach.