Bug Report: UnboundLocalError in Qwen3-Next Linear Attention During Fine-tuning
Summary
UnboundLocalError: cannot access local variable 'state' where it is not associated with a value occurs during Qwen3-Next fine-tuning in MLX-LM v0.28.0.
Environment
- MLX-LM Version: 0.28.0
- Python Version: 3.11.13
- Platform: macOS-15.6.1-arm64-arm-64bit
- Architecture: arm64
- MLX Version: 0.29.1
Model Details
- Model: mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit
- Model Type: qwen3_next
- Source: Hugging Face MLX community conversion
Bug Description
What Happens
During LoRA fine-tuning with mlx_lm.lora, the training crashes with:
UnboundLocalError: cannot access local variable 'state' where it is not associated with a value
Stack Trace
File "mlx_lm/models/qwen3_next.py", line 261, in __call__
out, state = gated_delta_update(q, k, v, a, b, self.A_log, self.dt_bias, state)
^^^^^
UnboundLocalError: cannot access local variable 'state' where it is not associated with a value
Root Cause Analysis
In mlx_lm/models/qwen3_next.py, line 261, the state variable is passed to gated_delta_update() before being initialized in the conditional blocks below.
Problematic Code Flow:
- Line 261: state is used in the function call
- Lines 262-267: state is conditionally initialized based on cache conditions
- The initialization happens AFTER the usage
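The pattern described above can be illustrated in isolation. The following is a minimal sketch, not the actual mlx_lm code: gated_update_stub and forward_buggy are hypothetical stand-ins, and the cache is assumed to be a plain dict purely for illustration. It shows why Python raises UnboundLocalError when a name that is assigned somewhere in a function is read before any assignment has run.

```python
def gated_update_stub(x, state):
    """Hypothetical stand-in for gated_delta_update: returns (output, new_state)."""
    new_state = x if state is None else state + x
    return 2 * new_state, new_state


def forward_buggy(x, cache=None):
    # `state` is assigned inside this function, so Python treats it as a local
    # name for the whole body. Reading it here, before any assignment has run,
    # raises UnboundLocalError.
    out, state = gated_update_stub(x, state)
    # The conditional initialization comes too late to help.
    if cache is not None:
        state = cache.get("state")
    return out, state


def reproduce():
    """Call the buggy function and report which exception type it raised."""
    try:
        forward_buggy(1.0)
    except UnboundLocalError as exc:
        return type(exc).__name__
    return "no error"
```

Calling reproduce() exercises the same failure mode as the stack trace: the right-hand side of the assignment is evaluated first, and state has no value yet.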
Expected Behavior
- Fine-tuning should proceed without crashing
- The state variable should be properly initialized before use
Actual Behavior
- Training crashes immediately when the linear attention layer is called
- Error prevents any Qwen3-Next fine-tuning
Reproduction Steps
- Install MLX-LM v0.28.0:
  pip install mlx-lm==0.28.0
- Download Qwen3-Next model:
  # Any Qwen3-Next MLX model from mlx-community
- Attempt fine-tuning:
  python -m mlx_lm lora \
  --model path/to/qwen3-next-model \
  --train \
  --data path/to/training/data
- Observe crash when training begins
Minimal Reproduction Case
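import mlx.core as mx
from mlx_lm import load

# Load model
model, tokenizer = load("mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit")

# Attempt a training-style forward pass (expected to crash).
# Assumption: like the LoRA trainer, this puts the model in training mode and
# calls it without a cache. The crash occurs in the forward pass of linear attention.
model.train()
tokens = mx.array(tokenizer.encode("hello"))[None]  # shape: (1, sequence_length)
logits = model(tokens)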
Proposed Fix
The issue appears to be a variable scoping problem. The state variable needs to be initialized before line 261.
Suggested fix location: mlx_lm/models/qwen3_next.py around line 261
Potential solutions:
- Initialize state = None before the gated_delta_update call
- Restructure the conditional logic to ensure state is always initialized
- Move the initialization logic before the function call
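A rough sketch of the first and third options combined is shown below. It is illustrative only: gated_update_stub and forward_fixed are hypothetical names, the cache is modelled as a plain dict, and it assumes the real gated_delta_update accepts None to mean "no previous state", which should be verified against the library source before applying a similar change.

```python
def gated_update_stub(x, state):
    """Hypothetical stand-in for gated_delta_update: returns (output, new_state)."""
    new_state = x if state is None else state + x
    return 2 * new_state, new_state


def forward_fixed(x, cache=None):
    # Bind `state` on every code path before its first use.
    # None stands for "no previous state" (an assumption about the real API).
    state = None
    if cache is not None:
        state = cache.get("state")

    out, state = gated_update_stub(x, state)

    # Write the updated state back only when a cache is in use (inference).
    if cache is not None:
        cache["state"] = state
    return out, state
```

With this ordering, the training path (no cache) and the inference path (with cache) both reach the update call with state already bound.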
Impact
- Severity: High - Completely blocks Qwen3-Next fine-tuning
- Scope: All Qwen3-Next models in MLX-LM v0.28.0
- Workaround: None currently available
Additional Context
- Model loading works perfectly (no issues with inference)
- Tokenization and chat templates function correctly
- Issue only occurs during training/fine-tuning
- Qwen3-Next support is new in v0.28.0, so this is likely a bug in the newly added code rather than a regression of previously working behavior
Testing Done
- ✅ Confirmed model loads successfully
- ✅ Confirmed tokenization works
- ✅ Confirmed issue is specific to training/fine-tuning
- ✅ Confirmed issue exists across different Qwen3-Next models
- ✅ Confirmed Python 3.11 compatibility (resolved separate issue)
Thank you for the excellent work on MLX-LM! The addition of Qwen3-Next support is very much appreciated. This appears to be a small scoping issue that should be relatively straightforward to fix.