[Bug] UnboundLocalError in Qwen3-Next fine-tuning: 'state' variable uninitialized in linear attention #481

# Bug Report: UnboundLocalError in Qwen3-Next Linear Attention During Fine-tuning

## Summary
`UnboundLocalError: cannot access local variable 'state' where it is not associated with a value` occurs during Qwen3-Next fine-tuning in MLX-LM v0.28.0.

## Environment
- **MLX-LM Version**: 0.28.0
- **Python Version**: 3.11.13
- **Platform**: macOS-15.6.1-arm64-arm-64bit
- **Architecture**: arm64
- **MLX Version**: 0.29.1
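
For anyone confirming the same setup, the values above can be gathered with a short script. This is a minimal sketch using only the standard library; `collect_env` is a hypothetical helper name, and it assumes the packages are installed under the distribution names `mlx-lm` and `mlx`.

```python
import platform
import sys
from importlib import metadata


def collect_env(packages=("mlx-lm", "mlx")):
    """Return interpreter, platform, and installed package versions."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing the whole script.
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```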

## Model Details
- **Model**: mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit
- **Model Type**: qwen3_next
- **Source**: Hugging Face MLX community conversion

## Bug Description

### What Happens
During LoRA fine-tuning with `mlx_lm.lora`, the training crashes with:

```
UnboundLocalError: cannot access local variable 'state' where it is not associated with a value
```

### Stack Trace
```
File "mlx_lm/models/qwen3_next.py", line 261, in __call__
    out, state = gated_delta_update(q, k, v, a, b, self.A_log, self.dt_bias, state)
                                                                             ^^^^^
UnboundLocalError: cannot access local variable 'state' where it is not associated with a value
```

### Root Cause Analysis
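At line 261 of `mlx_lm/models/qwen3_next.py`, `state` is passed to `gated_delta_update()` before any assignment to it has executed. Because `state` is assigned within the same function, Python treats it as a local variable, and reading it while it is still unbound raises `UnboundLocalError`. The conditional blocks that initialize `state` appear only after this call.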

**Problematic Code Flow:**
1. Line 261: `state` is used in function call
2. Lines 262-267: `state` is conditionally initialized based on cache conditions
3. The initialization happens AFTER the usage
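
The failure mode can be shown in isolation. The sketch below is not the code in `qwen3_next.py`: `fake_update` and `forward_buggy` are hypothetical stand-ins that only mirror the shape of the statement in the stack trace, where `state` is read on the right-hand side of the same statement that first assigns it.

```python
def fake_update(x, state):
    """Hypothetical stand-in for gated_delta_update: returns (output, new_state)."""
    new_state = x if state is None else state + x
    return 2 * new_state, new_state


def forward_buggy(x):
    # 'state' is assigned in this function, so Python treats it as local.
    # The right-hand side is evaluated first, while 'state' is still unbound.
    out, state = fake_update(x, state)
    return out


try:
    forward_buggy(1.0)
except UnboundLocalError as exc:
    print(type(exc).__name__, "-", exc)
```

The exact wording of the message depends on the Python version; the exception type is the same.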

### Expected Behavior
- Fine-tuning should proceed without crashing
- The `state` variable should be properly initialized before use

### Actual Behavior
- Training crashes immediately when the linear attention layer is called
- Error prevents any Qwen3-Next fine-tuning

## Reproduction Steps

1. **Install MLX-LM v0.28.0**:
   ```bash
   pip install mlx-lm==0.28.0
   ```

2. **Download Qwen3-Next model**:
   ```bash
   # Any Qwen3-Next MLX model from mlx-community, for example:
   huggingface-cli download mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit
   ```

3. **Attempt fine-tuning**:
   ```bash
   python -m mlx_lm lora \
     --model path/to/qwen3-next-model \
     --train \
     --data path/to/training/data
   ```

4. **Observe crash** when training begins

## Minimal Reproduction Case

```python
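import mlx.core as mx
from mlx_lm import load

# Load model (this step succeeds)
model, tokenizer = load("mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit")

# Assumption: training calls the model without a cache, so a cache-less
# forward pass in training mode should reach the same linear attention path.
tokens = mx.array([tokenizer.encode("Hello")])
model.train()
logits = model(tokens)  # expected to raise UnboundLocalError per this report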
```

## Proposed Fix

The issue appears to be an initialization-order problem: `state` must be bound on every code path before the call at line 261.

**Suggested fix location**: `mlx_lm/models/qwen3_next.py` around line 261

**Potential solutions**:
1. Initialize `state = None` before the `gated_delta_update()` call
2. Restructure the conditional logic to ensure `state` is always initialized
3. Move the initialization logic before the function call
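
The first and third options can be sketched together as follows. This is an illustration of the pattern only, not a patch for `qwen3_next.py`: `fake_update`, `forward_fixed`, and the dictionary-based `cache` are hypothetical, and the sketch assumes the real `gated_delta_update()` accepts `None` as an initial state, which should be checked against the library source.

```python
def fake_update(x, state):
    """Hypothetical stand-in for gated_delta_update: returns (output, new_state)."""
    new_state = x if state is None else state + x
    return 2 * new_state, new_state


def forward_fixed(x, cache=None):
    # Bind 'state' on every path before it is read: reuse the cached value
    # when a cache is supplied, otherwise start from None (the training case).
    state = cache.get("state") if cache is not None else None
    out, state = fake_update(x, state)
    if cache is not None:
        cache["state"] = state  # persist the updated state for the next call
    return out


# Training-style call: no cache, no crash.
print(forward_fixed(1.0))

# Inference-style calls: the state carries over between steps.
cache = {}
print(forward_fixed(1.0, cache), forward_fixed(2.0, cache))
```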

## Impact
- **Severity**: High - Completely blocks Qwen3-Next fine-tuning
- **Scope**: All Qwen3-Next models in MLX-LM v0.28.0
- **Workaround**: None currently available

## Additional Context

- Model loading works perfectly (no issues with inference)
- Tokenization and chat templates function correctly
- Issue only occurs during training/fine-tuning
- Qwen3-Next support is new in v0.28.0, so this is more likely a bug in the newly added code than a regression

## Testing Done

- ✅ Confirmed model loads successfully
- ✅ Confirmed tokenization works
- ✅ Confirmed issue is specific to training/fine-tuning
- ✅ Confirmed issue exists across different Qwen3-Next models
- ✅ Confirmed Python 3.11 compatibility (resolved separate issue)

---

**Thank you for the excellent work on MLX-LM! The addition of Qwen3-Next support is very much appreciated. This appears to be a small scoping issue that should be relatively straightforward to fix.**
