In your GPT-OSS fine-tuning notebook, you set the target modules:
target_modules = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
],
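However, the MLP modules in GPT-OSS have different names: gate_up_projs and down_projs. None of gate_proj, up_proj, or down_proj matches anything in the model, and TRL silently ignores this kind of mismatch.
When I target gate_up_projs and down_projs directly instead, I get an error saying that they are not supported.
As a result, the notebook currently only fine-tunes adapters for the attention modules.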
https://colab.research.google.com/github/unslothai/notebooks/blob/main/nb/gpt-oss-(20B)-Fine-tuning.ipynb
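
For anyone who wants to reproduce the mismatch, here is a minimal sketch of the check I have in mind. It assumes a target matches when a module name equals it or ends with "." plus the target, which is my understanding of how list-style target_modules are resolved, not something I have confirmed against the library source. The helper unmatched_targets and the module paths in example_names are hypothetical and only for illustration; on a real model the names would come from model.named_modules().

```python
def unmatched_targets(module_names, target_modules):
    """Return the targets that match no module name.

    Assumption: a target matches when a module name is equal to it or
    ends with "." + target (dotted-suffix match).
    """
    missing = []
    for target in target_modules:
        hit = any(
            name == target or name.endswith("." + target)
            for name in module_names
        )
        if not hit:
            missing.append(target)
    return missing


# Hypothetical module paths, for illustration only. On a real model use:
#   module_names = [name for name, _ in model.named_modules()]
example_names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.self_attn.k_proj",
    "model.layers.0.self_attn.v_proj",
    "model.layers.0.self_attn.o_proj",
    "model.layers.0.mlp.experts.gate_up_projs",
    "model.layers.0.mlp.experts.down_projs",
]

targets = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]

missing = unmatched_targets(example_names, targets)
print("Targets with no matching module:", missing)
```

With the example names above, the three MLP targets come back as unmatched while the four attention targets match, which is consistent with only the attention adapters being trained.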