[FEAT] Add Neftune into transformers Trainer #27141
younesbelkada merged 11 commits into huggingface:main
Conversation
muellerzr
left a comment
Thanks! Overall this looks very good and handy to use. I left a few comments for an initial review :)
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Added a test and a relevant documentation section; this PR is ready for final review!
amyeroberts
left a comment
Nice work! 💪
Just some small comments. Main one is to add a check for the deactivation logic.
src/transformers/trainer.py (Outdated)

```python
# After training we make sure to retrieve back the original forward pass method
# for the embedding layer by removing the forward post hook.
if self.neftune_noise_alpha is not None:
    if is_peft_available() and isinstance(self.model, PeftModel):
        embeddings = unwrap_model(self.model.base_model).get_input_embeddings()
    else:
        embeddings = unwrap_model(self.model).get_input_embeddings()

    self.neftune_hook_handle.remove()
    del embeddings.neftune_noise_alpha
```
Let's make this into an equivalent method `_deactivate_neftune`
src/transformers/trainer.py (Outdated)

```python
if is_peft_available() and isinstance(self.model, PeftModel):
    embeddings = unwrap_model(self.model.base_model).get_input_embeddings()
else:
    embeddings = unwrap_model(self.model).get_input_embeddings()
```
Is this logic used anywhere else? It looks general enough that we could have a _get_model_input_embeddings function (not necessarily to be done in this PR)
Happy to refactor this in a follow-up PR!
```python
# Make sure forward pass works fine
_ = trainer.model(torch.LongTensor([[1, 0, 1]]).to(torch_device))
self.assertTrue(len(trainer.model.get_input_embeddings()._forward_hooks) == 0)
```
A check should be made that it's correctly disabled after training has finished
The line `self.assertTrue(len(trainer.model.get_input_embeddings()._forward_hooks) == 0)` should check whether the forward hook has been correctly removed, so I think all should be good here.
Note also that the line is called right after training, so it should check that NEFTune is correctly disabled after training.
Added a slightly more elaborate test in ca8f8c4
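The contents of that commit aren't shown in this thread; one possible shape for such a post-training check is sketched below (hypothetical helper name and assertions):

```python
import torch

def assert_neftune_disabled(model):
    # Hypothetical check: after training, the embedding layer should carry
    # no forward hooks and no leftover `neftune_noise_alpha` attribute.
    embeddings = model.get_input_embeddings()
    assert len(embeddings._forward_hooks) == 0
    assert not hasattr(embeddings, "neftune_noise_alpha")
```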
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
amyeroberts
left a comment
Awesome - thanks for iterating!
```python
if not hasattr(self, "neftune_hook_handle"):
    raise ValueError("Neftune is not activated make sure to call `trainer._activate_neftune()` first")
```
* add v1 neftune
* use `unwrap_model` instead
* add test + docs
* Apply suggestions from code review

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* more details
* fixup
* Update docs/source/en/main_classes/trainer.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* refactor a bit
* more elaborated test
* fix unwrap issue

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
What does this PR do?
As per title
Fixes: huggingface/trl#923
Fixes: #26899
This PR adds NEFTune, a new technique for enhancing supervised fine-tuning results, proposed in: https://arxiv.org/abs/2310.05914
I propose a very simple API: passing a valid `neftune_noise_alpha` argument when initializing the `TrainingArguments`. To avoid any surprising behaviour, we revert to the original forward method at the end of training. This is handled inside the inner training loop, which attaches the correct forward hook before training begins and makes sure to remove it right after the model is trained.
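The hook-based mechanism described above can be sketched as follows. This is a minimal illustration of the paper's noise injection applied to a toy embedding layer, not necessarily identical to the code merged in this PR:

```python
import math

import torch

def neftune_post_forward_hook(module, inputs, output):
    # During training only, add uniform noise in [-mag, mag] with
    # mag = alpha / sqrt(seq_len * hidden_dim), following the NEFTune
    # paper (https://arxiv.org/abs/2310.05914). In eval mode the
    # embedding output passes through unchanged.
    if module.training:
        dims = output.size(1) * output.size(2)
        mag_norm = module.neftune_noise_alpha / math.sqrt(dims)
        output = output + torch.zeros_like(output).uniform_(-mag_norm, mag_norm)
    return output

# Attach the hook before training and detach it right after,
# mirroring the activate/deactivate flow described above.
embeddings = torch.nn.Embedding(32, 8)
embeddings.neftune_noise_alpha = 5.0
hook_handle = embeddings.register_forward_hook(neftune_post_forward_hook)
# ... training would happen here ...
hook_handle.remove()  # restore the original forward behaviour
del embeddings.neftune_noise_alpha
```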