Save/loading AdamW optimizer (for hypernetworks) #3975
AUTOMATIC1111 merged 16 commits into AUTOMATIC1111:master
Conversation
|
Now this is just a thought, but would it be possible to save the optimizer state in a separate file next to the hypernetwork? That would remove the need to prune the file afterwards. Optimizer "sidecars" might also enable reuse of an optimizer state when restarting from scratch? Please disregard if that would not work; I'm not familiar with the technical details. |
|
is there any demonstration of the benefit this brings |
|
@AUTOMATIC1111 Yes, AdamW (and momentum-based optimizers in general) uses an adaptive learning rate, which is estimated from its momentum. If we start from zero, AdamW will apply the given raw learning rate and observe the effect. If we resume properly, AdamW follows the saved trajectory, which generally means a lower effective learning rate. Loading the optimizer state does not make training deterministic, since randomness is still used when trying to escape local minima. Multiple users on Discord reported that HN training continued from saved checkpoints does not work well, especially at the beginning; the HN sometimes tends to 'die' quickly for some reason. It is also known that lowering the learning rate when resuming generally helps. This should be the optimizer's fault - it knows nothing about its previous trajectory. Here's a general discussion about loading optimizers. With this patch, I never observed the drastic style changes or transitions of preview images seen before. But this might mean that someone might want to nuke the optimizer state to intentionally trigger a style change / transition? |
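For reference, the underlying PyTorch pattern is simply persisting the optimizer's state dict alongside the network. A minimal sketch follows; the tiny `nn.Linear` network and the file name are illustrative stand-ins, not code from this PR:

```python
import torch
import torch.nn as nn

# Illustrative stand-in for a hypernetwork layer.
net = nn.Linear(4, 4)
optimizer = torch.optim.AdamW(net.parameters(), lr=1e-5)

# Saving: persist the optimizer state so AdamW's running first/second
# moment estimates survive a restart.
torch.save(optimizer.state_dict(), 'hypernetwork.pt.optim')

# Resuming: restore the moments so the adaptive step sizes continue
# along the previous trajectory instead of restarting "cold".
optimizer.load_state_dict(torch.load('hypernetwork.pt.optim'))
```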
|
Someone suggested that we could have a separate file for the optimizer state, which can be distributed separately. |
|
Finished implementing and testing. Files will be saved separately as *.pt and *.pt.optim files. To validate the optimizer state, it now uses the hash value of the hypernetwork itself. Closes #4048 |
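A rough sketch of how such a sidecar scheme can work; the `shorthash` helper and the exact dict layout here are assumptions for illustration, not the merged code:

```python
import hashlib
import os

import torch

def shorthash(path):
    # Illustrative stand-in for the webui's file-hash helper: hash the
    # hypernetwork file so a stale .optim sidecar can be detected.
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()[:10]

def save_optim_sidecar(optimizer, hn_path):
    # Store the optimizer state next to the hypernetwork, tagged with
    # the hash of the .pt file it belongs to.
    torch.save({'hash': shorthash(hn_path),
                'optimizer_state_dict': optimizer.state_dict()},
               hn_path + '.optim')

def try_resume_optim(optimizer, hn_path):
    optim_path = hn_path + '.optim'
    if not os.path.exists(optim_path):
        return False
    saved = torch.load(optim_path)
    # Only load the sidecar if it matches this exact .pt file.
    if saved.get('hash') != shorthash(hn_path):
        return False
    optimizer.load_state_dict(saved['optimizer_state_dict'])
    return True
```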
|
Temporarily closed to resolve conflicts. |
|
Finished testing again.
|
I think there isn't a scenario where the user would want to put the optimizer state into the checkpoint itself when saving to a separate file exists. I don't want to have useless options, so please remove that and the code to support it. |
|
Yeah, I removed that option (merging the optimizer into the checkpoint itself). Now it only saves it into a separate file. |
|
@aria1th Whenever you load a hypernetwork that also has a .optim file in the hypernetwork directory, the UI says that it not only loads the hypernetwork, but also the optimizer. Is this intended? I'm not trying to resume training with the optimizer, I'm just selecting the hypernetwork for normal inference. |



Closes #3894, since its log is messed up.
Optimizers, especially Adam and its variants, are recommended to have their state saved and loaded.
This patch offers a way to save / load optimizer state, and also supports user-selected optimizer types, such as "SGD", "Adam", etc.
If selecting the optimizer type is enabled, this line has to be changed for safety:
`if hypernetwork.optimizer_state_dict:`
to something like
`if hypernetwork.optimizer_name == hypernetwork_optimizer_type and hypernetwork.optimizer_state_dict:`
to prevent loading the wrong state dict for mismatching optimizer types.
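A sketch of that guarded load (names follow the lines quoted above; `hypernetwork_optimizer_type` is assumed to be the user-selected optimizer type from settings):

```python
def maybe_load_optimizer_state(optimizer, hypernetwork, hypernetwork_optimizer_type):
    # Only reuse a saved state dict when it was produced by the same
    # optimizer type the user currently has selected; AdamW moment
    # estimates make no sense inside, e.g., a plain SGD optimizer.
    if (hypernetwork.optimizer_name == hypernetwork_optimizer_type
            and hypernetwork.optimizer_state_dict):
        optimizer.load_state_dict(hypernetwork.optimizer_state_dict)
    else:
        # Mismatched or missing state: start the optimizer fresh.
        hypernetwork.optimizer_state_dict = None
```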
Users will see a new option in the Training section.
This option should only be enabled when they plan to continue training in the future.
Training can continue without saving the optimizer state, but some users reported that it sometimes blew up when continued from a checkpoint... must be bad luck of the optimizer...
For releasing a HN, it is recommended to turn off the option (with the Apply button) before saving / interrupting training.
A standard (1, 2, 1) network file size comparison is here; it is roughly a 3x size difference.
Current Task
Save and load optimizer state dict
People complained about the optimizer not resuming properly; this was because we didn't save the optimizer state dict.
Generalized way to save / load optimizers
This generalizes the optimizer resuming process. It does not necessarily mean more optimizer options will be offered immediately.
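One plausible shape for that generalization, sketched under the assumption that optimizers are selected by name; the `optimizer_dict` mapping is illustrative, not the merged code:

```python
import torch

# Map user-visible optimizer names to their torch constructors so the
# training loop can build and resume any of them uniformly.
optimizer_dict = {
    'AdamW': torch.optim.AdamW,
    'Adam': torch.optim.Adam,
    'SGD': torch.optim.SGD,
}

def make_optimizer(name, params, lr, saved_state_dict=None):
    optimizer = optimizer_dict[name](params, lr=lr)
    if saved_state_dict is not None:
        # Only valid when the state was saved by the same optimizer
        # type; callers should check the saved name before passing it.
        optimizer.load_state_dict(saved_state_dict)
    return optimizer
```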