
Adds Cyclical Learning Rate and Momentum #18001

Closed

sampepose wants to merge 1 commit into pytorch:master from sampepose:cyclic

Conversation

@sampepose (Contributor) commented:

This implements a cyclical learning rate (CLR) schedule with an optional inverse cyclical momentum. More info about CLR: https://github.com/bckenstler/CLR

This is finishing what #2016 started. Resolves #1909.
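For reference, the "triangular" policy at the heart of CLR linearly ramps the learning rate from `base_lr` up to `max_lr` over `step_size_up` batches, then back down over `step_size_down` batches. A minimal standalone sketch of that formula (a hypothetical helper, not the PR's actual implementation, which lives in `torch.optim.lr_scheduler`):

```python
def triangular_clr(step, base_lr, max_lr, step_size_up, step_size_down=None):
    """Sketch of the triangular CLR policy: LR for a given batch counter.

    `step` is the zero-based batch counter; the schedule repeats every
    `step_size_up + step_size_down` batches.
    """
    # Mirror the PR's behavior: a missing step_size_down defaults to step_size_up.
    step_size_down = step_size_down if step_size_down is not None else step_size_up
    cycle_len = step_size_up + step_size_down
    pos = step % cycle_len
    if pos < step_size_up:
        x = pos / step_size_up                            # ramping up toward max_lr
    else:
        x = 1.0 - (pos - step_size_up) / step_size_down   # ramping back down
    return base_lr + (max_lr - base_lr) * x
```

The cyclical momentum in this PR runs the same triangle inverted: momentum is at its maximum when the learning rate is at its minimum, and vice versa.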

@soumith (Collaborator) commented Mar 13, 2019:

thanks Sam (and @thomasjpfan too). Let me know if you need a review / re-review.

@facebook-github-bot (Contributor) commented:

@sampepose has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

sampepose force-pushed the cyclic branch 2 times, most recently from b4fdfbf to 3deec2c on March 14, 2019 at 00:38.

@sampepose (Contributor, Author) commented:

Hey @soumith! I'd appreciate a review, thank you!

@thomasjpfan (Contributor) commented:

It would be good if this PR or #2016 gets merged. (I started #2016 in 2017) 😄


@sampepose (Contributor, Author) commented:

ping @soumith :)

@soumith (Collaborator) left a review:

The comments are minor, but the changes are necessary.

Thanks, the PR looks pretty good.

Review comment:

I'd suggest not having a default for the base learning rate. It makes it too easy to shoot oneself in the foot.

Review comment:

You either want to make the `max_lr` default a multiple of `base_lr`, or simply remove the default.

Review comment:

"mode parameter is ignored" -> "If specified, then mode is ignored"

Review comment:

What exactly does -1 do here?

Review comment:

Also, it's not clear what this is used for; explaining what the index of the last batch is, or where and why it is used, will help.

Review comment:

Shouldn't it be:

train_batch(...)
scheduler.step()
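The ordering suggested above (weight update first, then advance the schedule) matters because CyclicLR is stepped per batch, not per epoch. A minimal sketch with a hypothetical stub scheduler standing in for the real one; `StubCyclicScheduler` and `train_batch` are placeholders, not part of the PR:

```python
class StubCyclicScheduler:
    """Hypothetical stand-in for a per-batch LR scheduler; only counts steps."""
    def __init__(self):
        self.last_batch = -1   # -1 is the conventional "no batch processed yet" sentinel

    def step(self):
        self.last_batch += 1   # advance the schedule by one batch

def train_batch(batch):
    pass  # placeholder for forward pass, backward pass, and optimizer.step()

scheduler = StubCyclicScheduler()
for batch in range(3):
    train_batch(batch)   # update the weights first...
    scheduler.step()     # ...then advance the schedule, as suggested in the review
```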

Review comment:

See comments above about default values.

Review comment:

Do you mean `step_size_down = step_size_down if step_size_down is not None else step_size_up`?

Review comment:

It seems safer to convert `step_size_up` and `step_size_down` to float immediately on entry into the function.

@soumith (Collaborator) left a review:

lgtm. thank you!


sampepose force-pushed the cyclic branch 2 times, most recently from 5d1a730 to 00fa02c on March 27, 2019 at 18:37.

@facebook-github-bot (Contributor) commented:

@sampepose merged this pull request in 8635078.



Development

Successfully merging this pull request may close these issues.

[Feature Request] Cyclical Learning Rates

4 participants