[pytorch] Add strong Wolfe line search for lbfgs #8824
fehiepsi wants to merge 9 commits into pytorch:master
Conversation
# store new direction/step
old_dirs.append(y)
old_stps.append(s)
ro.append(1. / ys)
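These lists (`old_dirs` holding the gradient differences `y`, `old_stps` the parameter steps `s`, and `ro` the reciprocals `1 / y.dot(s)`) feed the standard L-BFGS two-loop recursion that builds the search direction. A minimal sketch of that recursion, written as a free function over hypothetical list arguments rather than the optimizer's internal state:

```python
import torch

def two_loop_recursion(flat_grad, old_dirs, old_stps, ro):
    """Sketch of the L-BFGS two-loop recursion (not the PyTorch internals).

    old_dirs[i] = y_i (gradient differences), old_stps[i] = s_i (parameter
    steps), ro[i] = 1 / (y_i . s_i) as Python floats. Returns an
    approximation of -H^{-1} @ flat_grad, the quasi-Newton direction.
    """
    num = len(old_dirs)
    al = [0.0] * num
    q = flat_grad.neg()  # start from the steepest-descent direction
    for i in range(num - 1, -1, -1):
        al[i] = (old_stps[i].dot(q) * ro[i]).item()
        q.add_(old_dirs[i], alpha=-al[i])
    if num > 0:
        # scale by gamma = (s . y) / (y . y) of the most recent pair
        gamma = 1.0 / (ro[-1] * old_dirs[-1].dot(old_dirs[-1]).item())
        q.mul_(gamma)
    for i in range(num):
        be_i = (old_dirs[i].dot(q) * ro[i]).item()
        q.add_(old_stps[i], alpha=al[i] - be_i)
    return q
```

With an empty history the result is plain steepest descent, `-flat_grad`; each stored pair refines the implicit inverse-Hessian approximation.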
# directional derivative is below tolerance
if gtd > -tolerance_change:
    break
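`gtd` here is the directional derivative `g.dot(d)` at the current iterate. A descent direction keeps it negative; once it climbs above `-tolerance_change`, a step along `d` can no longer reduce the loss meaningfully and the loop stops. A toy illustration with made-up near-converged values:

```python
import torch

tolerance_change = 1e-9

# Hypothetical near-converged state: gradient and direction are both tiny,
# so the directional derivative g.dot(d) is negligible in magnitude.
flat_grad = torch.tensor([1e-6, -1e-6])
d = flat_grad.neg()             # steepest-descent direction
gtd = flat_grad.dot(d)          # -2e-12, barely negative

assert gtd > -tolerance_change  # the loop would break here
```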
-if d.mul(t).abs_().sum() <= tolerance_change:
+# lack of progress
+if d.mul(t).abs().max() <= tolerance_change:
 flat_grad = self._gather_flat_grad()
-abs_grad_sum = flat_grad.abs().sum()
+opt_cond = flat_grad.abs().max() <= tolerance_grad
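This hunk replaces the old 1-norm convergence test (`abs_grad_sum`) with an infinity-norm test. The difference matters for large models: a sum over millions of individually tiny gradient entries can stay above the tolerance even though every entry is negligible, while the max is independent of parameter count. A quick illustration (tolerance and sizes chosen only for the example):

```python
import torch

tolerance_grad = 1e-5
grad = torch.full((1_000_000,), 1e-7)  # a million uniformly tiny entries

# Old check: the 1-norm accumulates to roughly 0.1 and fails the tolerance.
assert grad.abs().sum().item() > tolerance_grad
# New check: the infinity norm is 1e-7 and passes, regardless of size.
assert grad.abs().max().item() <= tolerance_grad
```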
    min_pos = x1 - (x1 - x2) * ((g1 + d2 - d1) / (g1 - g2 + 2 * d2))
    return min(max(min_pos, xmin_bound), xmax_bound)
else:
    return (xmin_bound + xmax_bound) / 2.
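The excerpt above is the tail of the cubic-interpolation helper (formula 3.59 in the Numerical Optimization book). A self-contained sketch of the whole routine, under the assumption that `(x1, f1, g1)` and `(x2, f2, g2)` are the positions, function values, and derivatives at the two bracket endpoints; this is not the exact PyTorch helper:

```python
import math

def cubic_interpolate(x1, f1, g1, x2, f2, g2, bounds=None):
    # Sketch following Nocedal & Wright, formula 3.59. Fits a cubic to two
    # (position, value, derivative) triples and returns its minimizer,
    # clamped to the bracketing interval.
    if bounds is not None:
        xmin_bound, xmax_bound = bounds
    else:
        xmin_bound, xmax_bound = (x1, x2) if x1 <= x2 else (x2, x1)

    d1 = g1 + g2 - 3 * (f1 - f2) / (x1 - x2)
    d2_square = d1 ** 2 - g1 * g2
    if d2_square >= 0:
        d2 = math.sqrt(d2_square)
        if x1 <= x2:
            min_pos = x2 - (x2 - x1) * ((g2 + d2 - d1) / (g2 - g1 + 2 * d2))
        else:
            min_pos = x1 - (x1 - x2) * ((g1 + d2 - d1) / (g1 - g2 + 2 * d2))
        return min(max(min_pos, xmin_bound), xmax_bound)
    else:
        # the cubic has no real critical point: fall back to bisection
        return (xmin_bound + xmax_bound) / 2.
```

For a parabola like (x - 1)^2 sampled at x = 0 and x = 2, the routine recovers the exact minimizer x = 1.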
-lambda params: optim.LBFGS(params, lr=5e-2, max_iter=5),
-wrap_old_fn(old_optim.lbfgs, learningRate=5e-2, maxIter=5)
+lambda params: optim.LBFGS(params, lr=1, max_iter=5),
+wrap_old_fn(old_optim.lbfgs, learningRate=1, maxIter=5)
I'll review this soon.

cc: @ssnl

Thanks for reviewing @vincentqb !
vincentqb left a comment:
Thanks for the references! This PR looks good to me :)
facebook-github-bot left a comment:
@vincentqb is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@vincentqb merged this pull request in ad73ea2.
This has been copied from pytorch/pytorch#8824. PyTorch 1.2.0 has already merged this pull request, and we will incorporate it from the official repository into our code in the near future.
Summary: This pull request adds a line search for lbfgs. "strong Wolfe" is the default line search method in [minFunc](https://www.cs.ubc.ca/~schmidtm/Software/minFunc.html) and it is also recommended in the [Numerical Optimization](https://www.springer.com/gp/book/9780387303031) book. The implementation is based on four sources:

+ https://www.cs.ubc.ca/~schmidtm/Software/minFunc.html
+ https://www.springer.com/gp/book/9780387303031 (Algorithms 3.5, 3.6, formula 3.59)
+ https://github.com/torch/optim/blob/master/lswolfe.lua
+ https://github.com/torch/optim/blob/master/polyinterp.lua

The Lua version is based on an old version of minFunc, which was updated in 2012. I made a couple of small changes based on the updated version. Because of that, the test comparing against the .lua version is not consistent (that is the reason I changed a learning rate in the test).

Pull Request resolved: pytorch#8824
Differential Revision: D15783067
Pulled By: vincentqb
fbshipit-source-id: 5316d9088233981120376d79c7869d5f97e51b69
Differential Revision: D15740107
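As merged, the feature is enabled through the `line_search_fn` argument of `torch.optim.LBFGS` (available since PyTorch 1.2.0). A minimal usage sketch on a toy objective (the Rosenbrock function, chosen only for illustration):

```python
import torch

# Two-parameter Rosenbrock function; its minimizer is (1, 1).
x = torch.tensor([1.5, -0.5], requires_grad=True)
opt = torch.optim.LBFGS([x], lr=1.0, max_iter=20,
                        line_search_fn="strong_wolfe")

def closure():
    # LBFGS re-evaluates the objective during the line search,
    # so it requires a closure that recomputes loss and gradients.
    opt.zero_grad()
    loss = (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2
    loss.backward()
    return loss

for _ in range(10):
    opt.step(closure)

# x converges toward the minimizer (1, 1)
```

Without `line_search_fn="strong_wolfe"` the optimizer falls back to a fixed step of size `lr`, which is far less robust on ill-conditioned objectives like this one.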