nn.Orthogonal #42243
Closed
Labels
feature: A request for a proper, new feature.
module: nn: Related to torch.nn
needs research: We need to decide whether or not this merits inclusion, based on research world
triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🚀 Feature
A module nn.Orthogonal, similar to nn.Linear, where the weight matrix W is constrained to be orthogonal, i.e., W^T W = I.
Motivation
There has been a growing interest in orthogonal parameterization of neural networks, see, e.g., [1,2,3,4,5].
To use orthogonal parameterization with PyTorch, one currently has to implement it oneself or use third-party code.
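For concreteness, such a layer could be sketched by parameterizing the weight as the matrix exponential of a skew-symmetric matrix, one of the standard constructions from the references below. This is only a minimal sketch: the class name Orthogonal, its constructor signature, and the absence of a bias term are assumptions for illustration, not an existing PyTorch API.

```python
import torch
import torch.nn as nn

class Orthogonal(nn.Module):
    # Hypothetical sketch of an nn.Orthogonal layer: a square linear
    # map whose weight W satisfies W^T W = I. Orthogonality holds by
    # construction, because exp(A) is orthogonal whenever A is
    # skew-symmetric (A^T = -A).
    def __init__(self, features: int):
        super().__init__()
        # Unconstrained parameter; only its skew-symmetric part matters.
        self.param = nn.Parameter(0.01 * torch.randn(features, features))

    @property
    def weight(self) -> torch.Tensor:
        A = self.param - self.param.T      # skew-symmetric part
        return torch.matrix_exp(A)         # orthogonal by construction

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x @ self.weight.T


layer = Orthogonal(4)
W = layer.weight
# W^T W should equal the identity up to float32 round-off.
print(torch.allclose(W.T @ W, torch.eye(4), atol=1e-5))  # prints True
```

Gradients flow through torch.matrix_exp into the unconstrained parameter, so the layer can be trained with any standard optimizer while the weight stays exactly orthogonal.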
It would be convenient if PyTorch had a built-in module nn.Orthogonal that handles everything automatically. In particular, nn.Orthogonal could support different methods via, e.g., method={fasth, cayley, exp}.
Pitch
During ICML it was suggested that I make a pull request to PyTorch with FastH [5] as nn.Orthogonal. I want nn.Orthogonal to support three methods: the Cayley transform, the matrix exponential, and FastH.
Additional context
The contribution instructions (see screenshot below) state that algorithms from recently published research are generally not accepted, but they suggest opening an issue, as I have now done.
FastH is up to 20 times faster than the previous sequential algorithm (see the image at the bottom of the page).
Note that this is an algorithmic speed-up: it computes exactly the same thing as the previous algorithm, just faster.
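To make the three candidate methods concrete, the sketch below builds an orthogonal matrix from an unconstrained one via the Cayley transform, the matrix exponential, and a sequential product of Householder reflections. The Householder loop is the O(d) sequential baseline that FastH accelerates; FastH itself is not reproduced here, and all variable names are illustrative.

```python
import torch

torch.manual_seed(0)
d = 6
M = torch.randn(d, d)
A = M - M.T                    # skew-symmetric: A^T = -A
I = torch.eye(d)

# 1) Cayley transform: Q = (I + A)^{-1} (I - A) is orthogonal for skew A.
Q_cayley = torch.linalg.solve(I + A, I - A)

# 2) Matrix exponential: Q = exp(A) is orthogonal for skew A.
Q_exp = torch.matrix_exp(A)

# 3) Sequential product of Householder reflections
#    H_i = I - 2 v_i v_i^T / (v_i^T v_i).
# Any product of reflections is orthogonal; FastH computes the same
# product, just with more parallelism.
Q_hh = torch.eye(d)
for v in torch.randn(d, d):        # one reflection per row
    v = v.unsqueeze(1)             # column vector of shape (d, 1)
    Q_hh = Q_hh - 2.0 * v @ (v.T @ Q_hh) / (v.T @ v)

for Q in (Q_cayley, Q_exp, Q_hh):
    print(torch.allclose(Q.T @ Q, I, atol=1e-4))   # True for each
```

All three constructions are differentiable in PyTorch, which is what would let a single nn.Orthogonal module expose them behind one method argument.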
References
[1] Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections (ICML 2017)
[2] A Simple Parametrization of the Orthogonal and Unitary Group (ICML 2019)
[3] Stabilizing Gradients for Deep Neural Networks via Efficient SVD Parameterization (ICML 2018)
[4] Trivializations for Gradient-Based Optimization on Manifolds (NeurIPS 2019)
[5] Faster Orthogonal Parameterization with Householder Matrices (ICML Workshop 2020)
cc @albanD @mruberry