Adds truncated normal initializer #32397
Conversation
💊 Dr. CI (automated): As of commit bd41aeb, there are no CircleCI failures.
Tentatively assigning review to @alicanb, let me know if you need someone else to look.
I am happy to do it after the ICML deadline :D
This mostly looks good. Maybe one thing I would add is a warning if the bounds are too far away from the center (I actually don't know when it will break; it might be a good idea to test that and place the warnings accordingly). I see you're only testing `truncnorm(0, 1, -2, 2)` in `test_nn.py`.
@alicanb Thanks for the feedback. I experimented with randomly selected parameters to see where the method breaks down. With that, I will add a warning if either bound is too far from the mean. Below is the code I used to test various results, and I have also attached some example failures. The currently chosen warning threshold is based on those experiments.
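A minimal sketch of the kind of distribution check described here (this is not the exact attached snippet; it assumes `scipy` is available for the reference `truncnorm` distribution and uses a Kolmogorov–Smirnov test):

```python
import torch
import torch.nn as nn
from scipy import stats

def check_trunc_normal(mean, std, a, b, n=100_000, alpha=0.01):
    """Draw samples with trunc_normal_ and compare them to scipy's truncnorm."""
    samples = torch.empty(n)
    nn.init.trunc_normal_(samples, mean=mean, std=std, a=a, b=b)
    # scipy parameterizes truncnorm by the bounds in standard-normal units.
    ref = stats.truncnorm((a - mean) / std, (b - mean) / std, loc=mean, scale=std)
    _, p_value = stats.kstest(samples.numpy(), ref.cdf)
    return p_value > alpha  # True if the samples match the target distribution

# Probe a few parameter settings, including ones where the mean sits
# outside the truncation interval (the cases expected to fail).
for mean, std, a, b in [(0., 1., -2., 2.), (5., 1., -2., 2.), (0., 3., -1., 1.)]:
    print(mean, std, a, b, check_trunc_normal(mean, std, a, b))
```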
I updated the warning message to be more descriptive and report at the appropriate stack level. The current message triggers if the mean is too far from the interval; a call like the sketch below is the kind of case it targets.
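For illustration (assuming the merged `torch.nn.init.trunc_normal_(tensor, mean, std, a, b)` signature), a mean that lies well outside the truncation interval should trigger the warning:

```python
import torch
import torch.nn as nn

w = torch.empty(3, 5)
# mean=10 is far from [a, b] = [-2, 2]; the inverse-CDF method produces a poor
# approximation of the truncated distribution here, so a warning is expected.
nn.init.trunc_normal_(w, mean=10., std=1., a=-2., b=2.)
```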
@alicanb let me know when to land!
LGTM, sorry I forgot about this.
facebook-github-bot left a comment:
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Great! So this will be out in the next PyTorch minor version? Any estimate for when that'll be? Thanks!
Summary: This adds the `trunc_normal_` function to `torch.nn.init`, which allows modifying tensors in-place with values drawn from a truncated normal distribution. I chose to use the inverse CDF method to implement this. I have included the appropriate code in `test_nn.py` for verifying that the values are from the correct distribution.

Reasons I chose this method:
1. Easily implemented to operate on memory in place, as the other initializers are.
2. No resampling delays.
3. This method's main weakness is unlikely to be an issue. While the inverse CDF method can fail to generate the correct distribution when `b < mean` or `mean < a`, I expect users will choose `a` and `b` so that `a < mean < b`. This method is extremely effective in this case.

Pull Request resolved: pytorch#32397
Differential Revision: D20550996
Pulled By: ezyang
fbshipit-source-id: 298a325043a3fd7d1e24d266e3b9b6cc14f81829
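For readers unfamiliar with the inverse-CDF technique referenced above, here is a minimal sketch of how such an in-place initializer can be written. This is not necessarily the exact code merged in this PR; the default bounds and the final `clamp_` are assumptions.

```python
import math
import torch

def trunc_normal_sketch(tensor, mean=0., std=1., a=-2., b=2.):
    """Fill ``tensor`` in place with values from N(mean, std) truncated to [a, b]."""
    def norm_cdf(x):
        # CDF of the standard normal distribution.
        return (1. + math.erf(x / math.sqrt(2.))) / 2.

    with torch.no_grad():
        # Map the truncation bounds through the standard normal CDF.
        lo = norm_cdf((a - mean) / std)
        hi = norm_cdf((b - mean) / std)

        # Draw uniform values in [2*lo - 1, 2*hi - 1], then invert with erfinv
        # to get standard-normal values restricted to [(a-mean)/std, (b-mean)/std].
        tensor.uniform_(2 * lo - 1, 2 * hi - 1)
        tensor.erfinv_()

        # Rescale to the requested mean/std and clamp to guard against
        # floating-point excursions just outside [a, b].
        tensor.mul_(std * math.sqrt(2.))
        tensor.add_(mean)
        tensor.clamp_(min=a, max=b)
        return tensor

# Example usage: initialize a weight tensor with values truncated to [-2, 2].
w = torch.empty(3, 5)
trunc_normal_sketch(w, mean=0., std=1., a=-2., b=2.)
```

As the summary notes, this approach works well when `a < mean < b`; when the mean lies far outside `[a, b]`, the uniform range collapses and the sampled distribution degrades, which is what the added warning is meant to flag.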