Conversation
torch/nn/functional.py
Outdated
```python
    noise.bernoulli_(keep_prob)
    noise = Variable(noise)

    output = (input * noise).add_(noise.neg().add_(1).mul_(alpha))
```
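Read out of the diff context: `noise` is a Bernoulli mask drawn with probability `keep_prob = 1 - p`, so the last line computes `input * noise + alpha * (1 - noise)`: kept activations pass through unchanged, while dropped activations are set to the SELU saturation constant `alpha` rather than to zero. A tiny illustration (the tensor values and names below are mine, purely for illustration):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
noise = torch.tensor([1.0, 0.0, 1.0])     # 1 = keep, 0 = drop
alpha = -1.7580993408473766
print(x * noise + alpha * (1 - noise))    # tensor([ 1.0000, -1.7581,  3.0000])
```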
torch/nn/functional.py
Outdated
```python
def alpha_dropout(input, p=0.5, training=False):
    if not 0 < p <= 1:
```
torch/nn/functional.py
Outdated
```python
        raise ValueError("dropout probability has to be between 0 and 1, "
                         "but got {}".format(p))

    if p > 0 and training:
```
```python
    def __init__(self, p=0.5):
        super(AlphaDropout, self).__init__()
        if p < 0 or p > 1:
```
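For readers without the full diff: this hunk is the module wrapper around the functional form. A minimal sketch of how such a wrapper typically delegates to `F.alpha_dropout` (the docstring and exact wording in the actual patch may differ):

```python
import torch.nn.functional as F
from torch import nn

class AlphaDropout(nn.Module):
    """Randomly sets whole activations to the SELU saturation value during training."""

    def __init__(self, p=0.5):
        super(AlphaDropout, self).__init__()
        if p < 0 or p > 1:
            raise ValueError("dropout probability has to be between 0 and 1, "
                             "but got {}".format(p))
        self.p = p

    def forward(self, input):
        # Delegate to the functional form; self.training toggles train/eval behaviour
        return F.alpha_dropout(input, self.p, self.training)
```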
```python
input = torch.randn(5000)

mean = input.mean()
std = input.std()
```
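This hunk looks like the start of the unit test. A hedged sketch of the kind of check that would follow, namely that alpha dropout roughly preserves the first two moments of a zero-mean, unit-variance input (the loop, tolerances, and assertion style are my guess, not taken from the patch):

```python
import torch
import torch.nn.functional as F

input = torch.randn(5000)
mean = input.mean()
std = input.std()

for p in [0.2, 0.5, 0.8]:
    output = F.alpha_dropout(input, p, training=True)
    # alpha dropout is designed to keep mean and std roughly unchanged
    assert abs(output.mean().item() - mean.item()) < 0.1
    assert abs(output.std().item() - std.item()) < 0.1
```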
Thanks Francisco!
```python
    if p == 0 or not training:
        return input

    alpha = -1.7580993408473766
```
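The constant is the SELU saturation value, -λα from Klambauer et al. (2017) with λ ≈ 1.0507 and α ≈ 1.6733, so λα ≈ 1.7581. Putting the hunks above together, here is a self-contained sketch of the whole function in current PyTorch terms: `Variable` is dropped, the bounds check follows the `p < 0 or p > 1` form shown in the module's `__init__` so that `p == 0` is a no-op, and the affine correction `a`, `b` is the one given in the SELU paper; the exact statements in the final patch may differ.

```python
import torch

def alpha_dropout(input, p=0.5, training=False):
    if p < 0 or p > 1:
        raise ValueError("dropout probability has to be between 0 and 1, "
                         "but got {}".format(p))
    if p == 0 or not training:
        return input

    # Saturation value of SELU: -lambda * alpha (Klambauer et al., 2017)
    alpha = -1.7580993408473766
    keep_prob = 1 - p

    # Bernoulli mask: keep with probability keep_prob, otherwise drop to alpha
    noise = torch.bernoulli(torch.full_like(input, keep_prob))
    output = input * noise + alpha * (1 - noise)

    # Affine correction from the SELU paper: restores zero mean / unit variance
    # for zero-mean, unit-variance inputs
    a = (keep_prob + alpha ** 2 * keep_prob * (1 - keep_prob)) ** -0.5
    b = -a * alpha * (1 - keep_prob)
    return output * a + b
```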
Refactor TransormPropagator to allow specifying a position and propagating to part of the DAG (pytorch#1775)

`MaxInfoPropagator` is renamed to `MaxInfoSpanningTree`; it now only does path-finding, and the propagation lives in a separate class, `MaxInfoSpanningTree::Propagator`. The same applies to `MaxRootDomainInfoPropagator`.

`MaxInfoSpanningTree` and `MaxRootDomainInfoSpanningTree` now allow specifying a selector, which controls which subgraph should be included in path-finding. `MaxRootDomainInfoSpanningTree` also gets a few new constructors for convenience.

`TransormPropagator` is now a subclass of `MaxInfoSpanningTree::Propagator`, so the way to use it has changed.

`MaxInfoSpanningTree` and `MaxRootDomainInfoSpanningTree` now store the path after generation so that the same path can be traversed multiple times. This will be useful to support use cases like the new `computeAt`. Pseudo-code:

```C++
void TensorView::computeAt(TensorView tv, int pos) {
  auto ComputeAtSubgraphSelector selector(this, tv);
  MaxRootDomainInfoSpanningTree path(tv, pos, &selector);
  TransformPropagator propagator(tv, pos);
  path.traverse(&propagator);
  ComputeAtPosPropagator ca_propagator(tv, pos);
  path.traverse(&ca_propagator);
}
```
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Code changes include:
- TransformPropagator refactor: switched to Dijkstra instead of exhaustive enumeration of all possible paths to reduce compilation time on transform propagation;
- Indexing refactor: remove reference tensor creation in all tensor indexing logic (#1690);
- (more) generic grouped grid reduction kernel;
- Minor parser/fuser patches:
  1. zero-dim tensor reduction support
  3. no-op binary removal within fused graph
  4. expand supported in fusion

Squashed commits to WAR github API. Commits that are actually in this PR from the devel branch:

```
a054b3e Refactor TransormPropagator to allow specifying a position and propagating to part of the DAG (#1775)
d67e1cd Indexing refactor stage 1: remove reference tensor creation in all tensor indexing logic (#1690)
1b65299 Issue 1770 (#1774)
35b0427 Avoid compilation errors like below: (#1773)
452c773 Ignore reductions of zero-dim tensors per PyTorch conventions (#1771)
31d6c56 TransformPropagator refactor (#1769)
570c5a8 Merge pull request #1767 from csarofeen/upstream_merge_0621
9d6c3d8 merging upstream 61305cd
0ed815f New TransformPropagator algorithm (#1763)
6c19520 no-op binary removal (#1764)
ec7fa41 Proper propagation of IterType (#1762)
b263562 Fix dimensionality check (#1759)
2d6343f More generic grouped grid reduction kernel (#1740)
64e2b56 [nvfuser] prevent spamming warning message (#77777) (#1758)
0c43162 [nvFuser] Improving bitwise ops support (#77158) (#1757)
b93a147 Parser expand (#1754)
```

RUN_TORCHBENCH: nvfuser

Pull Request resolved: #80355
Approved by: https://github.com/davidberard98
Complements #1768.
Note that I have not added the `fixed_mean` and `fixed_var` parameters from the original paper, as SELU outputs have zero mean and unit std, so I only considered this case. Let me know if you think it's worth adding support for these extra parameters.
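For context, a quick numerical check of that assumption using today's `torch.nn.functional` names (which may not be exactly what this PR introduced): SELU keeps zero-mean, unit-variance activations at roughly mean 0 and std 1, and alpha dropout is constructed to preserve those same statistics, which is why the `fixed_mean`/`fixed_var` generalization was left out here.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(100000)           # zero mean, unit variance input
h = F.selu(x)                     # SELU keeps activations near mean 0, std 1
y = F.alpha_dropout(h, p=0.5, training=True)

print(h.mean().item(), h.std().item())   # roughly 0 and 1
print(y.mean().item(), y.std().item())   # alpha dropout approximately preserves both
```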