Conversation
torch/nn/functional.py
Outdated
```python
    noise.bernoulli_(keep_prob)
    noise = Variable(noise)

    output = (input * noise).add_(noise.neg().add_(1).mul_(alpha))
```
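Read out of the diff context: `noise` is a Bernoulli mask drawn with probability `keep_prob = 1 - p`, so the last line computes `input * noise + alpha * (1 - noise)`: kept activations pass through unchanged, while dropped activations are set to the SELU saturation constant `alpha` rather than to zero. A tiny illustration (the tensor values and names below are mine, purely for illustration):

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
noise = torch.tensor([1.0, 0.0, 1.0])     # 1 = keep, 0 = drop
alpha = -1.7580993408473766
print(x * noise + alpha * (1 - noise))    # tensor([ 1.0000, -1.7581,  3.0000])
```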
torch/nn/functional.py
Outdated
```python
def alpha_dropout(input, p=0.5, training=False):
    if not 0 < p <= 1:
```
torch/nn/functional.py
Outdated
```python
        raise ValueError("dropout probability has to be between 0 and 1, "
                         "but got {}".format(p))

    if p > 0 and training:
```
```python
    def __init__(self, p=0.5):
        super(AlphaDropout, self).__init__()
        if p < 0 or p > 1:
```
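For readers without the full diff: this hunk is the module wrapper around the functional form. A minimal sketch of how such a wrapper typically delegates to `F.alpha_dropout` (the docstring and exact wording in the actual patch may differ):

```python
import torch.nn.functional as F
from torch import nn

class AlphaDropout(nn.Module):
    """Randomly sets whole activations to the SELU saturation value during training."""

    def __init__(self, p=0.5):
        super(AlphaDropout, self).__init__()
        if p < 0 or p > 1:
            raise ValueError("dropout probability has to be between 0 and 1, "
                             "but got {}".format(p))
        self.p = p

    def forward(self, input):
        # Delegate to the functional form; self.training toggles train/eval behaviour
        return F.alpha_dropout(input, self.p, self.training)
```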
```python
input = torch.randn(5000)

mean = input.mean()
std = input.std()
```
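This hunk looks like the start of the unit test. A hedged sketch of the kind of check that would follow, namely that alpha dropout roughly preserves the first two moments of a zero-mean, unit-variance input (the loop, tolerances, and assertion style are my guess, not taken from the patch):

```python
import torch
import torch.nn.functional as F

input = torch.randn(5000)
mean = input.mean()
std = input.std()

for p in [0.2, 0.5, 0.8]:
    output = F.alpha_dropout(input, p, training=True)
    # alpha dropout is designed to keep mean and std roughly unchanged
    assert abs(output.mean().item() - mean.item()) < 0.1
    assert abs(output.std().item() - std.item()) < 0.1
```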
Thanks Francisco!
```python
    if p == 0 or not training:
        return input

    alpha = -1.7580993408473766
```
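The constant is the SELU saturation value, -λα from Klambauer et al. (2017) with λ ≈ 1.0507 and α ≈ 1.6733, so λα ≈ 1.7581. Putting the hunks above together, here is a self-contained sketch of the whole function in current PyTorch terms: `Variable` is dropped, the bounds check follows the `p < 0 or p > 1` form shown in the module's `__init__` so that `p == 0` is a no-op, and the affine correction `a`, `b` is the one given in the SELU paper; the exact statements in the final patch may differ.

```python
import torch

def alpha_dropout(input, p=0.5, training=False):
    if p < 0 or p > 1:
        raise ValueError("dropout probability has to be between 0 and 1, "
                         "but got {}".format(p))
    if p == 0 or not training:
        return input

    # Saturation value of SELU: -lambda * alpha (Klambauer et al., 2017)
    alpha = -1.7580993408473766
    keep_prob = 1 - p

    # Bernoulli mask: keep with probability keep_prob, otherwise drop to alpha
    noise = torch.bernoulli(torch.full_like(input, keep_prob))
    output = input * noise + alpha * (1 - noise)

    # Affine correction from the SELU paper: restores zero mean / unit variance
    # for zero-mean, unit-variance inputs
    a = (keep_prob + alpha ** 2 * keep_prob * (1 - keep_prob)) ** -0.5
    b = -a * alpha * (1 - keep_prob)
    return output * a + b
```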
Refactor TransormPropagator to allow specifying a position and propagating to part of the DAG (pytorch#1775)

`MaxInfoPropagator` is renamed to `MaxInfoSpanningTree`; it now only does path-finding, and the propagation lives in a separate class, `MaxInfoSpanningTree::Propagator`. The same applies to `MaxRootDomainInfoPropagator`.

`MaxInfoSpanningTree` and `MaxRootDomainInfoSpanningTree` now allow specifying a selector, which controls which subgraph should be included in path-finding. `MaxRootDomainInfoSpanningTree` also gets a few new constructors for convenience.

`TransormPropagator` is now a subclass of `MaxInfoSpanningTree::Propagator`, so the way to use it has changed.

`MaxInfoSpanningTree` and `MaxRootDomainInfoSpanningTree` now store the path after generation so that the same path can be traversed multiple times. This will be useful to support use cases like the new `computeAt`. Pseudo-code:

```C++
void TensorView::computeAt(TensorView tv, int pos) {
  auto ComputeAtSubgraphSelector selector(this, tv);
  MaxRootDomainInfoSpanningTree path(tv, pos, &selector);
  TransformPropagator propagator(tv, pos);
  path.traverse(&propagator);
  ComputeAtPosPropagator ca_propagator(tv, pos);
  path.traverse(&ca_propagator);
}
```
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Code changes include:
- TransformPropagator refactor: switched to Dijkstra instead of exhaustive enumeration of all possible paths to reduce compilation time on transform propagation;
- Indexing refactor: remove reference tensor creation in all tensor indexing logic (#1690);
- (more) generic grouped grid reduction kernel;
- Minor parser/fuser patches:
  1. zero-dim tensor reduction support
  3. no-op binary removal within fused graph
  4. expand supported in fusion

Squashed commits to WAR github API. Commits that are actually in this PR from the devel branch:

```
a054b3e Refactor TransormPropagator to allow specifying a position and propagating to part of the DAG (#1775)
d67e1cd Indexing refactor stage 1: remove reference tensor creation in all tensor indexing logic (#1690)
1b65299 Issue 1770 (#1774)
35b0427 Avoid compilation errors like below: (#1773)
452c773 Ignore reductions of zero-dim tensors per PyTorch conventions (#1771)
31d6c56 TransformPropagator refactor (#1769)
570c5a8 Merge pull request #1767 from csarofeen/upstream_merge_0621
9d6c3d8 merging upstream 61305cd
0ed815f New TransformPropagator algorithm (#1763)
6c19520 no-op binary removal (#1764)
ec7fa41 Proper propagation of IterType (#1762)
b263562 Fix dimensionality check (#1759)
2d6343f More generic grouped grid reduction kernel (#1740)
64e2b56 [nvfuser] prevent spamming warning message (#77777) (#1758)
0c43162 [nvFuser] Improving bitwise ops support (#77158) (#1757)
b93a147 Parser expand (#1754)
```

RUN_TORCHBENCH: nvfuser

Pull Request resolved: #80355
Approved by: https://github.com/davidberard98
Complements #1768.
Note that I have not added the `fixed_mean` and `fixed_var` parameters from the original paper, as SELU outputs have zero mean and unit std, so I only considered this case. Let me know if you think it's worth adding support for these extra parameters.
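For context, a quick numerical check of that assumption using today's `torch.nn.functional` names (which may not be exactly what this PR introduced): SELU keeps zero-mean, unit-variance activations at roughly mean 0 and std 1, and alpha dropout is constructed to preserve those same statistics, which is why the `fixed_mean`/`fixed_var` generalization was left out here.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(100000)           # zero mean, unit variance input
h = F.selu(x)                     # SELU keeps activations near mean 0, std 1
y = F.alpha_dropout(h, p=0.5, training=True)

print(h.mean().item(), h.std().item())   # roughly 0 and 1
print(y.mean().item(), y.std().item())   # alpha dropout approximately preserves both
```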