Optional expand=True kwarg in distribution.enumerate_support by neerajprad · Pull Request #11231 · pytorch/pytorch

neerajprad · 2018-09-04T18:02:56Z

This adds an optional expand=True kwarg to the distribution.enumerate_support() method, so that a distribution's support can be returned without expanding the values over the distribution's batch_shape.

The default expand=True preserves the current behavior, whereas expand=False collapses the batch dimensions.

e.g.

```python
In [47]: d = dist.OneHotCategorical(torch.ones(3, 5) * 0.5)

In [48]: d.batch_shape
Out[48]: torch.Size([3])

In [49]: d.enumerate_support()
Out[49]:
tensor([[[1., 0., 0., 0., 0.],
         [1., 0., 0., 0., 0.],
         [1., 0., 0., 0., 0.]],

        [[0., 1., 0., 0., 0.],
         [0., 1., 0., 0., 0.],
         [0., 1., 0., 0., 0.]],

        [[0., 0., 1., 0., 0.],
         [0., 0., 1., 0., 0.],
         [0., 0., 1., 0., 0.]],

        [[0., 0., 0., 1., 0.],
         [0., 0., 0., 1., 0.],
         [0., 0., 0., 1., 0.]],

        [[0., 0., 0., 0., 1.],
         [0., 0., 0., 0., 1.],
         [0., 0., 0., 0., 1.]]])

In [50]: d.enumerate_support().shape
Out[50]: torch.Size([5, 3, 5])

In [51]: d.enumerate_support(expand=False)
Out[51]:
tensor([[[1., 0., 0., 0., 0.]],

        [[0., 1., 0., 0., 0.]],

        [[0., 0., 1., 0., 0.]],

        [[0., 0., 0., 1., 0.]],

        [[0., 0., 0., 0., 1.]]])

In [52]: d.enumerate_support(expand=False).shape
Out[52]: torch.Size([5, 1, 5])
```
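As a rough check of how the two forms relate, the sketch below reuses the distribution from the session above and compares the results. It only illustrates the broadcasting behavior and is not taken from the PR's test suite; the variable names `full` and `compact` are chosen here for illustration.

```python
import torch
import torch.distributions as dist

# Same distribution as in the session above: batch_shape (3,), event_shape (5,).
d = dist.OneHotCategorical(probs=torch.ones(3, 5) * 0.5)

full = d.enumerate_support()                  # shape (5, 3, 5)
compact = d.enumerate_support(expand=False)   # shape (5, 1, 5)

# Broadcasting the singleton batch dimension recovers the expanded values.
same_values = torch.equal(compact.expand(full.shape), full)

# log_prob broadcasts the compact values against batch_shape,
# so both calls return a tensor of shape (5, 3).
lp_full = d.log_prob(full)
lp_compact = d.log_prob(compact)
same_log_probs = torch.allclose(lp_full, lp_compact)
```

Since `Tensor.expand` returns a view, the expanded form mainly becomes costly once a downstream operation materializes it at full size.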

Motivation:

Currently enumerate_support builds tensors of shape support + batch_shape + event_shape, but the values are simply repeated over batch_shape, which adds little information. When batch_shape is large, this can lead to expensive matrix operations over large tensors, and often to OOM issues (in the example above, each support value is already repeated three times).

We use expand=False in Pyro for message-passing inference, e.g. when enumerating over the state space in a Hidden Markov Model. This creates sparse tensors that capture the Markov dependence and allows for optimized matrix operations over them. With expand=True, on the other hand, the tensors scale exponentially in size with the length of the Markov chain.

We have been using this in our patch (https://github.com/uber/pyro/blob/dev/pyro/distributions/torch.py) of torch.distributions in Pyro. The interface has been stable, and it is already used in a few Pyro algorithms. We think it is more broadly applicable and will be of interest to the larger distributions community.
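The scaling claim for Markov chains can be illustrated with a small sketch. It is not Pyro's enumeration machinery, which is considerably more involved; the chain length `T`, state count `K`, the random logits, and the helper `enumerate_chain` are all assumptions made for illustration. Each step conditions a `Categorical` on the previous step's enumerated states, so the previous values determine the next distribution's batch_shape.

```python
import math

import torch
import torch.distributions as dist

K, T = 3, 6  # hypothetical number of hidden states and chain length
torch.manual_seed(0)
init_logits = torch.randn(K)
trans_logits = torch.randn(K, K)  # row i: logits of z_t given z_{t-1} = i


def enumerate_chain(expand):
    """Enumerate each z_t in turn and record the shape of its support tensor."""
    z = dist.Categorical(logits=init_logits).enumerate_support(expand=expand)
    shapes = [tuple(z.shape)]
    for _ in range(T - 1):
        # Indexing with z gives the next distribution batch_shape == z.shape.
        step = dist.Categorical(logits=trans_logits[z])
        z = step.enumerate_support(expand=expand)
        shapes.append(tuple(z.shape))
    return shapes


compact_shapes = enumerate_chain(expand=False)  # (3,), (3, 1), (3, 1, 1), ...
full_shapes = enumerate_chain(expand=True)      # (3,), (3, 3), (3, 3, 3), ...

compact_sizes = [math.prod(s) for s in compact_shapes]  # stays at K per step
full_sizes = [math.prod(s) for s in full_shapes]        # grows as K ** t
```

With expand=False every step contributes one new dimension of size K and singleton dimensions elsewhere, so pairwise factors can be combined by broadcasting instead of being stored at full size.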

cc. @apaszke, @fritzo, @alicanb

fritzo

Thanks for moving this upstream!

neerajprad · 2018-09-06T19:19:35Z

For some reason, the Test and Push jobs are all failing with the same error in test_jit, but it seems unrelated to the PR.

facebook-github-bot

soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Pull Request resolved: pytorch#11231
Differential Revision: D9696290
Pulled By: soumith
fbshipit-source-id: c556f8ff374092e8366897ebe3f3b349538d9318

neerajprad added 2 commits September 4, 2018 09:47

Optional expand=True kward to distribution.enumerate_support

7ea7728

clarify comment

ed4cf99

neerajprad requested review from apaszke, colesbury, ezyang, gchanan, soumith and zdevito as code owners September 4, 2018 18:02

neerajprad changed the title ~~Optional expand=True kward to distribution.enumerate_support~~ Optional expand=True kwarg in distribution.enumerate_support Sep 4, 2018

fix test

b00afe4

fritzo approved these changes Sep 4, 2018

clarify in docstring

cdae347

soumith approved these changes Sep 7, 2018

facebook-github-bot reviewed Sep 7, 2018

facebook-github-bot closed this in b3b1e76 Sep 7, 2018

neerajprad mentioned this pull request Sep 11, 2018

Move .enumerate(expand=False) logic upstream pyro-ppl/pyro#1336

Closed

neerajprad mentioned this pull request Sep 19, 2018

Certain extensions to the torch.distributions API #10925

Closed

ezyang added the open source and merged labels Jun 24, 2019

neerajprad wants to merge 4 commits into pytorch:master from neerajprad:enumerate-expand

5 participants
