
Add batched grad testing to gradcheck, turn it on in test_autograd #49120

Closed
zou3519 wants to merge 13 commits into gh/zou3519/336/base from gh/zou3519/336/head

Conversation

@zou3519
Contributor

@zou3519 zou3519 commented Dec 9, 2020

Stack from ghstack:

This adds a check_batched_grad=False option to gradcheck and gradgradcheck.
It defaults to False because gradcheck is a public API and I don't want
to break any existing non-PyTorch users of gradcheck.
This:

  • runs grad twice with two grad outputs, a & b
  • runs a vmapped grad with torch.stack([a, b])
  • compares the results of the above against each other.
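
The check described above is a consistency test: stacking the results of two separate vjp calls must match one batched (vmapped) call over the stacked grad outputs. A minimal pure-Python sketch of the idea, where the toy `f`, `vjp`, and `batched_vjp` are made-up stand-ins for the autograd machinery, not PyTorch APIs:

```python
import random

def f(x):
    return 3.0 * x  # toy op whose Jacobian is the constant 3

def vjp(grad_output):
    # vector-Jacobian product of f: J^T @ gO = 3 * gO
    # (stands in for torch.autograd.grad(outputs, inputs, grad_output))
    return 3.0 * grad_output

def batched_vjp(stacked_grad_outputs):
    # stands in for the same vjp run under vmap: one call over the stacked batch
    return [3.0 * g for g in stacked_grad_outputs]

# 1. run the vjp twice with two random grad outputs a and b
a, b = random.random(), random.random()
expected = [vjp(a), vjp(b)]

# 2. run the "vmapped" vjp once over the analogue of torch.stack([a, b])
result = batched_vjp([a, b])

# 3. the per-sample and batched results must agree elementwise
assert expected == result
```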

Furthermore:

  • check_batched_grad=True is set to be the default for
    gradcheck/gradgradcheck inside of test_autograd.py. This is done by
    reassigning the gradcheck name inside test_autograd.
  • I manually added check_batched_grad=False to gradcheck instances
    that don't support batched grad.
  • I added a denylist for operations that don't support batched grad.

Question:

  • Should we have a testing only gradcheck (e.g.,
    torch.testing.gradcheck) that has different defaults from our public
    API, torch.autograd.gradcheck?

Future:

  • The future plan for this is to repeat the above for test_nn.py (the
    autogenerated test will require a denylist)
  • Finally, we can repeat the above for all pytorch test files that use
    gradcheck.

Test Plan:

  • run tests

Differential Revision: D25563542

@zou3519 zou3519 requested a review from albanD as a code owner December 9, 2020 22:24
zou3519 added a commit that referenced this pull request Dec 9, 2020
ghstack-source-id: 63654d2
Pull Request resolved: #49120
@dr-ci

dr-ci bot commented Dec 9, 2020

💊 CI failures summary and remediations

As of commit 27a735b (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_bionic_py3_8_gcc9_coverage_test2 (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jan 13 19:17:36 [E request_callback_no_python.cpp:645] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Jan 13 19:17:36 At:
Jan 13 19:17:36   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(120): serialize
Jan 13 19:17:36   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(172): serialize
Jan 13 19:17:37 ok (2.451s)
Jan 13 19:17:39   test_return_future_remote (__main__.ProcessGroupRpcTestWithSpawn) ... RPC was initialized with the PROCESS_GROUP backend which is deprecated and slated to be removed and superseded by the TENSORPIPE backend. It is recommended to migrate to the TENSORPIPE backend.

This comment was automatically generated by Dr. CI.

zou3519 added a commit that referenced this pull request Dec 9, 2020
ghstack-source-id: ae84bd5
Pull Request resolved: #49120
@zou3519 zou3519 requested a review from ezyang December 11, 2020 18:58
zou3519 added a commit that referenced this pull request Dec 11, 2020
ghstack-source-id: 280190d
Pull Request resolved: #49120
@ezyang ezyang requested a review from mruberry December 14, 2020 16:39
@ezyang
Contributor

ezyang commented Dec 14, 2020

Added @mruberry as this is touching common_method_invocations

grad_outputs = [torch.randn_like(output) for _ in range(2)]

expected = [vjp(gO) for gO in grad_outputs]
expected = [torch.stack(shards) for shards in zip(*expected)]
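
The regrouping in the snippet above (one tuple of per-input grads per grad output, transposed into one stacked result per input) can be sketched in plain Python, with lists standing in for tensors and a made-up toy vjp:

```python
# Toy vjp returning one gradient per input as a tuple; the constants 2.0
# and 5.0 are made up for illustration.
def vjp(gO):
    return (2.0 * gO, 5.0 * gO)

grad_outputs = [0.5, -1.25]

# one tuple of per-input grads per grad output
per_output = [vjp(gO) for gO in grad_outputs]   # [(1.0, 2.5), (-2.5, -6.25)]

# transpose: for each *input*, gather its shards across the grad outputs,
# mirroring `expected = [torch.stack(shards) for shards in zip(*expected)]`
per_input = [list(shards) for shards in zip(*per_output)]

assert per_input == [[1.0, -2.5], [2.5, -6.25]]
```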
Contributor

This isn't really a gradcheck, right? A gradcheck is defined by comparing the analytic and numeric result. What you're doing here is verifying the functoriality of vmap over vjp.

Contributor

(There's nothing wrong with this test, but not sure if it makes sense as an add on to gradcheck in this case.)

Contributor Author

That's correct. I can add a comment describing this.

One alternative to the approach here is to compute the full analytical Jacobian via vmap. The problem I have with that is performance: it seems like it would do a bit more work than what is currently implemented in this PR. However, benchmarking shows that it's not much more work: 191m of CPU time (baseline) vs 195m (this PR) vs 198m (computing the full Jacobian).

Contributor

I do agree that this testing strategy is better. My point is mostly an organizational one.

Contributor Author

In an ideal world I would separate the "batched grad" check from gradcheck and have separate tests for it. In PyTorch today this is a little difficult without writing a lot of code: we use gradcheck all over the place, but our only "structured" testing happens in test_nn (in the ModuleTests) and test_autograd (with common_method_invocations).
These two sources (test_nn and the common_method_invocations table, soon to be OpInfo) are not comprehensive, and we end up manually invoking gradcheck across our tests.

So an alternative could be to put the batched grad testing into its own helper function, check_vmap_over_vjp, hunt down all of the call sites of gradcheck, and call check_vmap_over_vjp right after each gradcheck. There are around 430 places where this happens, which will take me a while to hunt down, but on the upside it gives the right organizational structure.

Contributor Author

I think for now keeping the organization this way makes it easier to test things; I can see splitting the functionality up as a way to implement OpInfo-based tests. I'll add a comment to the code saying something about this.


inp = torch.randn(5, 5, requires_grad=True)
gradcheck(lambda x: NonDetFunc.apply(x, 0.0), inp)
gradcheck(lambda x: NonDetFunc.apply(x, 0.0), inp, check_batched_grad=False)
Contributor

Huh, didn't know this works.

Collaborator

This works in what sense? That we have a proper test for non-determinism in the backward?
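
To see why a nondeterministic backward interacts with this check: the batched-grad test compares separate vjp calls against one batched call, so any fresh randomness in backward makes the two sides disagree. A hedged toy sketch (`noisy_vjp` is made up, not a PyTorch API, and plays the role of NonDetFunc's backward):

```python
import random

def noisy_vjp(grad_output):
    # nondeterministic backward: adds fresh noise on every call
    return grad_output + random.gauss(0.0, 1e-3)

# Two separate calls with the same grad output (almost surely) differ, so the
# "two separate calls vs one batched call" comparison cannot pass; such
# functions must opt out via check_batched_grad=False.
a = noisy_vjp(1.0)
b = noisy_vjp(1.0)
assert a != b
```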

def run_grad_and_gradgrad_checks(test_case, name, test_name, apply_method, output_variable,
                                 input_variables, run_gradgradcheck=True, check_batched_grad=True):
    test_case.assertTrue(gradcheck(apply_method, input_variables, eps=1e-6, atol=PRECISION))
Collaborator

Tests using method_tests() are being deprecated in favor of tests using OpInfos like these:

class TestGradients(TestCase):

I'm developing a post on this now, but the plan is to remove method_tests(). This PR doesn't have to update the OpInfo-based testing but that should happen in the near future to continue testing this property. Instead of a custom EXCLUDE_BATCHED_GRAD_TESTS a property could be added to the OpInfo class for whether to run batched grad tests or not, or the OpInfo skip mechanism can be reused.
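
The per-op flag suggested above could look something like this sketch; the field name `supports_batched_grad` and the example ops are hypothetical, not the real OpInfo API:

```python
from dataclasses import dataclass

# Sketch: a per-op flag instead of a separate EXCLUDE_BATCHED_GRAD_TESTS
# denylist. Field name and example ops are made up for illustration.
@dataclass
class OpInfo:
    name: str
    supports_batched_grad: bool = True

op_db = [OpInfo("add"), OpInfo("nondet_op", supports_batched_grad=False)]

# an autogenerated gradients test would skip flagged ops
ops_to_check = [op.name for op in op_db if op.supports_batched_grad]
assert ops_to_check == ["add"]
```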

Contributor Author

I'm happy to add something to the OpInfo class in a followup to make sure these are tested in the future.

PRECISION = 1e-4

gradcheck = partial(gradcheck, check_batched_grad=True)
gradgradcheck = partial(gradgradcheck, check_batched_grad=True)
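
The functools.partial rebinding above changes the default for every subsequent call in the file without editing hundreds of call sites. A self-contained sketch with a stub gradcheck (not the real torch.autograd.gradcheck; it just reports which default it saw):

```python
from functools import partial

def gradcheck(fn, inputs, check_batched_grad=False):
    # stub standing in for torch.autograd.gradcheck's keyword interface
    return check_batched_grad

# Rebind the module-level name so every later call picks up the new default.
gradcheck = partial(gradcheck, check_batched_grad=True)

assert gradcheck(lambda x: x, (1.0,)) is True
# individual tests can still opt out explicitly
assert gradcheck(lambda x: x, (1.0,), check_batched_grad=False) is False
```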
Contributor

I was going to complain about how this is very unexpected but then I went and looked to see how many occurrences of gradcheck there are in this file and... well... there are a lot.

@ezyang
Contributor

ezyang commented Dec 14, 2020

Going to give mruberry time to look before approving or not. This basically LGTM though.

@mruberry
Collaborator

Going to give mruberry time to look before approving or not. This basically LGTM though.

No worries on updating this test, it is still heavily used at the moment. But the OpInfo-based tests will need to be updated, too. It's OK if that update happens in a future PR.

zou3519 added a commit that referenced this pull request Dec 15, 2020
ghstack-source-id: 92f18ee
Pull Request resolved: #49120
@zou3519 zou3519 requested review from ezyang and mruberry December 15, 2020 22:13
@mruberry
Collaborator

I'll let @albanD and @ezyang comment on the code. My question is: how much do we expect test time to change with this diff?

@zou3519
Contributor Author

zou3519 commented Dec 17, 2020

@mruberry, not very much. On my (very noisy) machine:

Before:
real    8m3.991s
user    191m13.005s
sys     0m59.089s

After:
real    8m18.060s
user    195m27.363s
sys     1m5.554s

@mruberry
Collaborator


I also took a look at the timings on the CI machines and it seems like a minimal difference.

@zou3519 zou3519 requested review from albanD and removed request for albanD, ezyang and mruberry January 7, 2021 16:08
zou3519 added a commit that referenced this pull request Jan 7, 2021
ghstack-source-id: 2a7e02e
Pull Request resolved: #49120
zou3519 added a commit that referenced this pull request Jan 13, 2021
ghstack-source-id: 76eba0e
Pull Request resolved: #49120
Collaborator

@albanD albanD left a comment


LGTM thanks for the update!

    return not_reentrant_error(' (calculated using complex valued grad output)')

if check_batched_grad:
    assert reentrant, ('Batched gradient checking makes the assumption that '
Collaborator

nit: I would mention that we expect gradcheck to have exited early before reaching this point when not re-entrant.

zou3519 added a commit that referenced this pull request Jan 13, 2021
ghstack-source-id: 3b5076b
Pull Request resolved: #49120
@facebook-github-bot
Contributor
@zou3519 merged this pull request in 443412e.

@malfet
Contributor

malfet commented Jan 15, 2021

Reverting this change as it broke TestAutograd.test_lobpcg:

$ PYTORCH_TEST_WITH_SLOW=1 python test_autograd.py -v TestAutograd.test_lobpcg
test_lobpcg (__main__.TestAutograd) ... ERROR

======================================================================
ERROR: test_lobpcg (__main__.TestAutograd)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/nshulga/git/pytorch/torch/testing/_internal/common_utils.py", line 566, in wrapper
    fn(*args, **kwargs)
  File "/home/nshulga/git/pytorch/torch/testing/_internal/common_utils.py", line 526, in wrapper
    fn(*args, **kwargs)
  File "test_autograd.py", line 2851, in test_lobpcg
    run_symeig_test(1, (6, 6), largest=largest)
  File "test_autograd.py", line 2833, in run_symeig_test
    gradcheck(lambda A: func(k, A, largest), A)
  File "/home/nshulga/git/pytorch/torch/autograd/gradcheck.py", line 473, in gradcheck
    test_batched_grad(fail_test, tupled_inputs, o, j)
  File "/home/nshulga/git/pytorch/torch/autograd/gradcheck.py", line 253, in test_batched_grad
    result = vmap(vjp)(torch.stack(grad_outputs))
  File "/home/nshulga/git/pytorch/torch/_vmap_internals.py", line 250, in wrapped
    batched_outputs = func(*batched_inputs)
  File "/home/nshulga/git/pytorch/torch/autograd/gradcheck.py", line 238, in vjp
    results = grad(v)
  File "/home/nshulga/git/pytorch/torch/autograd/__init__.py", line 225, in grad
    inputs, allow_unused, accumulate_grad=False)
  File "/home/nshulga/git/pytorch/torch/autograd/function.py", line 89, in apply
    return self._forward_cls.backward(self, *args)  # type: ignore
  File "/home/nshulga/git/pytorch/torch/_lobpcg.py", line 328, in backward
    D_grad, U_grad, A, D, U, largest
  File "/home/nshulga/git/pytorch/torch/_lobpcg.py", line 259, in _symeig_backward
    D_grad, U_grad, A, D, U, largest
  File "/home/nshulga/git/pytorch/torch/_lobpcg.py", line 159, in _symeig_backward_partial_eigenspace
    generator=gen
RuntimeError: vmap: We do not yet support calling random operations inside of vmap. Please perform random operations outside of vmap as a workaround

----------------------------------------------------------------------
Ran 1 test in 0.440s

FAILED (errors=1)
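The workaround named in the error message — perform random operations outside of vmap — amounts to sampling the randomness first and passing it in as an argument, so the vmapped function itself is deterministic. A minimal sketch (function and shapes are illustrative, not from `_lobpcg.py`):

```python
import torch

def perturb(v, noise):
    # no random ops inside the vmapped function
    return v + noise

batch = torch.randn(5, 3)
noise = torch.randn(5, 3)  # the random op runs outside vmap

out = torch.vmap(perturb)(batch, noise)
assert out.shape == (5, 3)
```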

@facebook-github-bot
Contributor

This pull request has been reverted by 9efe153.

