Replace Variable.volatile with torch.no_grad() #3970

Merged
colesbury merged 11 commits into pytorch:master from colesbury:backprop_mode on Dec 18, 2017

Conversation

@colesbury (Member) commented Dec 1, 2017

This removes volatile from Variable. The functionality is mostly
replaced by a global (thread-local) flag, which is controlled by
torch.set_grad_enabled() and the context manager torch.no_grad().

In C++, the flag is exposed through GradMode::is_enabled() and GradMode::set_enabled().

Fixes #3627
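
For reference, a minimal sketch of the new API in use (the shapes and values here are illustrative, not from the PR):

import torch

x = torch.ones(3, requires_grad=True)

with torch.no_grad():          # no history is recorded inside this block
    y = x * 2
print(y.requires_grad)         # False

torch.set_grad_enabled(False)  # the same thread-local flag, toggled globally
z = x * 2
print(z.requires_grad)         # False
torch.set_grad_enabled(True)   # restore the default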

@pytorchbot (Collaborator) commented

@colesbury, thanks for your PR! We identified @zdevito to be a potential reviewer.

@colesbury (Member, Author) commented

I placed the context manager in the torch package instead of torch.autograd in anticipation of merging Tensor and Variable.

@apaszke (Contributor) commented Dec 2, 2017

Haven't started the review, but can we please call it no_grad or something like that? Backprop is very nn-specific and is a special case of the more general reverse-mode AD, which is the level on which autograd operates.

colesbury added a commit to colesbury/pytorch that referenced this pull request Dec 2, 2017
Gradients were becoming non-volatile because at::zeros_like returned a
Variable with volatile always set to false. The non-volatile gradients
accumulated history in the model which results in continuously
increasing memory usage.

See pytorch#3983, pytorch#3835, pytorch#3824

In v0.4 this will be more robustly solved by pytorch#3970
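
A rough modern analogue of the failure mode (volatile no longer exists, so this is only an illustration, not the original v0.3 code): accumulating into a tensor that stays attached to the autograd graph keeps every iteration's history alive.

import torch

x = torch.ones(3, requires_grad=True)
acc = torch.zeros_like(x)      # in the buggy case, the accumulator tracked history
for _ in range(100):
    acc = acc + x * 2          # each iteration appends new nodes to acc's graph
# memory grows with the loop; detaching the accumulator breaks the chain:
#   acc = (acc + x * 2).detach()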
soumith pushed a commit that referenced this pull request Dec 2, 2017
(same commit message as above)

peterjc123 pushed a commit to peterjc123/pytorch that referenced this pull request Dec 4, 2017
(same commit message as above)

soumith pushed a commit that referenced this pull request Dec 4, 2017
(same commit message as above)
@Randl (Contributor) commented Dec 13, 2017

ping

@apaszke (Contributor) left a comment

Looks great. I have a ton of questions, because I want to make sure that everything works fine, and we haven't missed anything. Also, I think there are a few things that should be fixed before we merge this.

Review threads on tools/autograd/gen_variable_type.py, torch/csrc/autograd/python_variable.cpp, torch/csrc/autograd/variable.h (three threads), and torch/csrc/autograd/variable.cpp (all outdated; comments marked as off-topic).

@colesbury force-pushed the backprop_mode branch 2 times, most recently from bc4bd95 to 4968993 on December 13, 2017 at 23:36
@colesbury changed the title from "Replace Variable.volatile with torch.no_backprop()" to "Replace Variable.volatile with torch.no_grad()" on Dec 13, 2017
Review threads on test/test_autograd.py (two threads) and tools/autograd/gen_variable_type.py (all outdated; comments marked as off-topic).

@apaszke (Contributor) left a comment

Can you please add a test that checks that views of a detached base still require grad and have grad functions?

Review threads on torch/csrc/autograd/variable.h and torch/autograd/function.py (both outdated; comments marked as off-topic).

This removes volatile from Variable. The functionality is mostly
replaced by a global (thread-local) flag, which is controlled by
torch.set_backprop_enabled() and the context manager
torch.no_backprop().

Fixes pytorch#3627
rebase_history also now immediately updates view._grad_fn.
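
A small sketch of what "immediately updates view._grad_fn" means in practice, in current-API spelling (the variable names are illustrative):

import torch

base = torch.zeros(4, requires_grad=True).clone()  # non-leaf, so in-place ops are allowed
view = base[:2]                                    # a view into base's storage
base.add_(1)                                       # in-place op rewrites base's history
print(view.grad_fn)                                # the view's grad_fn reflects the update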
@colesbury merged commit d605058 into pytorch:master on Dec 18, 2017
@colesbury deleted the backprop_mode branch on December 18, 2017 at 20:46
@Randl mentioned this pull request on Dec 19, 2017
base.output_nr() = 0;
base.get()->_grad_fn = std::make_shared<CopySlices>(
    base, TensorGeometry(data), std::move(grad_fn));
get_grad_fn(); // trigger an update to the view's grad_fn


@rawalkhirodkar commented

Can we update the documentation to highlight the removal of volatile in favor of torch.no_grad()?
Thank you

@apaszke (Contributor) commented Jan 21, 2018

@rawalkhirodkar yes, that's definitely needed. Can you please send a PR?

@Randl (Contributor) commented Jan 21, 2018

Also, I'm not sure what to do if only some of the variables don't require grads.

@apaszke (Contributor) commented Jan 21, 2018

@Randl requires_grad still works as it used to (when grad mode is enabled)
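
For example, a minimal sketch of mixing tracked and untracked inputs (the names are illustrative):

import torch

a = torch.randn(3, requires_grad=True)
b = torch.randn(3)             # requires_grad defaults to False
loss = (a * b).sum()
loss.backward()
print(a.grad)                  # populated
print(b.grad)                  # None: no gradient is tracked for b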

@Randl (Contributor) commented Jan 22, 2018

@apaszke I probably didn't explain myself well.

As far as I understand, with torch.no_grad() is equivalent to running with all variables volatile. What if I want only one variable to be volatile?

@apaszke (Contributor) commented Jan 22, 2018

Then you need to separate the volatile codepaths from the other ones. I haven't yet seen a case where keeping only some Variables volatile is useful.

@fmassa (Member) commented Jan 22, 2018

@Randl also note that volatile variables propagate in the graph. So if you have 2 variables, a and b, where only a is volatile, the result of an operation between a and b will be volatile as well.
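
For contrast, a rough sketch of how the replacement behaves: requires_grad also propagates through operations, but in the enabling direction, so one grad-requiring input is enough.

import torch

a = torch.randn(3, requires_grad=True)  # plays the role of the non-volatile input
b = torch.randn(3)                      # untracked input
c = a + b
print(c.requires_grad)                  # True: requires_grad propagates where volatile suppressed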

@my-hello-world commented

Hello, I want to know whether PyTorch 0.4 can be built with CUDA 7.5 support. When I install CUDA 8.0, it shows: "Found GPU1 Quadro K2000 which is of cuda capability 3.0. PyTorch no longer supports this GPU because it is too old." I'm sorry, but I can't afford a new Nvidia GPU.
Now I have installed CUDA 7.5, and when I run

export CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" # [anaconda root directory]
conda install numpy pyyaml mkl mkl-include setuptools cmake cffi typing
conda install -c mingfeima mkldnn
conda install -c pytorch magma-cuda75
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
python setup.py install

the resulting PyTorch version is 0.3.0. What should I do?
My environment: Ubuntu 14.04, Anaconda, Python 3.6.5.
Thanks!

wuhuikx pushed a commit to wuhuikx/pytorch that referenced this pull request Jan 30, 2020
(same commit message as above)
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
(same message as the pull request description above)