copy.deepcopy does not copy gradient buffers of torch.autograd.Variable instance #3307

Description

@geoffreyroeder

I ran into unexpected behaviour with copy.deepcopy applied to a Variable. The gradient buffer of the Variable is not copied.

import copy
import torch

a = torch.autograd.Variable(torch.ones(1))
a.grad = torch.autograd.Variable(torch.ones(1))
b = copy.deepcopy(a)
print(b.grad)  # prints None; a.grad was not copied

I think it would be a good idea to copy the gradient buffer during a deep copy. My use case is recording the gradient of a model's parameter space for optimization research. This would also be useful for debugging/development of complex models that involve atypical gradient operations.
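In the meantime, one possible workaround (a minimal sketch, not a tested patch) is to deep-copy the grad attribute manually after copying the Variable; this works here because a user-assigned grad is itself a leaf Variable:

import copy
import torch

a = torch.autograd.Variable(torch.ones(1))
a.grad = torch.autograd.Variable(torch.ones(1))

b = copy.deepcopy(a)
if a.grad is not None:
    # a.grad is a user-created leaf Variable, so deepcopy supports it
    b.grad = copy.deepcopy(a.grad)
print(b.grad)  # Variable containing ones, matching a.grad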

This is handled here:

def __deepcopy__(self, memo):
    if not self.is_leaf:
        raise RuntimeError("Only Variables created explicitly by the user "
                           "(graph leaves) support the deepcopy protocol at the moment")
    result = type(self)(self.data.clone())
    result.requires_grad = self.requires_grad
    result.volatile = self.volatile
    memo[id(self)] = result
    return result

A solution would be to also copy the grad attribute of the current Variable; since grad is itself a Variable, this means recursing into the deep copy. A sketch is below.
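For illustration, a minimal sketch of how that could look (my assumption of a possible fix, not a tested patch; it passes memo through copy.deepcopy so a shared grad is only copied once):

import copy

def __deepcopy__(self, memo):
    if not self.is_leaf:
        raise RuntimeError("Only Variables created explicitly by the user "
                           "(graph leaves) support the deepcopy protocol at the moment")
    result = type(self)(self.data.clone())
    result.requires_grad = self.requires_grad
    result.volatile = self.volatile
    memo[id(self)] = result
    if self.grad is not None:
        # Recurse: grad is itself a leaf Variable, so this re-enters __deepcopy__
        result.grad = copy.deepcopy(self.grad, memo)
    return result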

cc @ezyang @gchanan @zou3519 @bdhirsh @jbschlosser @albanD @gqchen @pearu @nikitaved @soulitzer @ssnl

Metadata

Labels

high priority
module: autograd (Related to torch.autograd, and the autograd engine in general)
module: bc-breaking (Related to a BC-breaking change)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
