Reproduced in version 0.2.0+1199e3d: GPU memory keeps growing at every iteration of the loop below.
import torch
from torch.autograd import grad, Variable
from torchvision import models
model = models.resnet50().cuda()
for k in range(20):
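    x = Variable(torch.rand(8, 3, 224, 224).cuda(), requires_grad=True)
    # Gradient of the summed output w.r.t. the input; create_graph=True keeps it differentiable
    dx, = grad(model(x).sum(), x, create_graph=True)
    y = model(x + dx).sum()
    # Backpropagates through both the forward pass and the gradient computation above
    y.backward()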
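To put numbers on "keeps growing", something like the helper below could wrap the loop body. This is only a sketch: `grows_every_iteration`, `step` and `read_memory` are hypothetical names, not part of PyTorch, and the leaky step here is a plain-Python stand-in so the snippet runs without a GPU.

```python
def grows_every_iteration(step, read_memory, iterations=20):
    """Call step() repeatedly and record read_memory() after each call.

    Returns (growing, readings), where growing is True only if every
    reading is strictly larger than the one before it.
    """
    readings = []
    for _ in range(iterations):
        step()
        readings.append(read_memory())
    # At least two readings are needed to talk about growth at all
    growing = len(readings) > 1 and all(
        later > earlier for earlier, later in zip(readings, readings[1:])
    )
    return growing, readings


# Stand-in for a leaking loop body: keeps a reference to 1 KiB per call
_held = []

def leaky_step():
    _held.append(bytearray(1024))

def held_bytes():
    return sum(len(chunk) for chunk in _held)


growing, readings = grows_every_iteration(leaky_step, held_bytes, iterations=5)
print(growing, readings)
```

For the actual report, `step` would be the body of the `for k in range(20)` loop and `read_memory` a GPU memory probe. Newer PyTorch releases provide `torch.cuda.memory_allocated()` for that; I am assuming it is not available in the 0.2.0 build reported here, where watching `nvidia-smi` between iterations is the fallback.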