
Computation graph error, and one question

Open lswzjuer opened this issue 5 years ago • 4 comments

```
../BicycleGAN/models/bicycle_gan_model.py, line 188, in backward_G_alone
    self.loss_z_L1.backward()
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [258, 8]], which is output 0 of TBackward, is at version 2; expected version 1 instead.
```

1. When I run the script `train_edges2shoes.sh`, I encounter the error above, and it seems there is something wrong with the computation graph you defined. Note: I did not make any changes to the scripts in the models folder.
2. When `self.opt.conditional_D` is True, why did you choose `(self.real_A_encoded, self.fake_B_random)` to build `fake_data_random` instead of `(self.real_A_random, self.fake_B_random)`?
```python
# generate fake_B_random
self.fake_B_random = self.netG(self.real_A_encoded, self.z_random)

if self.opt.conditional_D:   # tedious conditional data: input_nc + output_nc
    self.fake_data_encoded = torch.cat([self.real_A_encoded, self.fake_B_encoded], 1)
    self.real_data_encoded = torch.cat([self.real_A_encoded, self.real_B_encoded], 1)

    self.fake_data_random = torch.cat([self.real_A_encoded, self.fake_B_random], 1)  # <-- why real_A_encoded here?
    self.real_data_random = torch.cat([self.real_A_random, self.real_B_random], 1)
else:
    self.fake_data_encoded = self.fake_B_encoded
    self.fake_data_random = self.fake_B_random
    self.real_data_encoded = self.real_B_encoded
    self.real_data_random = self.real_B_random  # <-- and why real_B_random here?
```

lswzjuer avatar Jun 22 '20 14:06 lswzjuer

I haven't seen the error in 1 before and I am not sure what happened. For your question 2, note that

`self.fake_B_random = self.netG(self.real_A_encoded, self.z_random)`

`fake_B_random` is also conditioned on `real_A_encoded`. The confusion might be caused by the naming. See #31 for more details.
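The pairing can be made concrete with a toy sketch (pure Python, no torch; `netG` below is a hypothetical stand-in that simply tags its output with the image that conditioned it, not the repo's generator):

```python
# Toy stand-in for the generator: the fake "remembers" which A conditioned it.
def netG(real_A, z):
    return ("fake_B", real_A, z)

real_A_encoded = "A_enc"   # image used to produce fake_B_random
real_A_random = "A_rand"   # a different image from the batch
z_random = "z~N(0,1)"

fake_B_random = netG(real_A_encoded, z_random)

# A conditional discriminator must see the image that actually conditioned
# the fake; pairing real_A_random with fake_B_random would hand the
# discriminator a mismatched (condition, output) pair.
fake_data_random = (real_A_encoded, fake_B_random)

assert fake_data_random[0] == fake_B_random[1]  # condition and fake match
```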

junyanz avatar Jun 26 '20 18:06 junyanz

I was able to reproduce your error 1 now. It didn't happen for the previous PyTorch version. I fixed it with the latest commit.

junyanz avatar Jun 28 '20 00:06 junyanz

Hi @junyanz. I believe that your latest commit introduced an error.

Now this comment is no longer valid, because `self.backward_G_alone()` also computes gradients for the encoder. You must keep the encoder fixed when minimizing the latent regression loss (according to your paper).

TropComplique avatar Jun 28 '20 14:06 TropComplique

Yes, you are correct. I updated the code with a new commit: I set `self.set_requires_grad([self.netE], False)` before `self.backward_G_alone()`. This should disable computing gradients for E while we backprop the gradients from `loss_z_L1`. I am also not sure why the previous code doesn't work in the recent PyTorch version (I am using 1.5.0). @SsnL
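For readers hitting the same issue, the freeze-then-backprop pattern can be sketched without torch. `Param` and `Net` below are toy stand-ins for module parameters (assumptions, not the repo's classes); `set_requires_grad` mirrors the kind of helper used in the fix:

```python
class Param:
    """Toy stand-in for a tensor parameter with a requires_grad flag."""
    def __init__(self):
        self.requires_grad = True

class Net:
    """Toy stand-in for an nn.Module exposing parameters()."""
    def __init__(self, n_params=3):
        self._params = [Param() for _ in range(n_params)]
    def parameters(self):
        return self._params

def set_requires_grad(nets, requires_grad=False):
    """Toggle requires_grad on every parameter of each net."""
    for net in nets:
        for p in net.parameters():
            p.requires_grad = requires_grad

netE = Net()
set_requires_grad([netE], False)   # freeze E before backward_G_alone()
# ... loss_z_L1.backward() would now skip E's parameters ...
set_requires_grad([netE], True)    # unfreeze E afterwards
```

Freezing E here keeps the latent regression loss from updating the encoder, matching the constraint in the paper.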

junyanz avatar Jun 28 '20 23:06 junyanz