Hi,
pyro=0.1.2
pytorch=0.4.0a0+5eefe87
Running the DMM example on the dev branch, I got the logs below:
aconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/poutine/trace.py:17: UserWarning: Encountered NAN log_pdf at site 'z_89'
warnings.warn("Encountered NAN log_pdf at site '{}'".format(name))
/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/poutine/trace.py:17: UserWarning: Encountered NAN log_pdf at site 'z_90'
warnings.warn("Encountered NAN log_pdf at site '{}'".format(name))
/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/poutine/trace.py:17: UserWarning: Encountered NAN log_pdf at site 'z_97'
warnings.warn("Encountered NAN log_pdf at site '{}'".format(name))
/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/poutine/trace.py:17: UserWarning: Encountered NAN log_pdf at site 'z_99'
warnings.warn("Encountered NAN log_pdf at site '{}'".format(name))
/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/poutine/trace.py:17: UserWarning: Encountered NAN log_pdf at site 'z_91'
warnings.warn("Encountered NAN log_pdf at site '{}'".format(name))
/data/sls/u/sameerk/anacond......
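To localize where the NaNs first appear instead of scrolling through hundreds of these warnings, one option (a generic Python sketch, not a Pyro API) is to escalate the UserWarning into an exception so the run stops at the first offending sample site with a full traceback:

```python
import warnings

# Escalate the NAN log_pdf UserWarning into an exception, so execution
# stops at the first offending site instead of warning and continuing.
warnings.filterwarnings("error", message="Encountered NAN log_pdf.*")

# Simulated warning call, mirroring what pyro/poutine/trace.py emits:
def warn_nan(name):
    warnings.warn("Encountered NAN log_pdf at site '{}'".format(name))

try:
    warn_nan("z_89")
    stopped = False
except UserWarning:
    stopped = True  # the warning now raises instead of printing
```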
Then the training loss was nan for every epoch:
[training epoch 0001] nan (dt = 8.369 sec)
[training epoch 0002] nan (dt = 8.041 sec)
[training epoch 0003] nan (dt = 8.240 sec)
[training epoch 0004] nan (dt = 8.217 sec)
[training epoch 0005] nan (dt = 8.588 sec)
[training epoch 0006] nan (dt = 8.512 sec)
[training epoch 0007] nan (dt = 8.644 sec)
[training epoch 0008] nan (dt = 8.434 sec)
[training epoch 0009] nan (dt = 9.041 sec)
[training epoch 0010] nan (dt = 8.911 sec)
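Once the loss goes nan it never recovers in later epochs, so it can help to fail fast rather than keep training. A minimal guard, independent of Pyro (function name is hypothetical):

```python
import math

def check_epoch_loss(loss, epoch):
    # nan propagates through the optimizer state, so subsequent epochs
    # cannot recover; raise immediately instead of logging nan forever.
    if math.isnan(loss):
        raise RuntimeError("nan loss at epoch {:04d}".format(epoch))
    return loss

check_epoch_loss(61.9873, 0)          # a finite loss passes through
try:
    check_epoch_loss(float("nan"), 1)
    raised = False
except RuntimeError:
    raised = True                     # nan is caught on the first epoch
```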
I stopped the program and ran it again, and did not face the same issue:
Namespace(annealing_epochs=1000, beta1=0.96, beta2=0.999, checkpoint_freq=0, clip_norm=20.0, cuda=False, iaf_dim=100, learning_rate=0.0004, load_model='', load_opt='', log='dmm.log', lr_decay=0.99996, mini_batch_size=20, minimum_annealing_factor=0.1, num_epochs=5000, num_iafs=0, rnn_dropout_rate=0.1, save_model='', save_opt='', weight_decay=0.6)
N_train_data: 229 avg. training seq. length: 60.29 N_mini_batches: 12
[training epoch 0000] 61.9873 (dt = 9.196 sec)
[training epoch 0001] 51.8227 (dt = 8.735 sec)
[training epoch 0002] 25.2403 (dt = 9.289 sec)
[training epoch 0003] 16.1608 (dt = 8.802 sec)
[training epoch 0004] 14.1899 (dt = 9.282 sec)
PS: I was getting the following error when running DMM:
Traceback (most recent call last):
File "dmm.py", line 430, in <module>
main(args)
File "dmm.py", line 390, in main
epoch_nll += process_minibatch(epoch, which_mini_batch, shuffled_indices)
File "dmm.py", line 350, in process_minibatch
mini_batch_seq_lengths, annealing_factor)
File "/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/infer/svi.py", line 97, in step
loss = self.loss_and_grads(self.model, self.guide, *args, **kwargs)
File "/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/infer/trace_elbo.py", line 133, in loss_and_grads
for weight, model_trace, guide_trace, log_r in self._get_traces(model, guide, *args, **kwargs):
File "/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/infer/trace_elbo.py", line 74, in _get_traces
guide_trace = poutine.trace(guide).get_trace(*args, **kwargs)
File "/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/poutine/trace_poutine.py", line 250, in get_trace
self(*args, **kwargs)
File "/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/poutine/trace_poutine.py", line 238, in __call__
ret = super(TracePoutine, self).__call__(*args, **kwargs)
File "/data/sls/u/sameerk/anaconda3/envs/nova-cpu-bedge/lib/python3.5/site-packages/pyro_ppl-0.1.2-py3.5.egg/pyro/poutine/poutine.py", line 147, in __call__
return self.fn(*args, **kwargs)
File "dmm.py", line 226, in guide
rnn_output = poly.pad_and_reverse(rnn_output, mini_batch_seq_lengths)
File "/data/sls/u/sameerk/repos/pyro/examples/dmm/polyphonic_data_loader.py", line 88, in pad_and_reverse
reversed_output = reverse_sequences_torch(rnn_output, seq_lengths)
File "/data/sls/u/sameerk/repos/pyro/examples/dmm/polyphonic_data_loader.py", line 78, in reverse_sequences_torch
else Variable(torch.LongTensor(time_slice))
RuntimeError: tried to construct a tensor from a int sequence, but found an item of type numpy.int64 at index (0)
I fixed it by adding the line time_slice = time_slice.tolist() in polyphonic_data_loader.py, just before the tensor is constructed.
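The crash happens because, on this PyTorch nightly, torch.LongTensor rejects sequence items of type numpy.int64. A torch-free sketch of why .tolist() sidesteps it (numpy assumed; the array here just stands in for the real time_slice):

```python
import numpy as np

time_slice = np.arange(3, dtype=np.int64)

# Indexing a numpy int64 array yields numpy.int64 scalars, not Python
# ints, which is what torch.LongTensor on this build refused to accept.
assert type(time_slice[0]) is np.int64

# .tolist() converts every element to a native Python int, so the
# resulting sequence is safe to hand to torch.LongTensor.
as_ints = time_slice.tolist()
```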
Any comments on this behaviour?